

Groups > comp.lang.c > #77357 > unrolled thread

Working efficiently with 32-bit Unicode output streams, locale etc.

Started by "Morten W. Petersen" <morphex@gmail.com>
First post 2015-11-29 01:06 +0100
Last post 2015-12-02 09:58 -0800
Articles 20 on this page of 210 — 25 participants



Contents

  Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-29 01:06 +0100
    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Nobody <nobody@nowhere.invalid> - 2015-11-29 02:01 +0000
      Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-29 03:31 +0100
        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-29 00:09 -0600
        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Robert Wessel <robertwessel2@yahoo.com> - 2015-11-29 00:22 -0600
        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Damon <Richard@Damon-Family.org> - 2015-11-29 14:31 -0500
        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Nobody <nobody@nowhere.invalid> - 2015-11-29 23:51 +0000
          Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 01:21 +0100
            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Keith Thompson <kst-u@mib.org> - 2015-11-30 00:41 -0800
            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-30 03:16 -0600
      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Jorgen Grahn <grahn+nntp@snipabacken.se> - 2015-11-29 08:28 +0000
      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-29 02:54 -0600
    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-11-29 16:30 +1300
      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-11-28 23:53 -0800
        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-29 02:23 -0600
          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-11-29 00:30 -0800
            Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 01:33 +0100
              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-11-30 13:54 +1300
                Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 02:03 +0100
                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-11-30 14:15 +1300
                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 02:34 +0100
                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-11-30 14:42 +1300
                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 04:16 +0100
                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-29 20:20 -0600
                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 04:34 +0100
                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-11-30 17:09 +1300
                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 06:17 +0100
                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-11-30 19:44 +1300
                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-29 23:36 -0600
                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 07:39 +0100
                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-30 13:56 -0600
                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-12-01 09:17 +0100
                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 13:40 -0600
                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-12-04 00:34 +0100
                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 16:03 -0800
                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-11-29 23:07 -0800
                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 08:20 +0100
                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-11-29 23:40 -0800
                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 08:48 +0100
                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-11-30 20:52 +1300
                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-11-30 21:04 +1300
                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-11-30 00:34 -0800
                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-30 03:50 -0600
                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-11-30 12:16 +0000
                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-11-30 06:11 -0800
                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-30 13:23 -0600
                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-30 13:18 -0600
                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Keith Thompson <kst-u@mib.org> - 2015-11-30 13:23 -0800
                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-11-30 22:32 +0000
                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Keith Thompson <kst-u@mib.org> - 2015-11-30 15:10 -0800
                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-30 21:05 -0600
                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-01 12:38 +0000
                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-01 14:43 +0000
                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-01 12:09 -0800
                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-12-02 09:14 +1300
                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-01 12:27 -0800
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-12-02 10:14 +1300
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-01 18:01 -0600
                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-01 20:41 +0000
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Keith Thompson <kst-u@mib.org> - 2015-12-01 12:53 -0800
                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-01 21:32 +0000
                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Keith Thompson <kst-u@mib.org> - 2015-12-01 13:55 -0800
                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. raltbos@xs4all.nl (Richard Bos) - 2015-12-04 10:30 +0000
                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-01 18:46 -0600
                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Say, what? <nothing@nowhere.nohow> - 2015-12-01 14:07 -0800
                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-01 23:54 +0000
                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. Say, what? <nothing@nowhere.nohow> - 2015-12-01 17:13 -0800
                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Martin Shobe <martin.shobe@yahoo.com> - 2015-12-01 09:08 -0600
                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-01 20:02 +0000
                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Martin Shobe <martin.shobe@yahoo.com> - 2015-12-01 17:03 -0600
                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-02 00:17 +0000
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-01 16:53 -0800
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Martin Shobe <martin.shobe@yahoo.com> - 2015-12-01 21:17 -0600
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 09:37 -0600
                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. James Kuyper <jameskuyper@verizon.net> - 2015-12-02 10:59 -0500
                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-02 17:43 +0000
                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 13:22 -0600
                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-12-03 09:32 +1300
                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-02 21:12 +0000
                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-12-03 10:36 +1300
                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-02 22:00 +0000
                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 17:55 -0600
                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-02 17:04 -0800
                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-03 01:11 +0000
                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-12-03 14:19 +1300
                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 23:16 -0600
                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 00:54 -0600
                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 04:07 -0800
                                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-03 18:31 +0000
                                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. Eric Sosman <esosman@comcast-dot-net.invalid> - 2015-12-03 13:59 -0500
                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-03 19:45 +0000
                                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-03 14:38 -0800
                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-03 22:43 +0000
                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-03 12:14 +0000
                                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Heathfield <rjh@cpax.org.uk> - 2015-12-03 12:38 +0000
                                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-03 13:19 +0000
                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 05:54 -0800
                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. raltbos@xs4all.nl (Richard Bos) - 2015-12-04 10:50 +0000
                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Heathfield <rjh@cpax.org.uk> - 2015-12-03 14:26 +0000
                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 09:19 -0600
                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. David Brown <david.brown@hesbynett.no> - 2015-12-03 16:25 +0100
                                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Heathfield <rjh@cpax.org.uk> - 2015-12-03 15:33 +0000
                                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. David Brown <david.brown@hesbynett.no> - 2015-12-03 16:47 +0100
                                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Heathfield <rjh@cpax.org.uk> - 2015-12-03 16:54 +0000
                                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-03 09:32 -0800
                                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. David Brown <david.brown@hesbynett.no> - 2015-12-03 18:53 +0100
                                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Steve Thompson <stevet810@gmail.com> - 2015-12-03 19:00 +0000
                                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. David Brown <david.brown@hesbynett.no> - 2015-12-04 14:07 +0100
                                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Steve Thompson <stevet810@gmail.com> - 2015-12-04 18:41 +0000
                                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. David Brown <david.brown@hesbynett.no> - 2015-12-05 16:09 +0100
                                                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Steve Thompson <stevet810@gmail.com> - 2015-12-05 21:15 +0000
                                                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. David Brown <david.brown@hesbynett.no> - 2015-12-06 12:35 +0100
                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Keith Thompson <kst-u@mib.org> - 2015-12-03 09:02 -0800
                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-03 19:12 +0000
                                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 16:58 -0600
                                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. David Brown <david.brown@hesbynett.no> - 2015-12-03 15:47 +0100
                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Heathfield <rjh@cpax.org.uk> - 2015-12-03 14:51 +0000
                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. David Brown <david.brown@hesbynett.no> - 2015-12-03 16:50 +0100
                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. raltbos@xs4all.nl (Richard Bos) - 2015-12-04 10:55 +0000
                                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 08:56 -0600
                                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 05:24 -0800
                                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-12-04 08:49 +1300
                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-03 07:07 -0800
                                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 10:27 -0600
                                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-03 09:01 -0800
                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. fir <profesor.fir@gmail.com> - 2015-12-03 10:16 -0800
                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-12-04 01:21 +0100
                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 16:42 -0800
                                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. David Brown <david.brown@hesbynett.no> - 2015-12-04 11:15 +0100
                                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-12-08 01:57 +0100
                                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. David Brown <david.brown@hesbynett.no> - 2015-12-08 09:08 +0100
                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 09:44 -0600
                                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Heathfield <rjh@cpax.org.uk> - 2015-12-04 15:58 +0000
                                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 11:43 -0600
                                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Geoff <geoff@invalid.invalid> - 2015-12-04 10:56 -0800
                                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Keith Thompson <kst-u@mib.org> - 2015-12-04 11:20 -0800
                                                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 15:24 -0600
                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 09:30 -0600
                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Heathfield <rjh@cpax.org.uk> - 2015-12-04 15:52 +0000
                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-04 09:07 -0800
                                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-04 09:53 -0800
                                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-04 10:56 -0800
                                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 15:04 -0600
                                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-04 21:32 +0000
                                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-04 13:38 -0800
                                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 16:13 -0600
                                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-04 16:21 -0800
                                                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 19:10 -0600
                                                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. Geoff <geoff@invalid.invalid> - 2015-12-04 19:16 -0800
                                                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-04 21:19 -0800
                                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-05 12:44 -0600
                                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-06 09:01 -0800
                                                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-06 12:34 -0600
                                                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-06 18:32 -0800
                                                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-07 10:43 -0600
                                                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-07 10:02 -0800
                                                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 03:53 -0800
                                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-05 09:39 -0800
                                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-05 18:36 +0000
                                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-05 12:26 -0600
                                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 11:36 -0800
                                                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Udyant Wig <udyantw@gmail.com> - 2015-12-06 16:42 +0530
                                                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-06 03:59 -0800
                                                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Robert Wessel <robertwessel2@yahoo.com> - 2015-12-07 02:17 -0600
                                                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-07 07:33 -0800
                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. fir <profesor.fir@gmail.com> - 2015-12-03 03:57 -0800
                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-12-04 00:58 +0100
                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-03 01:34 +0000
                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-03 11:38 +0000
                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-03 14:09 +0000
                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 10:10 -0600
                                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 08:28 -0800
                                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-03 21:33 +0000
                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Heathfield <rjh@cpax.org.uk> - 2015-12-02 21:47 +0000
                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 16:05 -0600
                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Keith Thompson <kst-u@mib.org> - 2015-12-02 14:12 -0800
                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-02 22:47 +0000
                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-12-03 14:00 +1300
                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 01:38 -0600
                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 02:20 -0800
                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. raltbos@xs4all.nl (Richard Bos) - 2015-12-04 10:40 +0000
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Nobody <nobody@nowhere.invalid> - 2015-12-03 02:42 +0000
                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Damon <Richard@Damon-Family.org> - 2015-12-01 20:48 -0500
                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-02 12:08 +0000
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-02 04:21 -0800
                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-02 14:05 +0000
                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-12-04 01:31 +0100
                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-02 14:23 +0000
                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-02 08:00 -0800
                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-02 16:49 +0000
                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-02 11:50 -0800
                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-02 20:02 +0000
                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-02 12:31 -0800
                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-03 01:43 +0000
                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. Keith Thompson <kst-u@mib.org> - 2015-12-02 09:21 -0800
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Damon <Richard@Damon-Family.org> - 2015-12-02 07:29 -0500
                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-02 05:47 -0800
                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 11:03 -0600
                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-02 14:16 +0000
                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-12-03 09:56 +1300
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 13:49 -0600
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Philip Lantz <prl@canterey.us> - 2015-12-02 22:11 -0800
                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 15:06 -0600
                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Jorgen Grahn <grahn+nntp@snipabacken.se> - 2015-11-30 22:14 +0000
              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-29 23:03 -0600
                Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 06:26 +0100
                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. Keith Thompson <kst-u@mib.org> - 2015-11-30 00:39 -0800
                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-11-30 01:57 -0800
        Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-29 15:32 +0100
    Re: Working efficiently with 32-bit Unicode output streams, locale etc. fir <profesor.fir@gmail.com> - 2015-12-02 09:58 -0800



#77735

From: Malcolm McLean <malcolm.mclean5@btinternet.com>
Date: 2015-12-03 05:24 -0800
Message-ID: <d77c91ab-e992-4af8-aae9-5e37f82b21c4@googlegroups.com>
In reply to: #77729
On Thursday, December 3, 2015 at 12:14:59 PM UTC, Bart wrote:
> On 03/12/2015 06:54, Stephen Sprunk wrote:
>
> But if my approach ends up working well /for most practical purposes/, 
> then why shouldn't it work elsewhere?
> 
> (I believe many things are far more complex than they need to be. For 
> example, DOCX, the de facto standard for describing written documents, 
> is over 5000 pages! All I'm interested in is specifying Bold, Italic, 
> Underline, Font, Size and Colour -- a mark-up scheme for those surely 
> wouldn't take more than one page. So what are the over 4999 pages about?
> 
> Maybe Unicode suffers from the same problem.)
> 
Unicode is designed to support everything to do with electronic text, at least
as far as meaning goes. It doesn't try to handle presentation or layout, except
at the glyph level.
A huge number of applications never use anything more than the ASCII subset,
and only a few people have to deal with things like Hebrew religious texts.
However, the users of such texts are extremely fussy about every jot and tittle;
if a job does involve laying out such texts, the customers will certainly
complain if even one dot is in the wrong place.



#77776 — Re: Working efficiently with 32-bit Unicode output streams, locale etc.

From: Ian Collins <ian-news@hotmail.com>
Date: 2015-12-04 08:49 +1300
Subject: Re: Working efficiently with 32-bit Unicode output streams, locale etc.
Message-ID: <dcbklqFi96mU11@mid.individual.net>
In reply to: #77729
BartC wrote:
>
> (I believe many things are far more complex than they need to be. For
> example, DOCX, the de facto standard for describing written documents,
> is over 5000 pages! All I'm interested in is specifying Bold, Italic,
> Underline, Font, Size and Colour -- a mark-up scheme for those surely
> wouldn't take more than one page. So what are the over 4999 pages about?

Surely the de facto standard for describing written documents is ODF? 
At under 900 pages it is a waif for a specification...

-- 
Ian Collins



#77748

From: supercat@casperkitty.com
Date: 2015-12-03 07:07 -0800
Message-ID: <58d30051-446a-47fa-93b3-8657602ceef6@googlegroups.com>
In reply to: #77709
On Thursday, December 3, 2015 at 12:54:34 AM UTC-6, Stephen Sprunk wrote:
> Like it or not, combining characters are here to stay, and they are used
> in _many_ scripts where no precomposed forms exist, or at least not for
> every possible combination.  Note that multiple combining characters can
> be applied to the same base character, e.g. à̖̃ē̘̂i̡̇̈̅o̤̊̋ǔ̻̎.  That is actually
> necessary to write some languages, e.g. Vietnamese, and is also useful
> for some transliteration systems, e.g. Pinyin.

I don't mind combining diacritics because they can be thought of as
simply being characters with an escapement width of zero that extend
outside their character box.  Of course to achieve good-looking rendering
one should try to have kerning pairs and ligatures for common combinations,
but one could achieve a clumsy-looking-but-decipherable rendering without
such things.
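
The "zero escapement width" view can be sketched in C. This is only an
illustration under stated assumptions: the text is already decoded into an
array of code points, and only the Combining Diacritical Marks block
(U+0300 to U+036F) is treated as zero-width, whereas a real renderer would
consult the full Unicode character database. The names
is_combining_diacritic and count_advancing are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true only for the Combining Diacritical Marks
   block. Other blocks also hold combining marks; this sketch ignores them. */
int is_combining_diacritic(uint32_t cp)
{
    return cp >= 0x0300 && cp <= 0x036F;
}

/* Count the code points that advance the pen, i.e. everything except
   the zero-width combining marks recognised above. */
size_t count_advancing(const uint32_t *cps, size_t n)
{
    size_t cells = 0;

    assert(cps != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!is_combining_diacritic(cps[i]))
            cells++;
    }
    return cells;
}
```

With the sequence a, U+0300, U+0303, e, U+0304 the function counts two
advancing characters, which matches the clumsy-but-decipherable rendering
described above: each mark is simply overprinted on the preceding base
letter.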

The real fun comes with characters like Hebrew letters which are generally
set right-to-left.

> In some cases, Æ is merely a ligature that decomposes to AE (or maybe
> Ae) and reversed as EA (or maybe eA); in others, it's a distinct letter
> and not reversed at all.  There is no way that a generic library
> function could reliably determine the correct action; I doubt many, if
> any, humans could reliably get it right either.

IMHO, Unicode should have split many characters which have multiple meanings
into multiple code points, such that something like the AE would have at
least three forms:

1. A letter

2. A ligature of AE

3. The glyph "AE", from an origin which does not make clear which of the
   above it should be interpreted as.

Likewise, an "uppercase Turkish dotless 'i'" and a "lowercase Turkish
dotted 'I'" should be distinct characters from the Latin "I" and "i",
respectively, so as to make uppercase/lowercase conversions
invertible, and scripts like Hebrew should have separate right-to-left
and left-to-right forms, so that mathematicians who are writing about
Aleph Null within left-to-right text could use the left-to-right form and
not have to worry that pasting in a Hebrew aleph character would do
weird things to the text rendering.



#77761

From: Stephen Sprunk <stephen@sprunk.org>
Date: 2015-12-03 10:27 -0600
Message-ID: <n3pqd8$d24$1@dont-email.me>
In reply to: #77748
On 03-Dec-15 09:07, supercat@casperkitty.com wrote:
> Stephen Sprunk wrote:
>> Like it or not, combining characters are here to stay, and they are
>> used in _many_ scripts where no precomposed forms exist, or at
>> least not for every possible combination.  Note that multiple
>> combining characters can be applied to the same base character,
>> e.g. à̖̃ē̘̂i̡̇̈̅o̤̊̋ǔ̻̎.  That is actually necessary to write some languages,
>> e.g. Vietnamese, and is also useful for some transliteration
>> systems, e.g. Pinyin.
> 
> I don't mind combining diacritics because they can be thought of as 
> simply being characters with an escapement width of zero that extend 
> outside their character box.  Of course to achieve good-looking
> rendering one should try to have kerning pairs and ligatures for
> common combinations, but one could achieve a
> clumsy-looking-but-decipherable rendering without such things.
> 
> The real fun comes with characters like Hebrew letters which are
> generally set right-to-left.

Actually, if you are _only_ working with RTL scripts, it's very simple:
invert the X axis of the entire display!  Note: you must counter-invert
the glyphs or they'll look backward.

The challenge comes when using RTL and LTR scripts together; luckily for
most of us, modern OSes handle all of that, so all we have to do is put
the right code points in your strings and each will be rendered in the
correct direction automatically--even with multiple quoting levels.  The
cursor's behavior seems bizarre when you first try to edit such text,
but those who work with RTL languages are used to it by now, and the
rest of us can simply ignore it.

>> In some cases, Æ is merely a ligature that decomposes to AE (or
>> maybe Ae) and reversed as EA (or maybe eA); in others, it's a
>> distinct letter and not reversed at all.  There is no way that a
>> generic library function could reliably determine the correct
>> action; I doubt many, if any, humans could reliably get it right
>> either.
> 
> IMHO, Unicode should have split many characters which have multiple
> meanings into multiple code points, such that something like the AE
> would have at least three forms:
> 
> 1. A letter
> 
> 2. A ligature of AE
> 
> 3. The glyph "AE", from an origin which does not make clear which of
> the above it should be interpreted as.

That is essentially what Unicode did.

> Likewise, an "uppercase Turkish dotless 'i'" and a
> "lowercase Turkish dotted 'I'" should be distinct characters from
> the Latin "I" and "i", respectively,

There are code points for "İ" and "ı".  It's just that, in a Turkish
context, tolower("I") is "ı" and toupper("i") is "İ".  That was more
efficient than creating entirely new characters that look the same as
but behave different from "I" and "i".
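
A minimal sketch of that context-dependent mapping, working on code points
rather than on C strings. The function names and the `turkish` flag are
hypothetical; the flag stands for whatever language information the caller
has about the text. Only ASCII letters and the two Turkish pairs are
handled, so this is nowhere near a full case-mapping implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Upper-case one code point. `turkish` is a hypothetical flag the caller
   sets from the language of the surrounding text. */
uint32_t to_upper_cp(uint32_t cp, int turkish)
{
    assert(cp <= 0x10FFFF);
    if (turkish) {
        if (cp == 0x0069) return 0x0130;  /* i -> capital I with dot above */
        if (cp == 0x0131) return 0x0049;  /* dotless i -> I */
    }
    if (cp >= 0x0061 && cp <= 0x007A)     /* ASCII a-z only, for brevity */
        return cp - 0x20;
    return cp;
}

/* Lower-case one code point, with the same assumptions as above. */
uint32_t to_lower_cp(uint32_t cp, int turkish)
{
    assert(cp <= 0x10FFFF);
    if (turkish) {
        if (cp == 0x0049) return 0x0131;  /* I -> dotless i */
        if (cp == 0x0130) return 0x0069;  /* capital I with dot above -> i */
    }
    if (cp >= 0x0041 && cp <= 0x005A)     /* ASCII A-Z only, for brevity */
        return cp + 0x20;
    return cp;
}
```

Within one language the conversions round-trip: with the flag set,
to_lower_cp(to_upper_cp('i', 1), 1) gives 'i' again. The result is only
wrong when the flag does not match the language of the text being
converted.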

More importantly, Turkish in ISO-8859-x used the same "I" and "i" as
other Latin languages, so there would be no way to reliably map them to
different "I" and "i" code points in Unicode.

> scripts like Hebrew should have separate right-to-left and
> left-to-right forms, so that mathematicians who are writing about
> Aleph Null within left-to-right text could use the left-to-right
> form

There's a different code point for mathematical Aleph, which looks the
same as the Hebrew Aleph but has LTR behavior.
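
For reference, the two code points are presumably U+05D0 HEBREW LETTER ALEF
and U+2135 ALEF SYMBOL (the names are not given above, so that
identification is an assumption). A small encoder, with the hypothetical
name utf8_encode, shows that they are different byte sequences on the wire
even though the glyphs look alike:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { HEBREW_LETTER_ALEF = 0x05D0, ALEF_SYMBOL = 0x2135 };

/* Encode one code point as UTF-8 into out[0..3]. Returns the number of
   bytes written, or 0 for surrogates and values above U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                          /* surrogates are not encodable */
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                              /* beyond the Unicode range */
}
```

The Hebrew letter comes out as the two bytes D7 90 and the mathematical
symbol as the three bytes E2 84 B5.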

> and not have to worry that pasting in a Hebrew aleph character
> would do weird things to the text rendering.

It wouldn't anyway, with proper Unicode support.

S

-- 
Stephen Sprunk         "God does not play dice."  --Albert Einstein
CCIE #3723         "God is an inveterate gambler, and He throws the
K5SSS        dice at every possible opportunity." --Stephen Hawking



#77764

From: supercat@casperkitty.com
Date: 2015-12-03 09:01 -0800
Message-ID: <85d0b9ab-47a0-4459-bd59-f04c67d3a04d@googlegroups.com>
In reply to: #77761
On Thursday, December 3, 2015 at 10:27:49 AM UTC-6, Stephen Sprunk wrote:
> The challenge comes when using RTL and LTR scripts together; luckily for
> most of us, modern OSes handle all of that, so all we have to do is put
> the right code points in your strings and each will be rendered in the
> correct direction automatically--even with multiple quoting levels.  The
> cursor's behavior seems bizarre when you first try to edit such text,
> but those who work with RTL languages are used to it by now, and the
> rest of us can simply ignore it.

Unicode includes direction-switching characters, though, which seem more
like markup than characters.
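
To make the "markup" point concrete, here is a sketch that recognises the
explicit directional formatting characters and strips them from a buffer of
code points. The function names are hypothetical, and whether stripping is
appropriate depends entirely on the application; the code point values are
the ones Unicode assigns to these controls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* True for the explicit bidirectional formatting characters. */
int is_bidi_control(uint32_t cp)
{
    return cp == 0x061C                      /* ARABIC LETTER MARK */
        || cp == 0x200E || cp == 0x200F      /* LRM, RLM */
        || (cp >= 0x202A && cp <= 0x202E)    /* LRE, RLE, PDF, LRO, RLO */
        || (cp >= 0x2066 && cp <= 0x2069);   /* LRI, RLI, FSI, PDI */
}

/* Remove the controls in place and return the new length. */
size_t strip_bidi_controls(uint32_t *cps, size_t n)
{
    size_t kept = 0;

    assert(cps != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!is_bidi_control(cps[i]))
            cps[kept++] = cps[i];
    }
    return kept;
}
```

None of these code points has a glyph of its own; they only change how the
neighbouring characters are ordered on screen, which is why they read more
like markup than like text.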

> > 1. A letter
> > 
> > 2. A ligature of AE
> > 
> > 3. The glyph "AE", from an origin which does not make clear which of
> > the above it should be interpreted as.
> 
> That is essentially what Unicode did.

I'm not sure about AE in particular, but there are a lot of places where
things got merged into one.

> > Likewise, an "uppercase Turkish dotless 'i'" and a
> > "lowercase Turkish dotted 'I'" should be distinct characters from
> > the Latin "I" and "i", respectively,
> 
> There are code points for "İ" and "ı".  It's just that, in a Turkish
> context, tolower("I") is "ı" and toupper("i") is "İ".  That was more
> efficient than creating entirely new characters that look the same as
> but behave different from "I" and "i".

If someone has some Turkish text which contains an English-language
quotation and wishes to set the whole thing in upper-case, that would be
a lot easier if the English "i" and Turkish "i" were distinct characters.
I'm not sure how one would hope to solve that problem if they're not.

> More importantly, Turkish in ISO-8859-x used the same "I" and "i" as
> other Latin languages, so there would be no way to reliably map them to
> different "I" and "i" code points in Unicode.

Use the philosophy indicated above: recognize distinct forms for something
which is known to be a Turkish "I", something which is known to be a 
Latin "I" regardless of context, and the ASCII "I", whose meaning would
depend upon context.

> There's a different code point for mathematical Aleph, which looks the
> same as the Hebrew Aleph but has LTR behavior.

Hmm... somehow I missed that.  Still, the same issues may arise if one is
trying to talk about a list of individual characters in a script whose
direction is opposite the primary one, since things like commas and blanks
change their direction based upon context.

I should mention another spot where Unicode seems to have been subject to
inconsistent imperatives: one of the design goals of UTF-8 was to prevent
ASCII strings from "hiding" in UTF-8 strings.  Adding things like RTL/LTR
markers to Unicode creates opportunities for strings to slip through one
level of filtering into text which will get "normalized", and then appear
in the normalized form even though they had not appeared in the original.
While there is some benefit to certain kinds of markers and normalization,
they undermine the aforementioned design goal.
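
On the UTF-8 side, that goal is met by forbidding "overlong" forms, so that
an ASCII character has exactly one valid encoding. A sketch of a validator
that enforces this follows; the name utf8_is_valid is hypothetical and the
code is illustrative rather than tuned. It says nothing about what a later
normalization pass might do to the decoded text.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if buf[0..len) is well-formed UTF-8, otherwise 0. */
int utf8_is_valid(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    assert(buf != NULL || len == 0);
    while (i < len) {
        unsigned char b = buf[i];
        size_t extra;
        unsigned long cp, min;

        if (b < 0x80) { i++; continue; }            /* plain ASCII */
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; min = 0x10000; }
        else return 0;               /* stray continuation or bad lead byte */

        if (len - i - 1 < extra)
            return 0;                               /* truncated sequence */
        for (size_t k = 1; k <= extra; k++) {
            unsigned char c = buf[i + k];
            if ((c & 0xC0) != 0x80)
                return 0;                           /* not a continuation */
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min)
            return 0;                /* overlong: a shorter form exists */
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;                /* out of range or surrogate */
        i += extra + 1;
    }
    return 1;
}
```

The two-byte sequence C0 AF would decode to '/' if overlong forms were
accepted; the `cp < min` check rejects it, so a filter looking for the byte
2F cannot be bypassed that way.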

> > and not have to worry that pasting in a Hebrew aleph character
> > would do weird things to the text rendering.
> 
> It wouldn't anyway, with proper Unicode support.

Strings which contain bidirectional characters behave oddly in the text
editors I've seen, and I think such oddities occur because of proper
Unicode support, rather than the lack of same.

BTW, I wonder why Unicode hasn't long ago added Latin small caps?  The
proper way of writing the time one hour after noon isn't 1:00p.m. nor
1:00P.M. but should instead use small caps; I would consider that to be
a semantic rather than typographical distinction as the letters should
be neither uppercase nor lowercase.



#77770

From: fir <profesor.fir@gmail.com>
Date: 2015-12-03 10:16 -0800
Message-ID: <e2096649-eb05-4a6d-899d-83748c5802fe@googlegroups.com>
In reply to: #77764
On Thursday, 3 December 2015 at 18:02:06 UTC+1, supe...@casperkitty.com wrote:
> On Thursday, December 3, 2015 at 10:27:49 AM UTC-6, Stephen Sprunk wrote:
> > The challenge comes when using RTL and LTR scripts together; luckily for
> > most of us, modern OSes handle all of that, so all we have to do is put
> > the right code points in your strings and each will be rendered in the
> > correct direction automatically--even with multiple quoting levels.  The
> > cursor's behavior seems bizarre when you first try to edit such text,
> > but those who work with RTL languages are used to it by now, and the
> > rest of us can simply ignore it.
> 
> Unicode includes direction-switching characters, though, which seem more
> like markup than characters.
> 
I wonder: if characters in some language are written from right to left,
how are they stored in memory? Also right to left, i.e. downward (which is
a bit incompatible with C, and maybe should be mended for symmetry)? I'm
not sure what would be needed here, as I have literally never considered
this 'downward storage' (except for mentioning it briefly in some BE/LE
thread a year or so ago).



#77789

From: "Morten W. Petersen" <morphex@gmail.com>
Date: 2015-12-04 01:21 +0100
Message-ID: <n3qm9p$4l2$1@speranza.aioe.org>
In reply to: #77764
On 03.12.2015 18:01, supercat@casperkitty.com wrote:
> On Thursday, December 3, 2015 at 10:27:49 AM UTC-6, Stephen Sprunk wrote:
[...]
>> There are code points for "İ" and "ı".  It's just that, in a Turkish
>> context, tolower("I") is "ı" and toupper("i") is "İ".  That was more
>> efficient than creating entirely new characters that look the same as
>> but behave different from "I" and "i".
>
> If someone has some Turkish text which contains an English-language
> quotation and wishes to set the whole thing in upper-case, that would be
> a lot easier if the English "i" and Turkish "i" were distinct characters.
> I'm not sure how one would hope to solve that problem if they're not.
>
>> More importantly, Turkish in ISO-8859-x used the same "I" and "i" as
>> other Latin languages, so there would be no way to reliably map them to
>> different "I" and "i" code points in Unicode.
>
> Use the philosophy indicated above: recognize distinct forms for something
> which is known to be a Turkish "I", something which is known to be a
> Latin "I" regardless of context, and the ASCII "I", whose meaning would
> depend upon context.

Well in XML you can specify a segment of text as having a given
language, and with that, the problem is solved.

In English you have I as well, I don't know if "I" is called a word?

I don't think it's possible to work with documents with different
languages, unless these differing segments of texts specify their
language.

-Morten



#77791

From: Malcolm McLean <malcolm.mclean5@btinternet.com>
Date: 2015-12-03 16:42 -0800
Message-ID: <72171e07-2e4f-447f-9f4a-05a54d21fbbd@googlegroups.com>
In reply to: #77789
On Friday, December 4, 2015 at 12:21:27 AM UTC, Morten W. Petersen wrote:
> 
> In English you have I as well, I don't know if "I" is called a word?
>
I is considered a word, but actually rules are a bit fuzzy. In
spoken speech there's not much difference between a word and an
inflection.
It must never be written in lower case, which is one of the little
quirks of English. 
>
> I don't think it's possible to work with documents with different
> languages, unless these differing segments of texts specify their
> language.
> 
You don't normally need to know what language something is in
to make sense of it, assuming that you can read the language,
of course. And actually in a lot of contexts, you get a certain
amount of mixing. I've seen olive oil in Israel advertised as
"Olive oil" spelled out in Hebrew letters. "Olive" (zeitim) is
of course one of the first Hebrew words everyone learns. But it
was in a New Agey themed shop selling things with an alternative
culture theme, I suppose they thought English sounded more 
olde worldy and mystical.



#77800

From: David Brown <david.brown@hesbynett.no>
Date: 2015-12-04 11:15 +0100
Message-ID: <n3rovu$vqm$1@dont-email.me>
In reply to: #77791
On 04/12/15 01:42, Malcolm McLean wrote:
> On Friday, December 4, 2015 at 12:21:27 AM UTC, Morten W. Petersen wrote:
>>
>> In English you have I as well, I don't know if "I" is called a word?
>>
> I is considered a word, but actually rules are a bit fuzzy. In
> spoken speech there's not much difference between a word and an
> inflection.
> It must never be written in lower case, which is one of the little
> quirks of English. 
>>
>> I don't think it's possible to work with documents with different
>> languages, unless these differing segments of texts specify their
>> language.
>>
> You don't normally need to know what language something is in
> to make sense of it, assuming that you can read the language,
> of course. And actually in a lot of contexts, you get a certain
> amount of mixing. I've seen olive oil in Israel advertised as
> "Olive oil" spelled out in Hebrew letters. "Olive" (zeitim) is
> of course one of the first Hebrew words everyone learns. But it
> was in a New Agey themed shop selling things with an alternative
> culture theme, I suppose they thought English sounded more 
> olde worldy and mystical.
> 

In my family, we mix Norwegian and English all the time, both when
writing and speaking.  And in technical contexts in Norway it is not
uncommon to use some English terms, simply because there are no
appropriate Norwegian words or the words have not yet been absorbed into
the language.  You might even use Norwegian grammar (such as for
plurals) on the English words.

But if you want to do something automated with the languages, such as
spell-checking, or alternating direction, then obviously the sections
need to be marked in some way.



#78158

From: "Morten W. Petersen" <morphex@gmail.com>
Date: 2015-12-08 01:57 +0100
Message-ID: <n459sc$l5h$1@speranza.aioe.org>
In reply to: #77800
On 04.12.2015 11:15, David Brown wrote:
> On 04/12/15 01:42, Malcolm McLean wrote:
>> On Friday, December 4, 2015 at 12:21:27 AM UTC, Morten W. Petersen wrote:
[...]
>>> I don't think it's possible to work with documents with different
>>> languages, unless these differing segments of texts specify their
>>> language.
>>>
>> You don't normally need to know what language something is in
>> to make sense of it, assuming that you can read the language,
>> of course. And actually in a lot of contexts, you get a certain
>> amount of mixing. I've seen olive oil in Israel advertised as
>> "Olive oil" spelled out in Hebrew letters. "Olive" (zeitim) is
>> of course one of the first Hebrew words everyone learns. But it
>> was in a New Agey themed shop selling things with an alternative
>> culture theme, I suppose they thought English sounded more
>> olde worldy and mystical.
>>
>
> In my family, we mix Norwegian and English all the time, both when
> writing and speaking.  And in technical contexts in Norway it is not
> uncommon to use some English terms, simply because there are no
> appropriate Norwegian words or the words have not yet been absorbed into
> the language.  You might even use Norwegian grammar (such as for
> plurals) on the English words.
>
> But if you want to do something automated with the languages, such as
> spell-checking, or alternating direction, then obviously the sections
> need to be marked in some way.

Yeah that was what I was thinking, automated handling.

But explicitness is nice in any case, even in documents where a whole
text has foreign-language words and phrases here and there.

-Morten



#78167

From: David Brown <david.brown@hesbynett.no>
Date: 2015-12-08 09:08 +0100
Message-ID: <n4630g$v2r$1@dont-email.me>
In reply to: #78158
On 08/12/15 01:57, Morten W. Petersen wrote:
> On 04.12.2015 11:15, David Brown wrote:

>>
>> In my family, we mix Norwegian and English all the time, both when
>> writing and speaking.  And in technical contexts in Norway it is not
>> uncommon to use some English terms, simply because there are no
>> appropriate Norwegian words or the words have not yet been absorbed into
>> the language.  You might even use Norwegian grammar (such as for
>> plurals) on the English words.
>>
>> But if you want to do something automated with the languages, such as
>> spell-checking, or alternating direction, then obviously the sections
>> need to be marked in some way.
> 
> Yeah that was what I was thinking, automated handling.
> 
> But explicitness is nice in any case, even in documents where a whole
> text has foreign-language words and phrases here and there.
> 

If you want automated handling, the language has to be stated explicitly
in the text format.  Humans can usually figure out which language is
being used (from the languages that they understand) - software, as yet,
hasn't a chance.



#77830

From: Stephen Sprunk <stephen@sprunk.org>
Date: 2015-12-04 09:44 -0600
Message-ID: <n3sc7s$bn3$1@dont-email.me>
In reply to: #77789
On 03-Dec-15 18:21, Morten W. Petersen wrote:
> On 03.12.2015 18:01, supercat@casperkitty.com wrote:
>> On Thursday, December 3, 2015 at 10:27:49 AM UTC-6, Stephen Sprunk
>>> More importantly, Turkish in ISO-8859-x used the same "I" and "i"
>>> as other Latin languages, so there would be no way to reliably
>>> map them to different "I" and "i" code points in Unicode.
>> 
>> Use the philosophy indicated above: recognize distinct forms for 
>> something which is known to be a Turkish "I", something which is
>> known to be a Latin "I" regardless of context, and the ASCII "I",
>> whose meaning would depend upon context.
> 
> Well in XML you can specify a segment of text as having a given 
> language, and with that, the problem is solved.

Ditto for HTML and many application-specific formats.

> In English you have I as well, I don't know if "I" is called a word?

If you mean the first-person pronoun in nominative (aka subjective)
case, then yes; otherwise, it's just another letter.  The former is
AFAIK unique in that it's the only word, other than proper nouns (or
words derived therefrom), that is always capitalized, and it's the only
word that is therefore always in all-caps since it's a single letter.

English is weird; I pity those who learn it as a second language.

S

-- 
Stephen Sprunk         "God does not play dice."  --Albert Einstein
CCIE #3723         "God is an inveterate gambler, and He throws the
K5SSS        dice at every possible opportunity." --Stephen Hawking



#77832

From: Richard Heathfield <rjh@cpax.org.uk>
Date: 2015-12-04 15:58 +0000
Message-ID: <n3sd1o$f2f$1@dont-email.me>
In reply to: #77830
On 04/12/15 15:44, Stephen Sprunk wrote:
> On 03-Dec-15 18:21, Morten W. Petersen wrote:
<snip>
>
>> In English you have I as well, I don't know if "I" is called a word?
>
> If you mean the first-person pronoun in nominative (aka subjective)
> case, then yes; otherwise, it's just another letter.  The former is
> AFAIK unique in that it's the only word, other than proper nouns (or
> words derived therefrom), that is always capitalized, and it's the only
> word that is therefore always in all-caps since it's a single letter.

I'm not sure, but I think "O" might also qualify.

> English is weird; I pity those who learn it as a second language.

I have had the honour of becoming acquainted with quite a few Poles. 
Without exception, they pity those who learn Polish as a /first/ language!

-- 
Richard Heathfield
Email: rjh at cpax dot org dot uk
"Usenet is a strange place" - dmr 29 July 1999
Sig line 4 vacant - apply within



#77839

From: Stephen Sprunk <stephen@sprunk.org>
Date: 2015-12-04 11:43 -0600
Message-ID: <n3sj8d$a4m$1@dont-email.me>
In reply to: #77832
On 04-Dec-15 09:58, Richard Heathfield wrote:
> On 04/12/15 15:44, Stephen Sprunk wrote:
>> On 03-Dec-15 18:21, Morten W. Petersen wrote:
>>> In English you have I as well, I don't know if "I" is called a
>>> word?
>> 
>> If you mean the first-person pronoun in nominative (aka
>> subjective) case, then yes; otherwise, it's just another letter.
>> The former is AFAIK unique in that it's the only word, other than
>> proper nouns (or words derived therefrom), that is always
>> capitalized, and it's the only word that is therefore always in
>> all-caps since it's a single letter.
> 
> I'm not sure, but I think "O" might also qualify.

The only examples of "O" as a word that I can find are in verse ("O
Canada", "O Captain, My Captain", etc.), where the normal rules for
words, spelling, etc. are often bent to fit the author's meter, rhyme,
space, etc. needs.  So, I don't think that appearing in such contexts
alone is enough to establish it as a real word.

Every dictionary I checked but one lists "O" with no meaning other than
as an abbreviation for various things or, like every other letter, as
the letter/sound itself.  The exception lists "O" as a synonym for
"Old", but seems more an abbreviation than a distinct word; I see
nothing remotely similar in others, and I'm not sure that's what it
means in the verse examples either.

S

-- 
Stephen Sprunk         "God does not play dice."  --Albert Einstein
CCIE #3723         "God is an inveterate gambler, and He throws the
K5SSS        dice at every possible opportunity." --Stephen Hawking



#77842

From: Geoff <geoff@invalid.invalid>
Date: 2015-12-04 10:56 -0800
Message-ID: <b9o36b12vhuva6i0k892lq4o8hvq822gke@4ax.com>
In reply to: #77839

On Fri, 4 Dec 2015 11:43:58 -0600, Stephen Sprunk <stephen@sprunk.org>
wrote:

>On 04-Dec-15 09:58, Richard Heathfield wrote:
>> On 04/12/15 15:44, Stephen Sprunk wrote:
>>> On 03-Dec-15 18:21, Morten W. Petersen wrote:
>>>> In English you have I as well, I don't know if "I" is called a
>>>> word?
>>> 
>>> If you mean the first-person pronoun in nominative (aka
>>> subjective) case, then yes; otherwise, it's just another letter.
>>> The former is AFAIK unique in that it's the only word, other than
>>> proper nouns (or words derived therefrom), that is always
>>> capitalized, and it's the only word that is therefore always in
>>> all-caps since it's a single letter.
>> 
>> I'm not sure, but I think "O" might also qualify.
>
>The only examples of "O" as a word that I can find are in verse ("O
>Canada", "O Captain, My Captain", etc.), where the normal rules for
>words, spelling, etc. are often bent to fit the author's meter, rhyme,
>space, etc. needs.  So, I don't think that appearing in such contexts
>alone is enough to establish it as a real word.
>
>Every dictionary I checked but one lists "O" with no meaning other than
>as an abbreviation for various things or, like every other letter, as
>the letter/sound itself.  The exception lists "O" as a synonym for
>"Old", but seems more an abbreviation than a distinct word; I see
>nothing remotely similar in others, and I'm not sure that's what it
>means in the verse examples either.
>

The O in O Canada should be "Oh Canada" in English since the O should
have been the French Ô.



#77843

From: Keith Thompson <kst-u@mib.org>
Date: 2015-12-04 11:20 -0800
Message-ID: <ln1tb2qav6.fsf@kst-u.example.com>
In reply to: #77842

Geoff <geoff@invalid.invalid> writes:
[...]
> The O in O Canada should be "Oh Canada" in English since the O should
> have been the French Ô.

Why?  "O" is an English word, though it's not used very often.

See, for example:
    http://dictionary.reference.com/browse/o
and search the page for "interjection".

-- 
Keith Thompson (The_Other_Keith) kst-u@mib.org  <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"



#77850

From: Stephen Sprunk <stephen@sprunk.org>
Date: 2015-12-04 15:24 -0600
Message-ID: <n3t06b$om$1@dont-email.me>
In reply to: #77843

On 04-Dec-15 13:20, Keith Thompson wrote:
> Geoff <geoff@invalid.invalid> writes:
>> The O in O Canada should be "Oh Canada" in English since the O
>> should have been the French Ô.
> 
> Why?  "O" is an English word, though it's not used very often.
> 
> See, for example:
>     http://dictionary.reference.com/browse/o
> and search the page for "interjection".

O [oh]
interjection
1. (used before a name in direct address, especially in solemn or poetic
language, to lend earnestness to an appeal): Hear, O Israel!
2. (used as an expression of surprise, pain, annoyance, longing,
gladness, etc.)
noun, plural O's.
3. the exclamation “O.”.
Origin of O
1125-75; Middle English < Old French < Latin ō

But we also see:

oh [oh]
interjection
1. (used as an expression of surprise, pain, disapprobation, etc.)
2. (used in direct address to attract the attention of the person spoken
to): Oh, John, will you take these books?
noun, plural oh's, ohs.
3. the exclamation “oh.”.
verb (used without object)
4. to utter or exclaim “oh.”.
Origin of oh
later spelling of O, from mid-16th century

Those look pretty much the same to me, and the Origin section implies
that "O" is merely an archaic form of today's "oh", and it's not
surprising to encounter such words only in verse, particularly verse
like "O Come All Ye Faithful" that is obviously archaic from the use of
"Ye"--or should that be "Þe"?

S

-- 
Stephen Sprunk         "God does not play dice."  --Albert Einstein
CCIE #3723         "God is an inveterate gambler, and He throws the
K5SSS        dice at every possible opportunity." --Stephen Hawking



#77828

From: Stephen Sprunk <stephen@sprunk.org>
Date: 2015-12-04 09:30 -0600
Message-ID: <n3sbf2$8fc$1@dont-email.me>
In reply to: #77764

On 03-Dec-15 11:01, supercat@casperkitty.com wrote:
> Stephen Sprunk wrote:
>> The challenge comes when using RTL and LTR scripts together;
>> luckily for most of us, modern OSes handle all of that, so all we
>> have to do is put the right code points in your strings and each
>> will be rendered in the correct direction automatically--even with
>> multiple quoting levels.  The cursor's behavior seems bizarre when
>> you first try to edit such text, but those who work with RTL
>> languages are used to it by now, and the rest of us can simply
>> ignore it.
> 
> Unicode includes direction-switching characters, though, which seem
> more like markup than characters.

That's not exactly what they do.  A strong RTL mark will change the
direction of a weak LTR or non-directional character, but it won't
change the direction of a strong LTR character, and vice versa for a
strong LTR mark.  They are rarely needed because the default rendering
behavior is usually correct.

>>> 1. A letter
>>> 
>>> 2. A ligature of AE
>>> 
>>> 3. The glyph "AE", from an origin which does not make clear which
>>> of the above it should be interpreted as.
>> 
>> That is essentially what Unicode did.
> 
> I'm not sure about AE in particular, but there are a lot of places
> where things got merged into one.

To enable round-tripping, Unicode had to assign code points to all
precomposed characters and ligatures that were in existing code pages;
they haven't created new ones.  In fact, they assigned additional code
points to characters that are used in different ways, e.g. Ω (GREEK
CAPITAL LETTER OMEGA, U+03A9) and Ω (OHM SIGN, U+2126), even though they
almost certainly share the same glyph.

>>> Likewise, an "uppercase Turkish dotless 'i'" and a
>>> "lowercase Turkish dotted 'I'" should have distinct characters
>>> from the Latin "I" and "i", respectively,
>> 
>> There are code points for "İ" and "ı".  It's just that, in a
>> Turkish context, tolower("I") is "ı" and toupper("i") is "İ".  That
>> was more efficient than creating entirely new characters that look
>> the same as but behave different from "I" and "i".
> 
> If someone has some Turkish text which contains an English-language 
> quotation and wishes to set the whole thing in upper-case, that would
> be a lot easier if the English "i" and Turkish "i" were distinct
> characters. I'm not sure how one would hope to solve that problem if
> they're not.

MSWord solves it just fine: mark the English text as English and the
Turkish text as Turkish, and when you format the entire passage in ALL
CAPS, it applies the correct locale-specific conversions to each
portion.  That also applies to other functions, e.g. spell-check.

The same feature is how, when I write a document in US English and send
it to someone overseas, _their_ copy of MSWord will still treat it as US
English rather than whatever their local language is.  For instance, a
British colleague won't see red squiggles indicating that "colour" was
misspelled as "color", and a French colleague won't see red squiggles
indicating that _every_ word is misspelled (except perhaps a few that
are also correctly-spelled French words by happenstance).
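
The per-language conversion described above can be sketched in C. This is only an illustration, assuming the text has already been decoded to UTF-32 and split into runs that each carry a language tag; the names `enum lang`, `upper_cp` and `upper_run` are hypothetical, and only basic Latin letters plus the Turkish special cases are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical language tag attached to a run of text; not a standard API. */
enum lang { LANG_ENGLISH, LANG_TURKISH };

/* Uppercase one code point: basic Latin a-z plus the Turkish i/I rules. */
uint32_t upper_cp(uint32_t cp, enum lang lang)
{
    if (cp == 0x0131)                       /* dotless i (U+0131) -> I */
        return 'I';
    if (cp == 'i' && lang == LANG_TURKISH)  /* Turkish i -> U+0130 (dotted capital I) */
        return 0x0130;
    if (cp >= 'a' && cp <= 'z')
        return cp - ('a' - 'A');
    return cp;                              /* everything else is left alone */
}

/* Uppercase a UTF-32 run in place, using the language tag of that run. */
void upper_run(uint32_t *text, size_t len, enum lang lang)
{
    assert(text != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        text[i] = upper_cp(text[i], lang);
}
```

Calling `upper_run` once per tagged run gives an English quotation a plain "I" and the surrounding Turkish text U+0130, without needing separate code points for the two letters.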

Unicode _did_ have a way (language tags, U+E00xx) to do this in an
application-neutral way, but that is now deprecated since folks like you
didn't like the "markup" aspect of that mechanism.

>> More importantly, Turkish in ISO-8859-x used the same "I" and "i"
>> as other Latin languages, so there would be no way to reliably map
>> them to different "I" and "i" code points in Unicode.
> 
> Use the philosophy indicated above: recognize distinct forms for
> something which is known to be a Turkish "I", something which is
> known to be a Latin "I" regardless of context, and the ASCII "I",
> whose meaning would depend upon context.

There is no need.

>> There's a different code point for mathematical Aleph, which looks
>> the same as the Hebrew Aleph but has LTR behavior.
> 
> Hmm... somehow I missed that.  Still, the same issues may arise if
> one is trying to talk about a list of individual characters in a
> script whose direction is opposite the primary one, since things like
> commas and blanks change their direction based upon context.

Of course they do; they _should_ change their direction based on the
characters around them.  Otherwise, you'd need separate whitespace and
punctuation characters for use with RTL text, and all text converted
from legacy encodings would be messed up.

> I should mention another spot where Unicode seems to have been
> subject to inconsistent imperatives: one of the design goals of UTF-8
> was to prevent ASCII strings from "hiding" in UTF-8 strings.

What?  ASCII strings _are_ UTF-8 strings by definition; there is no way
for them to "hide".
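
One property relevant here is that UTF-8 never uses a byte in the range 0x00-0x7F inside a multi-byte sequence, so an ASCII byte in a UTF-8 string always stands for that ASCII character. A minimal sketch in C that checks this for a few code points; `utf8_encode` and `has_ascii_byte` are hypothetical helper names, not library functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one code point as UTF-8 and return the number of bytes written.
   The caller must pass a valid scalar value (no surrogates, <= U+10FFFF). */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    if (cp < 0x80) {                 /* ASCII: one byte, unchanged */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                /* two bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {              /* three bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    /* four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}

/* Return 1 if any of the n bytes lies in the ASCII range 0x00-0x7F. */
int has_ascii_byte(const unsigned char *bytes, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (bytes[i] < 0x80)
            return 1;
    return 0;
}
```

With these helpers, U+03A9 (GREEK CAPITAL LETTER OMEGA) encodes as CE A9 and U+2126 (OHM SIGN) as E2 84 A6; neither sequence contains a byte that a byte-oriented ASCII search could mistake for an ASCII character.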

> Adding things like RTL/LTR markers to Unicode creates opportunities
> for strings to slip through one level of filtering into text which
> will get "normalized", and then appear in the normalized form even
> though they had not appeared in the original.

Please provide an example; I can't think of any way that would happen.

>>> and not have to worry that pasting in a Hebrew aleph character 
>>> would do weird things to the text rendering.
>> 
>> It wouldn't anyway, with proper Unicode support.
> 
> Strings which contain bidirectional characters behave oddly in the
> text editors I've seen, and I think such oddities occur because of
> proper Unicode support, rather than the lack of same.

What you consider "odd" is the correct behavior, which is something that
was worked out over several years by experts who deal with bidi text on
a daily basis.  Yes, it's a little weird the first few times you deal
with it, but it's the best (or least bad) solution to a very difficult
problem, and you'll get used to it with practice.

> BTW, I wonder why Unicode hasn't long ago added Latin small caps?
> The proper way of writing the time one hour after noon isn't 1:00p.m.
> nor 1:00P.M. but should instead use small caps; I would consider that
> to be a semantic rather than typographical distinction as the letters
> should be neither uppercase nor lowercase.

"P.M." is an abbreviation for "Post Meridiem" (and "A.M." for "Ante
Meridiem"), and should be treated as such.  It's common to omit the dots
in abbreviations these days since we deal with so many on a daily basis,
but that violates the classical rules.  Small caps may be _your_
typographical preference, but it's not inherent in the characters.

The closest example I know of in Unicode is "™", but again, that was an
existing code point in a legacy code page, so it was grandfathered in;
there was no equivalent PM (or AM) code point.

S

-- 
Stephen Sprunk         "God does not play dice."  --Albert Einstein
CCIE #3723         "God is an inveterate gambler, and He throws the
K5SSS        dice at every possible opportunity." --Stephen Hawking



#77831

From: Richard Heathfield <rjh@cpax.org.uk>
Date: 2015-12-04 15:52 +0000
Message-ID: <n3scn3$dog$1@dont-email.me>
In reply to: #77828

On 04/12/15 15:30, Stephen Sprunk wrote:

<snip>

> It's common to omit the dots
> in abbreviations these days since we deal with so many on a daily basis,
> but that violates the classical rules.

True enough. There was a conscious move toward "open punctuation" in the 
late 20th century, in which pointless punctuation is deliberately 
omitted, resulting in a cleaner, less "fussy" document. In 1989 I took a 
three-week course in typing and word processing (in ROI terms it was the 
best time investment I ever made), during which time I was taught that I 
should always use open punctuation in business letters unless I had an 
extraordinarily good reason to make an exception. (Nowadays, I use 
Richard Punctuation, which is far superior, having as it does the useful 
property that it's impossible for me to mis-apply, since whatever 
punctuation I use is right *by definition*.)

-- 
Richard Heathfield
Email: rjh at cpax dot org dot uk
"Usenet is a strange place" - dmr 29 July 1999
Sig line 4 vacant - apply within



#77837

From: supercat@casperkitty.com
Date: 2015-12-04 09:07 -0800
Message-ID: <0cd244e7-aa0f-454b-bb31-1e858866c510@googlegroups.com>
In reply to: #77828

On Friday, December 4, 2015 at 9:31:14 AM UTC-6, Stephen Sprunk wrote:
> Unicode _did_ have a way (language tags, U+E00xx) to do this in an
> application-neutral way, but that is now deprecated since folks like you
> didn't like the "markup" aspect of that mechanism.

One of the stated goals of Unicode was that string manipulations should work
in a context-free way provided that strings were only split on certain
readily-identifiable boundaries, unlike state-based encodings like Shift-JIS.
Having characters which affect the meaning of later characters until further
notice goes contrary to that stated goal.

A remedy would be to have a means of encoding state information on each code
point where it is relevant, but for purposes of wire transmission pass the
text through a state-based filter which would replace runs of characters with
the same state tag with a "change-state" character followed by characters
without individual tags.  I don't think that would be practical to add now,
though.

> Of course they do; they _should_ change their direction based on the
> characters around them.  Otherwise, you'd need separate whitespace and
> punctuation characters for use with RTL text, and all text converted
> from legacy encodings would be messed up.

They should change direction based upon whether they are being used in an
overall RTL or LTR context, perhaps, but trying to infer an RTL or LTR
context from nearby characters seems like the same sort of logic which will
cause 12/1/87 to be read as "December 1, 1987" and "13/1/87" to be read
as "January 13, 1987".  Either interpretation might be the most-probably-
correct one in isolation, but code which would be inclined to regard the
first as "December 1" should indicate that it doesn't understand dates,
rather than trying to pretend it does.
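
The date analogy can be made concrete with a small C sketch: a parser that reports ambiguity instead of guessing. The `enum date_order` type and `classify_date` are hypothetical names for illustration, and the check looks only at the ranges 1-12 and 1-31, not at month lengths.

```c
#include <assert.h>

enum date_order {
    ORDER_INVALID,      /* neither reading is a plausible date */
    ORDER_MONTH_FIRST,  /* only month/day fits */
    ORDER_DAY_FIRST,    /* only day/month fits */
    ORDER_AMBIGUOUS     /* both readings fit; the caller must decide */
};

/* Classify the first two fields of a date written as a/b/yy. */
enum date_order classify_date(int a, int b)
{
    assert(a >= 0 && b >= 0);  /* fields are assumed to be parsed as unsigned */
    int month_first = (a >= 1 && a <= 12) && (b >= 1 && b <= 31);
    int day_first   = (a >= 1 && a <= 31) && (b >= 1 && b <= 12);

    if (month_first && day_first)
        return ORDER_AMBIGUOUS;
    if (month_first)
        return ORDER_MONTH_FIRST;
    if (day_first)
        return ORDER_DAY_FIRST;
    return ORDER_INVALID;
}
```

Here 12/1 is reported as ambiguous while 13/1 can only be day-first, which is the distinction being drawn: the first case needs outside context rather than a guess.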

Take a C source file and replace some variable names with Hebrew letters or
words.  Does the result make any sense?

> > I should mention another spot where Unicode seems to have been
> > subject to inconsistent imperatives: one of the design goals of UTF-8
> > was to prevent ASCII strings from "hiding" in UTF-8 strings.
> 
> What?  ASCII strings _are_ UTF-8 strings by definition; there is no way
> for them to "hide".

> Please provide an example; I can't think of any way that would happen.

Would a Unicode string which contained a mixture of ASCII characters and
arbitrarily-sprinkled LTR markers be considered in any way "ill-formed"?

If one replaced $ with LTR markers, would a filter that was looking for
the word "script" see "$<scr$ipt>"?  Could one rule out the possibility that,
after having gotten through the first filter, the text might pass through
another filter which would notice and eliminate redundant LTR markers on the
pre-validated text before passing it along?
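
The scenario can be reproduced with a short C sketch. It assumes UTF-8 text, a first-stage filter that does a plain byte search, and a later stage that deletes every U+200E LEFT-TO-RIGHT MARK; `looks_dangerous` and `strip_lrm` are hypothetical names, not functions from any real filtering library.

```c
#include <assert.h>
#include <string.h>

#define LRM "\xE2\x80\x8E"  /* U+200E LEFT-TO-RIGHT MARK encoded as UTF-8 */

/* Naive first-stage filter: flags text containing the bytes "<script>". */
int looks_dangerous(const char *s)
{
    assert(s != NULL);
    return strstr(s, "<script>") != NULL;
}

/* Hypothetical later stage: removes every LRM from the string in place. */
void strip_lrm(char *s)
{
    assert(s != NULL);
    char *dst = s;
    while (*s != '\0') {
        if (strncmp(s, LRM, 3) == 0) {  /* skip the three-byte marker */
            s += 3;
            continue;
        }
        *dst++ = *s++;
    }
    *dst = '\0';
}
```

Given the input `"<scr" LRM "ipt>"`, `looks_dangerous` returns 0 before `strip_lrm` runs and 1 afterwards, so the check would need to run after the last transformation (or on marker-stripped text) to be meaningful.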

> What you consider "odd" is the correct behavior, which is something that
> was worked out over several years by experts who deal with bidi text on
> a daily basis.  Yes, it's a little weird the first few times you deal
> with it, but it's the best (or least bad) solution to a very difficult
> problem, and you'll get used to it with practice.

I tend to work in a lot of fields where when I want to show a string I want
to do so unambiguously.  If W is a gimmel character, then x5W and xW5 both
appear as x5W.  Trying to apply Hebrew labels to values in hexadecimal or
exponential notation garbles them hopelessly.

> > BTW, I wonder why Unicode hasn't long ago added Latin small caps?
> > The proper way of writing the time one hour after noon isn't 1:00p.m.
> > nor 1:00P.M. but should instead use small caps; I would consider that
> > to be a semantic rather than typographical distinction as the letters
> > should be neither uppercase nor lowercase.
> 
> "P.M." is an abbreviation for "Post Meridiem" (and "A.M." for "Ante
> Meridiem"), and should be treated as such.  It's common to omit the dots
> in abbreviations these days since we deal with so many on a daily basis,
> but that violates the classical rules.  Small caps may be _your_
> typographical preference, but it's not inherent in the characters.

There are a variety of situations (I merely identified one) where
typographers should regard letters as having three cases.  In some forms
of rendering one might display the middle and upper cases identically, but
for other forms one shouldn't.

> The closest example I know of in Unicode is "(tm)", but again, that was an
> existing code point in a legacy code page, so it was grandfathered in;
> there was no equivalent PM (or AM) code point.

I'm not saying A.M. and P.M. deserve their own code points, but rather that
small caps are often used in places where neither uppercase nor lowercase
is appropriate, and the use of small caps should be considered to be part
of the semantics of the document rather than the layout markup.
