Groups > comp.lang.c > #77357 > unrolled thread

Working efficiently with 32-bit Unicode output streams, locale etc.

Started by "Morten W. Petersen" <morphex@gmail.com>
First post 2015-11-29 01:06 +0100
Last post 2015-12-02 09:58 -0800
Articles 20 on this page of 210 — 25 participants


Contents

  Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-29 01:06 +0100
    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Nobody <nobody@nowhere.invalid> - 2015-11-29 02:01 +0000
      Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-29 03:31 +0100
        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-29 00:09 -0600
        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Robert Wessel <robertwessel2@yahoo.com> - 2015-11-29 00:22 -0600
        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Damon <Richard@Damon-Family.org> - 2015-11-29 14:31 -0500
        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Nobody <nobody@nowhere.invalid> - 2015-11-29 23:51 +0000
          Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 01:21 +0100
            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Keith Thompson <kst-u@mib.org> - 2015-11-30 00:41 -0800
            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-30 03:16 -0600
      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Jorgen Grahn <grahn+nntp@snipabacken.se> - 2015-11-29 08:28 +0000
      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-29 02:54 -0600
    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-11-29 16:30 +1300
      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-11-28 23:53 -0800
        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-29 02:23 -0600
          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-11-29 00:30 -0800
            Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 01:33 +0100
              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-11-30 13:54 +1300
                Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 02:03 +0100
                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-11-30 14:15 +1300
                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 02:34 +0100
                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-11-30 14:42 +1300
                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 04:16 +0100
                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-29 20:20 -0600
                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 04:34 +0100
                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-11-30 17:09 +1300
                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 06:17 +0100
                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-11-30 19:44 +1300
                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-29 23:36 -0600
                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 07:39 +0100
                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-30 13:56 -0600
                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-12-01 09:17 +0100
                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 13:40 -0600
                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-12-04 00:34 +0100
                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 16:03 -0800
                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-11-29 23:07 -0800
                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 08:20 +0100
                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-11-29 23:40 -0800
                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 08:48 +0100
                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-11-30 20:52 +1300
                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-11-30 21:04 +1300
                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-11-30 00:34 -0800
                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-30 03:50 -0600
                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-11-30 12:16 +0000
                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-11-30 06:11 -0800
                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-30 13:23 -0600
                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-30 13:18 -0600
                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Keith Thompson <kst-u@mib.org> - 2015-11-30 13:23 -0800
                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-11-30 22:32 +0000
                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Keith Thompson <kst-u@mib.org> - 2015-11-30 15:10 -0800
                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-30 21:05 -0600
                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-01 12:38 +0000
                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-01 14:43 +0000
                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-01 12:09 -0800
                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-12-02 09:14 +1300
                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-01 12:27 -0800
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-12-02 10:14 +1300
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-01 18:01 -0600
                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-01 20:41 +0000
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Keith Thompson <kst-u@mib.org> - 2015-12-01 12:53 -0800
                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-01 21:32 +0000
                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Keith Thompson <kst-u@mib.org> - 2015-12-01 13:55 -0800
                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. raltbos@xs4all.nl (Richard Bos) - 2015-12-04 10:30 +0000
                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-01 18:46 -0600
                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Say, what? <<nothing@nowhere.nohow>> - 2015-12-01 14:07 -0800
                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-01 23:54 +0000
                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. Say, what? <<nothing@nowhere.nohow>> - 2015-12-01 17:13 -0800
                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Martin Shobe <martin.shobe@yahoo.com> - 2015-12-01 09:08 -0600
                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-01 20:02 +0000
                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Martin Shobe <martin.shobe@yahoo.com> - 2015-12-01 17:03 -0600
                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-02 00:17 +0000
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-01 16:53 -0800
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Martin Shobe <martin.shobe@yahoo.com> - 2015-12-01 21:17 -0600
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 09:37 -0600
                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. James Kuyper <jameskuyper@verizon.net> - 2015-12-02 10:59 -0500
                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-02 17:43 +0000
                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 13:22 -0600
                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-12-03 09:32 +1300
                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-02 21:12 +0000
                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-12-03 10:36 +1300
                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-02 22:00 +0000
                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 17:55 -0600
                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-02 17:04 -0800
                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-03 01:11 +0000
                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-12-03 14:19 +1300
                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 23:16 -0600
                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 00:54 -0600
                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 04:07 -0800
                                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-03 18:31 +0000
                                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. Eric Sosman <esosman@comcast-dot-net.invalid> - 2015-12-03 13:59 -0500
                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-03 19:45 +0000
                                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-03 14:38 -0800
                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-03 22:43 +0000
                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-03 12:14 +0000
                                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Heathfield <rjh@cpax.org.uk> - 2015-12-03 12:38 +0000
                                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-03 13:19 +0000
                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 05:54 -0800
                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. raltbos@xs4all.nl (Richard Bos) - 2015-12-04 10:50 +0000
                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Heathfield <rjh@cpax.org.uk> - 2015-12-03 14:26 +0000
                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 09:19 -0600
                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. David Brown <david.brown@hesbynett.no> - 2015-12-03 16:25 +0100
                                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Heathfield <rjh@cpax.org.uk> - 2015-12-03 15:33 +0000
                                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. David Brown <david.brown@hesbynett.no> - 2015-12-03 16:47 +0100
                                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Heathfield <rjh@cpax.org.uk> - 2015-12-03 16:54 +0000
                                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-03 09:32 -0800
                                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. David Brown <david.brown@hesbynett.no> - 2015-12-03 18:53 +0100
                                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Steve Thompson <stevet810@gmail.com> - 2015-12-03 19:00 +0000
                                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. David Brown <david.brown@hesbynett.no> - 2015-12-04 14:07 +0100
                                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Steve Thompson <stevet810@gmail.com> - 2015-12-04 18:41 +0000
                                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. David Brown <david.brown@hesbynett.no> - 2015-12-05 16:09 +0100
                                                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Steve Thompson <stevet810@gmail.com> - 2015-12-05 21:15 +0000
                                                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. David Brown <david.brown@hesbynett.no> - 2015-12-06 12:35 +0100
                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Keith Thompson <kst-u@mib.org> - 2015-12-03 09:02 -0800
                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-03 19:12 +0000
                                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 16:58 -0600
                                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. David Brown <david.brown@hesbynett.no> - 2015-12-03 15:47 +0100
                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Heathfield <rjh@cpax.org.uk> - 2015-12-03 14:51 +0000
                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. David Brown <david.brown@hesbynett.no> - 2015-12-03 16:50 +0100
                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. raltbos@xs4all.nl (Richard Bos) - 2015-12-04 10:55 +0000
                                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 08:56 -0600
                                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 05:24 -0800
                                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ian Collins <ian-news@hotmail.com> - 2015-12-04 08:49 +1300
                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-03 07:07 -0800
                                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 10:27 -0600
                                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-03 09:01 -0800
                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. fir <profesor.fir@gmail.com> - 2015-12-03 10:16 -0800
                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-12-04 01:21 +0100
                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 16:42 -0800
                                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. David Brown <david.brown@hesbynett.no> - 2015-12-04 11:15 +0100
                                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-12-08 01:57 +0100
                                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. David Brown <david.brown@hesbynett.no> - 2015-12-08 09:08 +0100
                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 09:44 -0600
                                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Heathfield <rjh@cpax.org.uk> - 2015-12-04 15:58 +0000
                                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 11:43 -0600
                                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Geoff <geoff@invalid.invalid> - 2015-12-04 10:56 -0800
                                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Keith Thompson <kst-u@mib.org> - 2015-12-04 11:20 -0800
                                                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 15:24 -0600
                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 09:30 -0600
                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Heathfield <rjh@cpax.org.uk> - 2015-12-04 15:52 +0000
                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-04 09:07 -0800
                                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-04 09:53 -0800
                                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-04 10:56 -0800
                                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 15:04 -0600
                                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-04 21:32 +0000
                                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-04 13:38 -0800
                                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 16:13 -0600
                                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-04 16:21 -0800
                                                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 19:10 -0600
                                                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. Geoff <geoff@invalid.invalid> - 2015-12-04 19:16 -0800
                                                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-04 21:19 -0800
                                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-05 12:44 -0600
                                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-06 09:01 -0800
                                                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-06 12:34 -0600
                                                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-06 18:32 -0800
                                                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-07 10:43 -0600
                                                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-07 10:02 -0800
                                                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 03:53 -0800
                                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-05 09:39 -0800
                                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-05 18:36 +0000
                                                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-05 12:26 -0600
                                                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 11:36 -0800
                                                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Udyant Wig <udyantw@gmail.com> - 2015-12-06 16:42 +0530
                                                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-06 03:59 -0800
                                                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Robert Wessel <robertwessel2@yahoo.com> - 2015-12-07 02:17 -0600
                                                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. supercat@casperkitty.com - 2015-12-07 07:33 -0800
                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. fir <profesor.fir@gmail.com> - 2015-12-03 03:57 -0800
                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-12-04 00:58 +0100
                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-03 01:34 +0000
                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-03 11:38 +0000
                                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-03 14:09 +0000
                                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 10:10 -0600
                                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 08:28 -0800
                                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-03 21:33 +0000
                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Heathfield <rjh@cpax.org.uk> - 2015-12-02 21:47 +0000
                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 16:05 -0600
                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Keith Thompson <kst-u@mib.org> - 2015-12-02 14:12 -0800
                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-02 22:47 +0000
                                                        Re: Working efficiently with 32-bit Unicode output streams, locale   etc. Ian Collins <ian-news@hotmail.com> - 2015-12-03 14:00 +1300
                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 01:38 -0600
                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 02:20 -0800
                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. raltbos@xs4all.nl (Richard Bos) - 2015-12-04 10:40 +0000
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Nobody <nobody@nowhere.invalid> - 2015-12-03 02:42 +0000
                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Damon <Richard@Damon-Family.org> - 2015-12-01 20:48 -0500
                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-02 12:08 +0000
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-02 04:21 -0800
                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-02 14:05 +0000
                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-12-04 01:31 +0100
                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-02 14:23 +0000
                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-02 08:00 -0800
                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-02 16:49 +0000
                                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-02 11:50 -0800
                                                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-02 20:02 +0000
                                                        Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-02 12:31 -0800
                                                          Re: Working efficiently with 32-bit Unicode output streams, locale etc. Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-03 01:43 +0000
                                                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. Keith Thompson <kst-u@mib.org> - 2015-12-02 09:21 -0800
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Richard Damon <Richard@Damon-Family.org> - 2015-12-02 07:29 -0500
                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-02 05:47 -0800
                                                Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 11:03 -0600
                                              Re: Working efficiently with 32-bit Unicode output streams, locale etc. BartC <bc@freeuk.com> - 2015-12-02 14:16 +0000
                                                Re: Working efficiently with 32-bit Unicode output streams, locale   etc. Ian Collins <ian-news@hotmail.com> - 2015-12-03 09:56 +1300
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 13:49 -0600
                                            Re: Working efficiently with 32-bit Unicode output streams, locale etc. Philip Lantz <prl@canterey.us> - 2015-12-02 22:11 -0800
                                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 15:06 -0600
                      Re: Working efficiently with 32-bit Unicode output streams, locale etc. Jorgen Grahn <grahn+nntp@snipabacken.se> - 2015-11-30 22:14 +0000
              Re: Working efficiently with 32-bit Unicode output streams, locale etc. Stephen Sprunk <stephen@sprunk.org> - 2015-11-29 23:03 -0600
                Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-30 06:26 +0100
                  Re: Working efficiently with 32-bit Unicode output streams, locale etc. Keith Thompson <kst-u@mib.org> - 2015-11-30 00:39 -0800
                    Re: Working efficiently with 32-bit Unicode output streams, locale etc. Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-11-30 01:57 -0800
        Re: Working efficiently with 32-bit Unicode output streams, locale etc. "Morten W. Petersen" <morphex@gmail.com> - 2015-11-29 15:32 +0100
    Re: Working efficiently with 32-bit Unicode output streams, locale etc. fir <profesor.fir@gmail.com> - 2015-12-02 09:58 -0800



#77804

From: raltbos@xs4all.nl (Richard Bos)
Date: 2015-12-04 10:40 +0000
Message-ID: <56616d1d.1673562@news.xs4all.nl>
In reply to: #77625
Stephen Sprunk <stephen@sprunk.org> wrote:

> On 01-Dec-15 18:17, BartC wrote:
> > On 01/12/2015 23:03, Martin Shobe wrote:
> >> On 12/1/2015 2:02 PM, BartC wrote:
> >>> On 01/12/2015 15:08, Martin Shobe wrote:
> >>>> Why did you expect 11?
> >>> 
> >>> Because there are 11 characters in "£100 = €140", not 13 (or 14 
> >>> actually).
> >> 
> >> But you told C to print an octet, not a character.
> > 
> > I told it to print a value in %c format. In other words, a
> > character.
> 
> For historical reasons, C conflates "bytes" and "characters".
> 
> In various contexts, "character" is used to mean any of "byte", "code
> unit", "code point", "glyph", "grapheme" or "grapheme cluster"--and
> sometimes more than one of them in the _same_ context.  And if that
> wasn't confusing enough, sometimes it even means "non-character"!
> 
> >>> So, how would you, given the same "£100 = €140" UTF8 string,
> >>> write the C code to enumerate all the characters or code-points
> >>> rather than the bytes?
> >> 
> >> Code points aren't characters as you mean it either. If you want
> >> that, you will have to make your code aware of the differences
> >> between octets, code-points, and "characters". C I/O is too low
> >> level to understand such.
> > 
> > So considerable amounts of code that was happily mixing up bytes, 
> > octets, characters and code-points for decades, no longer works with
> > the advent of UTF8.
> 
> As long as you don't _split_ strings, which includes extracting
> individual bytes from them, UTF-8 is completely transparent.

Except that, as Bart demonstrates, _counting_ characters is not always
simple. That's a pretty big snag if you want to do neat text output.

UTF-8 is a fine encoding for text _transmission_. For text _handling_,
it's not so nice.
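
To make the counting problem concrete, here is a minimal sketch (not from the thread) that counts code points rather than bytes in a UTF-8 string. The name utf8_count is hypothetical, and the input is assumed to be valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Count code points in a UTF-8 string by counting every byte that is
   not a continuation byte (continuation bytes look like 10xxxxxx).
   Assumes s is valid UTF-8; utf8_count is a hypothetical helper. */
size_t utf8_count(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

For the UTF-8 encoding of "£100 = €140" this returns 11, where strlen reports 14. It counts code points only; it says nothing about grapheme clusters or display width, which is what column alignment really needs.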

Richard



#77696

From: Nobody <nobody@nowhere.invalid>
Date: 2015-12-03 02:42 +0000
Message-ID: <pan.2015.12.03.02.42.44.363000@nowhere.invalid>
In reply to: #77582
On Wed, 02 Dec 2015 00:17:31 +0000, BartC wrote:

> So considerable amounts of code that was happily mixing up bytes, octets,
> characters and code-points for decades, no longer works with the advent of
> UTF8.

Multi-byte encodings have been around since long before UTF-8.

Languages with a phonetic alphabet (Latin, Cyrillic, Greek, etc) can
typically get by with 256 characters, allowing the use of unibyte (one
byte equals one character) encodings such as ISO-646 or ISO-8859.

Languages which use an ideographic alphabet (Chinese, Japanese, Korean)
have always required multi-byte encodings.

Unicode just provides a single omnibus character set and a handful of
encodings which allow code points (non-negative integers) to be converted
to and from byte sequences without requiring large tables or algorithms
which have to be modified whenever additional characters are added.
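
As an illustration of the "no large tables" point, the following sketch (an editorial example, not code from the thread) turns a code point into its UTF-8 byte sequence using only comparisons, shifts and masks. The name utf8_encode is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one code point as UTF-8 using only shifts and masks.
   Writes up to 4 bytes to out and returns the byte count, or 0 for
   values that cannot be encoded (surrogates, or above U+10FFFF). */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;               /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp < 0x110000) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                       /* beyond U+10FFFF */
}
```

With this, U+00A3 (£) becomes the two bytes C2 A3 and U+20AC (€) becomes the three bytes E2 82 AC. Adding new characters to Unicode requires no change to the routine.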



#77593

From: Richard Damon <Richard@Damon-Family.org>
Date: 2015-12-01 20:48 -0500
Message-ID: <86s7y.245443$qL.88553@fx15.iad>
In reply to: #77556
On 12/1/15 3:02 PM, BartC wrote:
>
> This is the problem I have with people saying that UTF8 can be used
> transparently.

There is one simple rule that a program needs to follow for UTF-8 to be 
transparent: the program must only break the string at the boundaries of 
known characters (either at known characters, like space or newline, 
or at the boundaries of known valid strings).

UTF-8 is specifically designed so that the encoding of no code point is 
a substring of the encoding of another code point.

Programs that assume that a string can be broken at arbitrary points, 
can be broken by UTF-8.
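
One way to honour that rule when a byte limit forces truncation is to back up to the nearest code point boundary. A minimal sketch, assuming valid UTF-8 input; utf8_safe_cut is a hypothetical helper, not a library function:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the largest length <= max_bytes at which s can be cut
   without splitting a multi-byte sequence. Assumes valid UTF-8. */
size_t utf8_safe_cut(const char *s, size_t max_bytes)
{
    size_t len;
    assert(s != NULL);
    len = strlen(s);
    if (len <= max_bytes)
        return len;                 /* the whole string fits */
    len = max_bytes;
    /* Back up while the first byte that would be dropped is a
       continuation byte (10xxxxxx), i.e. the cut is mid-sequence. */
    while (len > 0 && ((unsigned char)s[len] & 0xC0) == 0x80)
        --len;
    return len;
}
```

Cutting the UTF-8 form of "£100" at one byte would leave half of the pound sign, so the function returns 0 for that limit and 2 for a limit of two bytes. Like the rule it illustrates, it only protects code points; it does not keep a base letter together with a following combining mark.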



#77606

From: BartC <bc@freeuk.com>
Date: 2015-12-02 12:08 +0000
Message-ID: <n3mmr9$adr$1@dont-email.me>
In reply to: #77593
On 02/12/2015 01:48, Richard Damon wrote:
> On 12/1/15 3:02 PM, BartC wrote:
>>
>> This is the problem I have with people saying that UTF8 can be used
>> transparently.
>
> There is one simple rule that a program needs to follow for UTF-8 to be
> transparent: the program must only break the string at the boundaries of
> known characters (either at known characters, like space or newline,
> or at the boundaries of known valid strings).
>
> UTF-8 is specifically designed so that the encoding of no code point is
> a substring of the encoding of another code point.
>
> Programs that assume that a string can be broken at arbitrary points,
> can be broken by UTF-8.

I think 99% of the programs I've ever written have needed at some point 
to deal with strings character by character.

Actually just a few days ago I had to take a bunch of separate strings, 
truncate each one to fit a certain pixel-width, and combine them into 
one string separated by tabs and 'reverse-tabs' which corresponded to a 
particular pixel-specified table of tab stops.

(A reverse-tab will right-justify one of the original strings within a 
column. The string is written in a proportional font to a raster display.)

If I can't use 8-bit characters for that then I'd need to switch to 16 
or 32-bit ones. Then I could use the same code, as only data types will 
have changed.
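
The switch to 32-bit units described here might start from a conversion like the following sketch. utf8_to_utf32 is a hypothetical helper, not part of any library named in the thread; it checks the structure of each sequence but, for brevity, does not reject overlong forms or surrogates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a UTF-8 string into an array of code points.
   Stores at most cap code points in out and returns how many were
   stored, or (size_t)-1 on a malformed sequence. */
size_t utf8_to_utf32(const char *s, uint32_t *out, size_t cap)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t n = 0;
    assert(s != NULL && out != NULL);
    while (*p != 0 && n < cap) {
        uint32_t cp;
        int extra;                       /* continuation bytes expected */
        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return (size_t)-1;          /* stray continuation or bad lead */
        ++p;
        while (extra-- > 0) {
            if ((*p & 0xC0) != 0x80)
                return (size_t)-1;       /* truncated sequence */
            cp = (cp << 6) | (uint32_t)(*p++ & 0x3F);
        }
        out[n++] = cp;
    }
    return n;
}
```

After the conversion, out[i] can be indexed directly, so code written for one unit per character carries over with a change of element type. Whether each element is a "character" in the sense needed for layout is a separate question.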

Trying to do it directly on UTF8 would mean rewriting pretty much 
everything.

BTW I notice nobody has given me an example of how to rewrite that 
little example of mine to properly display the separate characters of a 
string (show each one on a separate line together with its hex and 
decimal code).

I guess it's not so easy after all...

-- 
Bartc



#77607

From: Malcolm McLean <malcolm.mclean5@btinternet.com>
Date: 2015-12-02 04:21 -0800
Message-ID: <a330d8e2-6414-4f11-9534-1045b669007f@googlegroups.com>
In reply to: #77606
On Wednesday, December 2, 2015 at 12:08:48 PM UTC, Bart wrote:
> On 02/12/2015 01:48, Richard Damon wrote:
> > On 12/1/15 3:02 PM, BartC wrote:
> >>
> >> This is the problem I have with people saying that UTF8 can be used
> >> transparently.
> >
> > There is one simple rule that a program needs to follow for UTF-8 to be
> > transparent: the program must only break the string at the boundaries of
> > known characters (either at known characters, like space or newline,
> > or at the boundaries of known valid strings).
> >
> > UTF-8 is specifically designed so that the encoding of no code point is
> > a substring of the encoding of another code point.
> >
> > Programs that assume that a string can be broken at arbitrary points,
> > can be broken by UTF-8.
> 
> I think 99% of the programs I've ever written have needed at some point 
> to deal with strings character by character.
> 
> Actually just a few days ago I had to take a bunch of separate strings, 
> truncate each one to fit a certain pixel-width, and combine them into 
> one string separated by tabs and 'reverse-tabs' which corresponded to a 
> particular pixel-specified table of tab stops.
> 
> (A reverse-tab will right-justify one of the original strings within a 
> column. The string is written in a proportional font to a raster display.)
> 
> If I can't use 8-bit characters for that then I'd need to switch to 16 
> or 32-bit ones. Then I could use the same code, as only data types will 
> have changed.
> 
> Trying to do it directly on UTF8 would mean rewriting pretty much 
> everything.
> 
> BTW I notice nobody has given me an example of how to rewrite that 
> little example of mine to properly display the separate characters of a 
> string (show each one on a separate line together with its hex and 
> decimal code).
> 
> I guess it's not so easy after all...
>
16 bit chars, by Bart 
>> char s[]="£100 = EURO 140"; 
>>      for (i=0; i<strlen(s); ++i){ 
>>          c = s[i]; 

UTF-8, 8-bit chars

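char s[] = "utf-8 here";
int ch;
char *ptr = s;
/* the loop stops when bbx_get_utf8 yields 0 for the code point at ptr */
while ((ch = bbx_get_utf8(ptr)) != 0)
{
  char buff[32] = {0};
  /* copy just the bytes of the current code point into buff */
  strncpy(buff, ptr, bbx_utf8_skip(ptr));
  printf_utf8("%s : hex %08x dec %d\n", buff, ch, ch);
  /* advance to the start of the next code point */
  ptr += bbx_utf8_skip(ptr);
}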

Obviously the subroutines should be standardised, I don't think that's
happened yet.



#77616

From: BartC <bc@freeuk.com>
Date: 2015-12-02 14:05 +0000
Message-ID: <n3mtno$4uq$1@dont-email.me>
In reply to: #77607
On 02/12/2015 12:21, Malcolm McLean wrote:
> On Wednesday, December 2, 2015 at 12:08:48 PM UTC, Bart wrote:

>> BTW I notice nobody has given me an example of how to rewrite that
>> little example of mine to properly display the separate characters of a
>> string (show each one on a separate line together with its hex and
>> decimal code).
>>
>> I guess it's not so easy after all...
>>
> 16 bit chars, by Bart
>>> char s[]="£100 = EURO 140";
>>>       for (i=0; i<strlen(s); ++i){
>>>           c = s[i];
>
> UTF-8, 8-bit chars
>
> char s[] = "utf-8 here";
> int ch;
> char *ptr = s;
> while(ch = bbx_get_utf8(ptr))
> {
>    char buff[32] = {0};
>    strncpy(buff, ptr, bbx_utf8_skip(ptr));
>     printf_utf8("%s : hex %08x dec %d\n", buff, ch, ch);
>     ptr += bbx_utf8_skip(ptr);
> }
>
> Obviously the subroutines should be standardised, I don't think that's
> happened yet.

OK, in other words, it's completely different compared with a version 
where characters in a string can be directly indexed.

So why do people keep saying that UTF8 can be introduced more or less 
transparently with little change?

(And why do they keep showing how knowledgeable they are by constantly 
bringing up the difference between characters and code-points? Apparently 
the difference means that even with 32-bit strings, you cannot just 
index a string as you would expect. I think if we abandoned all the 
software that ignored that difference, even with wide-strings, there 
wouldn't be much left!

My £/€ example would however have worked.)
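
The distinction at issue can be shown with a small sketch (an editorial example). It treats only the Combining Diacritical Marks block, U+0300 to U+036F, as "attaches to the previous character", which is a deliberate simplification; rough_char_count is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rough count of user-perceived characters in a code point array.
   Code points in U+0300..U+036F combine with the preceding character,
   so they are not counted. Real grapheme segmentation (Unicode UAX #29)
   covers many more cases; this only illustrates the distinction. */
size_t rough_char_count(const uint32_t *cps, size_t len)
{
    size_t i, n = 0;
    assert(cps != NULL || len == 0);
    for (i = 0; i < len; ++i) {
        if (cps[i] < 0x0300 || cps[i] > 0x036F)
            ++n;
    }
    return n;
}
```

The word "café" written as c, a, f, e, U+0301 occupies five array elements but displays as four characters, so indexing element 3 yields a bare "e". A string such as "£100 = €140" contains no combining marks, which is why plain indexing of 32-bit units is enough for it.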

-- 
Bartc



#77790

From: "Morten W. Petersen" <morphex@gmail.com>
Date: 2015-12-04 01:31 +0100
Message-ID: <n3qmrc$4l2$2@speranza.aioe.org>
In reply to: #77616
On 02.12.2015 15:05, BartC wrote:
[...]
> OK, in other words, it's completely different compared with a version
> where characters in a string can be directly indexed.
>
> So why do people keep saying that UTF8 can be introduced more or less
> transparently with little change?

What he said.

-Morten



#77618

From: Ben Bacarisse <ben.usenet@bsb.me.uk>
Date: 2015-12-02 14:23 +0000
Message-ID: <878u5d6iay.fsf@bsb.me.uk>
In reply to: #77607
Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
<snip>
> UTF-8, 8-bit chars
>
> char s[] = "utf-8 here";
> int ch;
> char *ptr = s;
> while(ch = bbx_get_utf8(ptr))
> {
>   char buff[32] = {0};
>   strncpy(buff, ptr, bbx_utf8_skip(ptr));
>    printf_utf8("%s : hex %08x dec %d\n", buff, ch, ch);

What does printf_utf8 do?

>    ptr += bbx_utf8_skip(ptr);  
> }
>
> Obviously the subroutines should be standardised, I don't think that's
> happened yet.

In standard C:

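  /* needs <locale.h>, <limits.h>, <stdio.h> and <uchar.h> */
  setlocale(LC_ALL, "");
  ...
  char32_t ch;
  size_t n;
  /* 0 means end of string; the error returns (size_t)-1, -2 and -3
     are all larger than MB_LEN_MAX, so they also end the loop */
  while ((n = mbrtoc32(&ch, ptr, MB_LEN_MAX, NULL)) > 0 && n <= MB_LEN_MAX) {
      printf("%.*s : hex %08lx dec %lu\n", (int)n, ptr,
             (unsigned long)ch, (unsigned long)ch);
      ptr += n;
  }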

The call to mbrtoc32 is a bit fussy (because of the defaults), and the
test is clumsy (because of the unsigned return value), but you can
always wrap it in a one-line inline function of your own.
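
One possible shape for such a wrapper is sketched below. next_c32 is a hypothetical name, not a standard function, and the behaviour depends on the current locale: a UTF-8 locale must have been selected with setlocale for UTF-8 input to decode as intended.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <uchar.h>

/* Decode the next character of a multibyte string in the current
   locale's encoding. Returns the number of bytes consumed, or 0 at
   the end of the string or on a decoding error. */
static inline size_t next_c32(const char *s, char32_t *out)
{
    size_t n;
    assert(s != NULL && out != NULL);
    n = mbrtoc32(out, s, MB_LEN_MAX, NULL);
    /* (size_t)-1, -2 and -3 are all larger than MB_LEN_MAX */
    return (n <= MB_LEN_MAX) ? n : 0;
}
```

With it, the loop condition above reduces to `while ((n = next_c32(ptr, &ch)) > 0)`. Folding the error cases into 0 keeps the caller simple at the cost of not distinguishing "end of string" from "invalid input"; code that needs the difference would have to inspect the raw return value instead.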

-- 
Ben.



#77628

From: Malcolm McLean <malcolm.mclean5@btinternet.com>
Date: 2015-12-02 08:00 -0800
Message-ID: <56ad36bc-3af1-4b1c-9976-f3b6c0c439ce@googlegroups.com>
In reply to: #77618
On Wednesday, December 2, 2015 at 2:23:29 PM UTC, Ben Bacarisse wrote:
> Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
> <snip>
> > UTF-8, 8-bit chars
> >
> > char s[] = "utf-8 here";
> > int ch;
> > char *ptr = s;
> > while(ch = bbx_get_utf8(ptr))
> > {
> >   char buff[32] = {0};
> >   strncpy(buff, ptr, bbx_utf8_skip(ptr));
> >    printf_utf8("%s : hex %08x dec %d\n", buff, ch, ch);
> 
> What does printf_utf8 do?
> 
You need a version of printf() which will actually display UTF-8 as the proper glyphs
rather than as extended ASCII or IBM character graphics.
Maybe you can achieve that with setlocale(). I'm a bit sceptical. %s has a better chance of
working than passing a Unicode code point to %c, though it makes the code a bit
messier.

> >    ptr += bbx_utf8_skip(ptr);  
> > }
> >
> > Obviously the subroutines should be standardised, I don't think that's
> > happened yet.
> 
> In standard C:
> 
>   setlocale(LC_ALL, "");
>   ...
>   char32_t ch;
>   size_t n;
>   while ((n = mbrtoc32(&ch, ptr, MB_LEN_MAX, NULL)) > 0 && n <= MB_LEN_MAX) {
>       printf("%.*s : hex %08x dec %d\n", (int)n, ptr, ch, ch);
>       ptr += n;
>   }
> 
> The call to mbrtoc32 is a bit fussy (because of the defaults), and the
> test is clumsy (because of the unsigned return value), but you can
> always wrap it in a one-line inline function of your own.
> 
mbrtoc32 is effectively bbx_get_utf8 and bbx_utf8_skip rolled into one,
which has micro-optimisation benefits. I didn't realise that such a function
had been standardised.



#77633

From: Ben Bacarisse <ben.usenet@bsb.me.uk>
Date: 2015-12-02 16:49 +0000
Message-ID: <8737vk7q4j.fsf@bsb.me.uk>
In reply to: #77628
Malcolm McLean <malcolm.mclean5@btinternet.com> writes:

> On Wednesday, December 2, 2015 at 2:23:29 PM UTC, Ben Bacarisse wrote:
>> Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
>> <snip>
>> > UTF-8, 8-bit chars
>> >
>> > char s[] = "utf-8 here";
>> > int ch;
>> > char *ptr = s;
>> > while(ch = bbx_get_utf8(ptr))
>> > {
>> >   char buff[32] = {0};
>> >   strncpy(buff, ptr, bbx_utf8_skip(ptr));
>> >    printf_utf8("%s : hex %08x dec %d\n", buff, ch, ch);
>> 
>> What does printf_utf8 do?
>> 
> You need a version of printf() which will actually display UTF-8 as
> the proper glyphs
> rather than as extended ascii or IBM character graphics.

I guessed that was the purpose, but wanted to know what it *does* --
i.e. how it achieves this purpose.  What is it that you do that printf
does not?

> Maybe you can achieve that with set locale. I'm a bit sceptical. %s
> has a better chance of
> working than passing a unicode code point to %c, though it makes the code a bit
> messier.

I don't see any %c.  Do you use them inside printf_utf8?

>> >    ptr += bbx_utf8_skip(ptr);  
>> > }

<snip>
-- 
Ben.



#77655

From: Malcolm McLean <malcolm.mclean5@btinternet.com>
Date: 2015-12-02 11:50 -0800
Message-ID: <093a5906-651d-4a18-a03b-1b4fa15a9700@googlegroups.com>
In reply to: #77633
On Wednesday, December 2, 2015 at 4:49:15 PM UTC, Ben Bacarisse wrote:
> Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
> 
> > On Wednesday, December 2, 2015 at 2:23:29 PM UTC, Ben Bacarisse wrote:
> >> Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
> >> <snip>
> >> > UTF-8, 8-bit chars
> >> >
> >> > char s[] = "utf-8 here";
> >> > int ch;
> >> > char *ptr = s;
> >> > while(ch = bbx_get_utf8(ptr))
> >> > {
> >> >   char buff[32] = {0};
> >> >   strncpy(buff, ptr, bbx_utf8_skip(ptr));
> >> >    printf_utf8("%s : hex %08x dec %d\n", buff, ch, ch);
> >> 
> >> What does printf_utf8 do?
> >> 
> > You need a version of printf() which will actually display UTF-8 as
> > the proper glyphs
> > rather than as extended ascii or IBM character graphics.
> 
> I guessed that was the purpose, but wanted to know what it *does* --
> i.e. how it achieves this purpose.  What is it that you do that printf
> does not?
> 
> > Maybe you can achieve that with set locale. I'm a bit sceptical. %s
> > has a better chance of
> > working than passing a unicode code point to %c, though it makes the code a bit
> > messier.
> 
> I don't see any %c.  Do you use them inside printf_utf8?
> 
The system I have actually has a function called bbx_write_utf8
which renders a UTF-8 string to an RGBA buffer. Usually, of course,
other functions then display the buffer in a window.
The console widget is a high-level concept, but it has a con_printf
which is built on top of the bbx_ functions, and should be
Unicode-aware. However, currently you can't have right-to-left
languages or composited glyphs. Eventually the system will have
to be expanded to accommodate them, but it's quite a task.

However, I'm calling the system vsprintf(), so %c won't print
a Unicode character. That's something I'll probably change if
and when I move to a custom con_printf().



#77656

From: Ben Bacarisse <ben.usenet@bsb.me.uk>
Date: 2015-12-02 20:02 +0000
Message-ID: <87si3k62lz.fsf@bsb.me.uk>
In reply to: #77655
Malcolm McLean <malcolm.mclean5@btinternet.com> writes:

> On Wednesday, December 2, 2015 at 4:49:15 PM UTC, Ben Bacarisse wrote:
>> Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
>> 
>> > On Wednesday, December 2, 2015 at 2:23:29 PM UTC, Ben Bacarisse wrote:
>> >> Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
>> >> <snip>
>> >> > UTF-8, 8-bit chars
>> >> >
>> >> > char s[] = "utf-8 here";
>> >> > int ch;
>> >> > char *ptr = s;
>> >> > while(ch = bbx_get_utf8(ptr))
>> >> > {
>> >> >   char buff[32] = {0};
>> >> >   strncpy(buff, ptr, bbx_utf8_skip(ptr));
>> >> >    printf_utf8("%s : hex %08x dec %d\n", buff, ch, ch);
>> >> 
>> >> What does printf_utf8 do?
>> >> 
>> > You need a version of printf() which will actually display UTF-8 as
>> > the proper glyphs
>> > rather than as extended ascii or IBM character graphics.
>> 
>> I guessed that was the purpose, but wanted to know what it *does* --
>> i.e. how it achieves this purpose.  What is it that you do that printf
>> does not?
>> 
>> > Maybe you can achieve that with set locale. I'm a bit sceptical. %s
>> > has a better chance of
>> > working than passing a unicode code point to %c, though it makes
>> > the code a bit
>> > messier.
>> 
>> I don't see any %c.  Do you use them inside printf_utf8?
>> 
> The system I have actually has a function called bbx_write_utf8
> which renders a UTF-8 string to an rgba buffer. Usually, of course
> other functions then display the buffer in a window.

I thought it was like printf.  That's partly because of the name and
partly because you seemed to be answering BartC's request for code that
does what his did but was UTF-8 aware.  Whilst he could use
bbx_utf8_skip instead of adding one (it's presumably nothing more than
an array lookup) he could not use printf_utf8 instead of printf even if
you published the code.

<snip>
-- 
Ben.



#77660

From: Malcolm McLean <malcolm.mclean5@btinternet.com>
Date: 2015-12-02 12:31 -0800
Message-ID: <7bed98ee-0563-4e96-93f9-0c7cb01d4108@googlegroups.com>
In reply to: #77656
On Wednesday, December 2, 2015 at 8:02:32 PM UTC, Ben Bacarisse wrote:
> Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
> 
> > On Wednesday, December 2, 2015 at 4:49:15 PM UTC, Ben Bacarisse wrote:
> >> Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
> >> 
> >> > On Wednesday, December 2, 2015 at 2:23:29 PM UTC, Ben Bacarisse wrote:
> >> >> Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
> >> >> <snip>
> >> >> > UTF-8, 8-bit chars
> >> >> >
> >> >> > char s[] = "utf-8 here";
> >> >> > int ch;
> >> >> > char *ptr = s;
> >> >> > while(ch = bbx_get_utf8(ptr))
> >> >> > {
> >> >> >   char buff[32] = {0};
> >> >> >   strncpy(buff, ptr, bbx_utf8_skip(ptr));
> >> >> >    printf_utf8("%s : hex %08x dec %d\n", buff, ch, ch);
> >> >> 
> >> >> What does printf_utf8 do?
> >> >> 
> >> > You need a version of printf() which will actually display UTF-8 as
> >> > the proper glyphs
> >> > rather than as extended ascii or IBM character graphics.
> >> 
> >> I guessed that was the purpose, but wanted to know what it *does* --
> >> i.e. how it achieves this purpose.  What is it that you do that printf
> >> does not?
> >> 
> >> > Maybe you can achieve that with set locale. I'm a bit sceptical. %s
> >> > has a better chance of
> >> > working than passing a unicode code point to %c, though it makes
> >> > the code a bit
> >> > messier.
> >> 
> >> I don't see any %c.  Do you use them inside printf_utf8?
> >> 
> > The system I have actually has a function called bbx_write_utf8
> > which renders a UTF-8 string to an rgba buffer. Usually, of course
> > other functions then display the buffer in a window.
> 
> I thought it was like printf.  That's partly because of the name and
> partly because you seemed to be answering BartC's request for code that
> does what his did but was UTF-8 aware.  Whilst he could use
> bbx_utf8_skip instead of adding one (it's presumably nothing more
> than an array lookup) he could not use printf_utf8 instead of printf
> even if you published the code.
> 
Depends on his requirements. If you want a program that runs
an interactive console and is portable to Windows, Linux, and 
other systems with a bit of effort, the Baby X console widget
is a perfectly viable solution. If you must have stdin and stdout,
maybe because you need to interact with other programs launched
from a shell, it's not suitable.
But then you need some way of ensuring that bytes passed to stdout 
are displayed in the desired fashion. stdout might be a C
concept. printf might just rasterise images. Admittedly you'd
have a layer between the formatting code and the rasteriser. Or
it might be a system concept. The latter is common on desktop 
systems.



#77693

From: Ben Bacarisse <ben.usenet@bsb.me.uk>
Date: 2015-12-03 01:43 +0000
Message-ID: <878u5c5mtv.fsf@bsb.me.uk>
In reply to: #77660
Malcolm McLean <malcolm.mclean5@btinternet.com> writes:

> On Wednesday, December 2, 2015 at 8:02:32 PM UTC, Ben Bacarisse wrote:
>> Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
>> 
>> > On Wednesday, December 2, 2015 at 4:49:15 PM UTC, Ben Bacarisse wrote:
>> >> Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
>> >> 
>> >> > On Wednesday, December 2, 2015 at 2:23:29 PM UTC, Ben Bacarisse wrote:
>> >> >> Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
>> >> >> <snip>
>> >> >> > UTF-8, 8-bit chars
>> >> >> >
>> >> >> > char s[] = "utf-8 here";
>> >> >> > int ch;
>> >> >> > char *ptr = s;
>> >> >> > while(ch = bbx_get_utf8(ptr))
>> >> >> > {
>> >> >> >   char buff[32] = {0};
>> >> >> >   strncpy(buff, ptr, bbx_utf8_skip(ptr));
>> >> >> >    printf_utf8("%s : hex %08x dec %d\n", buff, ch, ch);
>> >> >> 
>> >> >> What does printf_utf8 do?
>> >> >> 
>> >> > You need a version of printf() which will actually display UTF-8 as
>> >> > the proper glyphs
>> >> > rather than as extended ascii or IBM character graphics.
>> >> 
>> >> I guessed that was the purpose, but wanted to know what it *does* --
>> >> i.e. how it achieves this purpose.  What is it that you do that printf
>> >> does not?
>> >> 
>> >> > Maybe you can achieve that with set locale. I'm a bit sceptical. %s
>> >> > has a better chance of
>> >> > working than passing a unicode code point to %c, though it makes
>> >> > the code a bit
>> >> > messier.
>> >> 
>> >> I don't see any %c.  Do you use them inside printf_utf8?
>> >> 
>> > The system I have actually has a function called bbx_write_utf8
>> > which renders a UTF-8 string to an rgba buffer. Usually, of course
>> > other functions then display the buffer in a window.
>> 
>> I thought it was like printf.  That's partly because of the name and
>> partly because you seemed to be answering BartC's request for code that
>> does what his did but was UTF-8 aware.  Whilst he could use
>> bbx_utf8_skip instead of adding one (it's presumably nothing more
>> than an array lookup) he could not use printf_utf8 instead of printf
>> even if you published the code.
>> 
> Depends on his requirements.

Of course.  He appeared to want output and not rendering since he used
printf.  It's quite reasonable of me to interpret your printf_utf8 in
that context.  Obviously you want to promote Baby X (and that's fine)
but there's no indication that BartC wanted anything like that.

<snip>
-- 
Ben.



#77640

From: Keith Thompson <kst-u@mib.org>
Date: 2015-12-02 09:21 -0800
Message-ID: <lnzixssr4t.fsf@kst-u.example.com>
In reply to: #77628
Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
> On Wednesday, December 2, 2015 at 2:23:29 PM UTC, Ben Bacarisse wrote:
>> Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
[...]
>> >    printf_utf8("%s : hex %08x dec %d\n", buff, ch, ch);
>> 
>> What does printf_utf8 do?
>> 
> You need a version of printf() which will actually display UTF-8 as
> the proper glyphs rather than as extended ascii or IBM character
> graphics.  Maybe you can achieve that with set locale. I'm a bit
> sceptical. %s has a better chance of working than passing a unicode
> code point to %c, though it makes the code a bit messier.
[...]

printf() just sends a sequence of bytes to stdout.  It's up to
whatever stdout is connected to to interpret those bytes.  It does
perform some translations in text mode, but that's typically limited
to end-of-line processing (and not even that on POSIX systems).

printf doesn't *display* anything.  For example, stdout might go to
a disk file.  If it goes to a terminal or emulator, it's up to the
terminal and its settings to determine how the bytes are displayed.

printf("%c", ...) prints a single one-byte character.  The argument
is of type int, and is converted to unsigned char.  It doesn't handle
multi-byte characters (unless you invoke it once for each byte).
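
A minimal sketch of the "once for each byte" approach, assuming the
stream's consumer expects UTF-8. The names utf8_encode and put_codepoint
are made up for illustration; they are not standard library functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Encode one Unicode code point as UTF-8 into out[] (4 bytes of room).
   Returns the number of bytes written, or 0 if cp is not a valid
   scalar value (a surrogate, or above U+10FFFF).
   Hypothetical helper, not part of the C library. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;               /* surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Write one code point to fp as UTF-8 bytes.
   Returns 0 on success, -1 on an invalid code point or write error. */
int put_codepoint(uint32_t cp, FILE *fp)
{
    unsigned char buf[4];
    size_t n = utf8_encode(cp, buf);

    if (n == 0)
        return -1;
    return fwrite(buf, 1, n, fp) == n ? 0 : -1;
}
```

Calling put_codepoint(0x20AC, stdout) sends the three bytes E2 82 AC;
whether a euro sign then appears is still up to whatever is reading
stdout.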

-- 
Keith Thompson (The_Other_Keith) kst-u@mib.org  <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"



#77608

From: Richard Damon <Richard@Damon-Family.org>
Date: 2015-12-02 07:29 -0500
Message-ID: <suB7y.239636$eK.126102@fx11.iad>
In reply to: #77606
On 12/2/15 7:08 AM, BartC wrote:
> On 02/12/2015 01:48, Richard Damon wrote:
>> On 12/1/15 3:02 PM, BartC wrote:
>>>
>>> This is the problem I have with people saying that UTF8 can be used
>>> transparently.
>>
>> There is one simple rule that a program needs to follow for UTF-8 to be
>> transparent: the program must only break the string at the boundaries of
>> known characters (either at known characters, like space or newline,
>> or at the boundaries of known valid strings).
>>
>> UTF-8 is specifically designed so that no encoded code point is a
>> sub-string of another encoded code point.
>>
>> Programs that assume that a string can be broken at arbitrary points,
>> can be broken by UTF-8.
>
> I think 99% of the programs I've ever written have needed at some point
> to deal with strings character by character.
>
> Actually just a few days ago I had to take a bunch of separate strings,
> truncate each one to fit a certain pixel-width, and combine them into
> one string separated by tabs and 'reverse-tabs' which corresponded to a
> particular pixel-specified table of tab stops.
>
> (A reverse-tab will right-justify one of the original strings within a
> column. The string is written in a proportional font to a raster display.)
>
> If I can't use 8-bit characters for that then I'd need to switch to 16
> or 32-bit ones. Then I could use the same code, as only data types will
> have changed.
>
> Trying to do it directly on UTF8 would mean rewriting pretty much
> everything.
>
> BTW I notice nobody has given me an example of how to rewrite that
> little example of mine to properly display the separate characters of a
> string (show each one on a separate line together with its hex and
> decimal code).
>
> I guess it's not so easy after all...
>

If you are writing code that assumes one location is a glyph, then to 
handle Unicode you need to make it fully Unicode-aware, because of 
things like combining characters. If you just need one location to be a 
code point, then you need to work in UCS-4 (32-bit Unicode) or maybe 
UCS-2 if you only need to deal

The alternative would be to have your application understand UTF-8, and 
thus know how to break down a UTF-8 string into chars.
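
As a rough sketch of that alternative, here is one way the "each
character on its own line with its hex and decimal code" example could
be written for UTF-8 input. The function names are invented for
illustration, and the code assumes the output device interprets the
bytes as UTF-8; it splits by code point, so combining sequences are
still shown as separate lines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Decode one code point from the NUL-terminated UTF-8 string s.
   Stores it in *cp and returns the number of bytes used (1 to 4),
   or 0 if s starts with an invalid, overlong or truncated sequence.
   Hypothetical helper, not part of the C library. */
size_t utf8_decode(const char *s, uint32_t *cp)
{
    static const uint32_t min_for_len[5] = {0, 0, 0x80, 0x800, 0x10000};
    const unsigned char *p = (const unsigned char *)s;
    size_t len, i;
    uint32_t c;

    assert(s != NULL && cp != NULL);
    if (p[0] < 0x80) {
        *cp = p[0];
        return 1;
    } else if ((p[0] & 0xE0) == 0xC0) {
        len = 2; c = p[0] & 0x1F;
    } else if ((p[0] & 0xF0) == 0xE0) {
        len = 3; c = p[0] & 0x0F;
    } else if ((p[0] & 0xF8) == 0xF0) {
        len = 4; c = p[0] & 0x07;
    } else {
        return 0;   /* stray continuation byte or invalid lead byte */
    }
    for (i = 1; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80)
            return 0;   /* not a continuation byte (or the NUL) */
        c = (c << 6) | (p[i] & 0x3F);
    }
    if (c < min_for_len[len] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;       /* overlong form, out of range, or surrogate */
    *cp = c;
    return len;
}

/* Print each code point of s on its own line, followed by its hex and
   decimal value.  Returns the number of code points printed, or -1 if
   malformed UTF-8 is encountered. */
int dump_codepoints(const char *s, FILE *fp)
{
    int count = 0;

    while (*s != '\0') {
        uint32_t cp;
        size_t n = utf8_decode(s, &cp);

        if (n == 0)
            return -1;
        fprintf(fp, "%.*s : hex %08lx dec %lu\n",
                (int)n, s, (unsigned long)cp, (unsigned long)cp);
        s += n;
        count++;
    }
    return count;
}
```

The only change from a byte-at-a-time loop is that the pointer advances
by the decoded length rather than by one.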

It is a natural fact that if you have more than 256 different 
'characters' you can't use a byte to hold them.

Since you say you have a proportional font, then by definition your 
program MUST know every possible character you are displaying, and if 
you want to handle full Unicode you will need a VERY big font table, and 
your program must understand a lot about how Unicode works.



#77614

From: Malcolm McLean <malcolm.mclean5@btinternet.com>
Date: 2015-12-02 05:47 -0800
Message-ID: <58e23eff-31b2-49c2-b14d-d1c4deb9456c@googlegroups.com>
In reply to: #77608
On Wednesday, December 2, 2015 at 12:29:24 PM UTC, Richard Damon wrote:
> On 12/2/15 7:08 AM, BartC wrote:
> 
> Since you say you have a proportional font, then by definition your 
> program MUST know every possible character you are displaying, and if 
> you want to handle full Unicode you will need a VERY big font table, and 
> your program must understand a lot about how Unicode works.
>
No, because usually the user selects the font, which typically will contain
ASCII plus a subset of non-English glyphs he's interested in. Everything else
is keyed to the missing character, an open box.
So when Baby X renders a UTF-8 string, it does a binary search on the code
point, and if it doesn't find it, inserts the missing glyph. Of course there's
a function for determining string width given a font. It's absolutely central.
It's a bit naive because I'm not yet handling compositing glyphs or
right-to-left languages.
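
A sketch of that kind of lookup. The struct layout and function names
below are assumptions made for illustration, not Baby X's actual data
structures or API, and the width function takes already-decoded code
points to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical font layout; not Baby X's real structures. */
struct glyph {
    uint32_t codepoint;
    int advance;                 /* horizontal advance in pixels */
};

struct font {
    const struct glyph *glyphs;  /* sorted by ascending codepoint */
    size_t count;
    struct glyph missing;        /* the "open box" fallback */
};

/* bsearch comparator: key is a code point, elem is a struct glyph. */
static int cmp_glyph(const void *key, const void *elem)
{
    uint32_t cp = *(const uint32_t *)key;
    const struct glyph *g = elem;

    return (cp > g->codepoint) - (cp < g->codepoint);
}

/* Binary search for cp; fall back to the missing glyph if absent. */
const struct glyph *font_lookup(const struct font *f, uint32_t cp)
{
    const struct glyph *g;

    assert(f != NULL);
    g = bsearch(&cp, f->glyphs, f->count, sizeof f->glyphs[0], cmp_glyph);
    return g != NULL ? g : &f->missing;
}

/* Width in pixels of n code points, ignoring kerning, combining
   characters and right-to-left layout. */
int text_width(const struct font *f, const uint32_t *cps, size_t n)
{
    int width = 0;
    size_t i;

    for (i = 0; i < n; i++)
        width += font_lookup(f, cps[i])->advance;
    return width;
}
```

With a sorted table the lookup cost grows with the logarithm of the
number of glyphs, so a font covering a few thousand code points needs
only a dozen or so comparisons per character.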



#77635

From: Stephen Sprunk <stephen@sprunk.org>
Date: 2015-12-02 11:03 -0600
Message-ID: <n3n843$fn3$1@dont-email.me>
In reply to: #77614
On 02-Dec-15 07:47, Malcolm McLean wrote:
> Richard Damon wrote:
>> On 12/2/15 7:08 AM, BartC wrote:
>> Since you say you have a proportional font, then by definition your
>> program MUST know every possible character you are displaying, and
>> if you want to handle full Unicode you will need a VERY big font
>> table, and your program must understand a lot about how Unicode works.
> 
> No, because usually the user selects the font, which typically will
> contain ascii plus a subset of non-English glyphs he's interested in.

Many fonts contain glyphs for numerous scripts, and many others don't
contain glyphs for the ASCII range because there's no need.

> Everything else is keyed to the missing character, an open box.

That's one option, but in recent years, the trend has been to search
_every installed font_ for one that can render the missing character,
and many systems include one or more "fonts of last resort" that cover
virtually everything; a mixture of fonts may not look ideal, but it
simplifies things for users--and doesn't discourage users from using
"unusual" characters.

S

-- 
Stephen Sprunk         "God does not play dice."  --Albert Einstein
CCIE #3723         "God is an inveterate gambler, and He throws the
K5SSS        dice at every possible opportunity." --Stephen Hawking



#77617

From: BartC <bc@freeuk.com>
Date: 2015-12-02 14:16 +0000
Message-ID: <n3mub3$7e5$1@dont-email.me>
In reply to: #77608
On 02/12/2015 12:29, Richard Damon wrote:
> On 12/2/15 7:08 AM, BartC wrote:

>> Actually just a few days ago I had to take a bunch of separate strings,
>> truncate each one to fit a certain pixel-width, and combine them into
>> one string separated by tabs and 'reverse-tabs' which corresponded to a
>> particular pixel-specified table of tab stops.
>>
>> (A reverse-tab will right-justify one of the original strings within a
>> column. The string is written in a proportional font to a raster
>> display.)

> Since you say you have a proportional font, then by definition your
> program MUST know every possible character you are displaying, and if
> you want to handle full Unicode you will need a VERY big font table, and
> your program must understand a lot about how Unicode works.

I use Windows' API to display the text, and to obtain information about 
its dimensions (using TextOutA, GetTextExtentPoint32 etc.)

My code is still responsible for selecting the appropriate font size, 
positioning the text where I want it, and character-clipping it as needed 
(clipping to part-characters must be done by setting a clip region).

Also TextOutA doesn't handle embedded control codes, which I have to 
take care of (involving scanning by character, extracting the control 
codes and making adjustments to the current pixel position, and 
assembling sub-strings of printable characters to pass to TextOutA).
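
A sketch of that kind of scan, under the assumption that every byte
below 0x20 counts as a control code. The token type and next_token are
invented names for illustration. Each TOKEN_RUN is the sort of
sub-string that would be handed to the drawing call, and each
TOKEN_CONTROL is where the pixel position would be adjusted.

```c
#include <assert.h>
#include <stddef.h>

enum token_kind { TOKEN_RUN, TOKEN_CONTROL };

struct token {
    enum token_kind kind;
    const char *start;   /* points into the original string */
    size_t len;          /* length in bytes */
};

/* Fetch the next token from *pos and advance *pos past it.
   A token is either a single control byte (below 0x20, an assumption
   made for this sketch) or the longest run of other bytes.
   Returns 1 if a token was produced, 0 at the end of the string. */
int next_token(const char **pos, struct token *tok)
{
    const char *s;

    assert(pos != NULL && *pos != NULL && tok != NULL);
    s = *pos;
    if (*s == '\0')
        return 0;
    tok->start = s;
    if ((unsigned char)*s < 0x20) {
        tok->kind = TOKEN_CONTROL;
        s++;
    } else {
        tok->kind = TOKEN_RUN;
        while (*s != '\0' && (unsigned char)*s >= 0x20)
            s++;
    }
    tok->len = (size_t)(s - tok->start);
    *pos = s;
    return 1;
}
```

Every byte of a multi-byte UTF-8 sequence is 0x80 or above, so this
particular loop never splits a UTF-8 character and works unchanged on
ASCII and UTF-8 input. Measuring or clipping the runs is a different
matter.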

I'm saying all this as an example of where sometimes you need to get 
inside a string and not just treat it as an opaque byte-array.

I'm surprised actually that apparently few here (in a group for a 
low-level language) ever have to look at the characters of a string.

-- 
Bartc



#77662 — Re: Working efficiently with 32-bit Unicode output streams, locale etc.

From: Ian Collins <ian-news@hotmail.com>
Date: 2015-12-03 09:56 +1300
Subject: Re: Working efficiently with 32-bit Unicode output streams, locale etc.
Message-ID: <dc948dFi96mU7@mid.individual.net>
In reply to: #77617
BartC wrote:
> On 02/12/2015 12:29, Richard Damon wrote:
>> On 12/2/15 7:08 AM, BartC wrote:
>
>>> Actually just a few days ago I had to take a bunch of separate strings,
>>> truncate each one to fit a certain pixel-width, and combine them into
>>> one string separated by tabs and 'reverse-tabs' which corresponded to a
>>> particular pixel-specified table of tab stops.
>>>
>>> (A reverse-tab will right-justify one of the original strings within a
>>> column. The string is written in a proportional font to a raster
>>> display.)
>
>> Since you say you have a proportional font, then by definition your
>> program MUST know every possible character you are displaying, and if
>> you want to handle full Unicode you will need a VERY big font table, and
>> your program must understand a lot about how Unicode works.
>
> I use Windows' API to display the text, and to obtain information about
> its dimensions (using TextOutA, GetTextExtentPoint32 etc.)
>
> My code is still responsible for selecting the appropriate font size,
> positioning the text where I want it, and character-clipping it as needed
> (clipping to part-characters must be done by setting a clip region).
>
> Also TextOutA doesn't handle embedded control codes, which I have to
> take care of (involving scanning by character, extracting the control
> codes and making adjustments to the current pixel position, and
> assembling sub-strings of printable characters to pass to TextOutA).
>
> I'm saying all this as an example of where sometimes you need to get
> inside a string and not just treat it as an opaque byte-array.
>
> I'm surprised actually that apparently few here (in a group for a
> low-level language) ever have to look at the characters of a string.

I'm sure many do, I do a lot of work with string manipulation.  Most low 
level string operations are searches and extraction of sub-strings which 
work fine for ASCII or UTF-8 text.
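
A small illustration, assuming both strings are valid UTF-8. The helper
name after_separator is made up for this sketch; strstr and strlen are
the ordinary byte-oriented library functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the text following the first occurrence of sep
   in text, or NULL if sep does not occur.  Nothing here knows about
   UTF-8; the search is purely byte-wise. */
const char *after_separator(const char *text, const char *sep)
{
    const char *hit;

    assert(text != NULL && sep != NULL);
    hit = strstr(text, sep);
    return hit != NULL ? hit + strlen(sep) : NULL;
}
```

Because UTF-8 lead bytes and continuation bytes use disjoint ranges, a
byte-wise match of one valid UTF-8 string inside another always starts
and ends on a character boundary, so the extracted sub-string is itself
valid UTF-8. The separator can be a multi-byte character such as
U+2192 (bytes E2 86 92) without any special handling.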

What you describe is more than simple string manipulation: calculating 
the displayed size of a string is one case where you have to be aware of 
the encoding.  In all of the cases I can think of where you need to be 
able to distinguish displayed characters from raw bytes you also need to 
know details of the font and display.

-- 
Ian Collins


