| Started by | fir <profesor.fir@gmail.com> |
|---|---|
| First post | 2015-12-02 08:01 -0800 |
| Last post | 2015-12-06 13:45 +0000 |
| Articles | 20 on this page of 158 — 25 participants |
unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-02 08:01 -0800
Re: unicode is a fail me <self@example.org> - 2015-12-02 16:12 +0000
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-02 09:09 -0800
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-02 08:18 -0800
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-02 09:07 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 11:21 -0600
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-02 09:40 -0800
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 11:22 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 15:59 -0600
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 16:25 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 19:47 -0600
Re: unicode is a fail supercat@casperkitty.com - 2015-12-02 14:38 -0800
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 16:26 -0800
Re: unicode is a fail Tim Rentsch <txr@alumni.caltech.edu> - 2015-12-09 11:33 -0800
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-09 12:21 -0800
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 11:28 +0100
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 08:50 -0600
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 16:38 +0100
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 10:01 -0600
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-03 09:46 -0800
Re: unicode is a fail raltbos@xs4all.nl (Richard Bos) - 2015-12-04 12:39 +0000
Re: unicode is a fail supercat@casperkitty.com - 2015-12-03 08:26 -0800
Re: unicode is a fail glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-03 18:42 +0000
Re: unicode is a fail supercat@casperkitty.com - 2015-12-03 17:14 -0800
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 19:02 -0800
Re: unicode is a fail glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-04 06:35 +0000
Re: unicode is a fail David Thompson <dave.thompson2@verizon.net> - 2015-12-28 05:11 -0500
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 10:24 -0600
Re: unicode is a fail Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-03 22:37 +0000
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-04 11:32 +0100
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 11:10 -0600
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-02 09:24 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 13:10 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-02 19:45 +0000
Re: unicode is a fail Ian Collins <ian-news@hotmail.com> - 2015-12-03 09:08 +1300
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 14:10 -0600
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 11:27 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 15:21 -0600
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 15:18 -0800
Re: unicode is a fail raltbos@xs4all.nl (Richard Bos) - 2015-12-04 12:45 +0000
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 09:43 -0800
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-02 11:40 -0800
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 12:19 -0800
Re: unicode is a fail Nobody <nobody@nowhere.invalid> - 2015-12-02 21:23 +0000
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 10:12 +0100
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 02:13 -0800
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 14:11 +0100
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 05:17 -0800
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 15:33 +0100
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 07:05 -0800
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 16:42 +0100
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 07:58 -0800
Re: unicode is a fail Richard Heathfield <rjh@cpax.org.uk> - 2015-12-03 10:38 +0000
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 14:17 +0100
Re: unicode is a fail raltbos@xs4all.nl (Richard Bos) - 2015-12-04 12:54 +0000
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-04 14:25 +0100
Re: unicode is a fail Richard Heathfield <rjh@cpax.org.uk> - 2015-12-04 13:46 +0000
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-02 23:24 +0000
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-03 00:45 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 20:59 -0600
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-02 19:13 -0800
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-03 07:00 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-04 04:45 -0800
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-04 18:04 +0000
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-04 13:22 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-04 07:35 -0800
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-04 19:17 +0000
Re: unicode is a fail supercat@casperkitty.com - 2015-12-04 11:49 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 15:39 -0600
Re: unicode is a fail supercat@casperkitty.com - 2015-12-04 14:19 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-06 12:57 -0600
Re: unicode is a fail supercat@casperkitty.com - 2015-12-06 15:47 -0800
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-05 01:13 +0000
Re: unicode is a fail Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-05 01:59 +0000
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-05 17:17 +0100
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-06 06:28 +0000
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-04 23:46 +0000
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-05 01:04 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 03:21 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-05 13:03 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-05 11:47 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 04:40 -0800
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-05 13:26 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-05 13:35 -0600
Re: unicode is a fail glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-06 02:23 +0000
Re: unicode is a fail Udyant Wig <udyantw@gmail.com> - 2015-12-06 16:09 +0530
Re: unicode is a fail Xavier <zaz.colmant@free.fr> - 2015-12-05 15:45 +0100
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 07:42 -0800
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-05 16:32 -0800
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 18:11 -0800
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-06 02:19 +0000
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-06 13:09 +0000
Re: unicode is a fail Martin Shobe <martin.shobe@yahoo.com> - 2015-12-06 18:38 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-07 01:55 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-06 19:14 -0800
Re: unicode is a fail Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-07 13:53 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-07 06:31 -0800
Re: unicode is a fail Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-07 21:22 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-07 15:34 -0600
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-07 16:36 -0800
Re: unicode is a fail Lowell Gilbert <lgusenet@be-well.ilk.org> - 2015-12-08 11:40 -0500
Re: unicode is a fail Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-08 17:18 +0000
Re: unicode is a fail "Osmium" <r124c4u102@comcast.net> - 2015-12-09 08:36 -0600
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-09 10:06 -0600
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-09 09:35 -0800
Re: unicode is a fail supercat@casperkitty.com - 2015-12-09 10:07 -0800
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-09 12:04 -0800
Re: unicode is a fail supercat@casperkitty.com - 2015-12-09 12:35 -0800
Re: unicode is a fail glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-09 23:46 +0000
Re: unicode is a fail supercat@casperkitty.com - 2015-12-09 16:15 -0800
Re: unicode is a fail glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-10 03:49 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-09 18:12 -0600
Re: unicode is a fail James Kuyper <jameskuyper@verizon.net> - 2015-12-09 13:12 -0500
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-09 12:12 -0800
Re: unicode is a fail raltbos@xs4all.nl (Richard Bos) - 2015-12-10 20:48 +0000
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-09 23:44 +0000
Re: unicode is a fail Robert Wessel <robertwessel2@yahoo.com> - 2015-12-10 01:13 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-10 10:39 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-10 03:33 -0800
Re: unicode is a fail supercat@casperkitty.com - 2015-12-10 06:07 -0800
Re: unicode is a fail "Osmium" <r124c4u102@comcast.net> - 2015-12-10 08:21 -0600
Re: unicode is a fail Robert Wessel <robertwessel2@yahoo.com> - 2015-12-10 00:59 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-07 14:33 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-06 22:45 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-07 12:38 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-07 13:55 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-07 21:14 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-07 16:50 -0600
Re: unicode is a fail Robert Wessel <robertwessel2@yahoo.com> - 2015-12-07 02:38 -0600
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-06 07:34 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-06 00:24 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 19:49 -0600
Re: unicode is a fail Richard Heathfield <rjh@cpax.org.uk> - 2015-12-05 21:32 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 13:50 -0800
Re: unicode is a fail Richard Heathfield <rjh@cpax.org.uk> - 2015-12-05 22:15 +0000
Re: unicode is a fail James Kuyper <jameskuyper@verizon.net> - 2015-12-05 17:27 -0500
Re: unicode is a fail Richard Heathfield <rjh@cpax.org.uk> - 2015-12-05 23:06 +0000
Re: unicode is a fail James Kuyper <jameskuyper@verizon.net> - 2015-12-05 18:29 -0500
Re: unicode is a fail Richard Heathfield <rjh@cpax.org.uk> - 2015-12-05 23:50 +0000
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-06 06:38 +0000
Re: unicode is a fail raltbos@xs4all.nl (Richard Bos) - 2015-12-06 13:33 +0000
Re: unicode is a fail James Kuyper <jameskuyper@verizon.net> - 2015-12-05 16:51 -0500
Re: unicode is a fail Ian Collins <ian-news@hotmail.com> - 2015-12-06 10:59 +1300
Re: unicode is a fail Ian Collins <ian-news@hotmail.com> - 2015-12-06 11:00 +1300
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-06 06:31 +0000
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-02 17:48 -0800
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-03 01:20 -0800
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-03 02:02 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 09:43 -0600
Re: unicode is a fail raltbos@xs4all.nl (Richard Bos) - 2015-12-04 12:55 +0000
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-04 18:29 +0000
Re: unicode is a fail Jorgen Grahn <grahn+nntp@snipabacken.se> - 2015-12-05 16:42 +0000
Re: unicode is a fail Jorgen Grahn <grahn+nntp@snipabacken.se> - 2015-12-05 10:06 +0000
OT: Usenet (Was: unicode is a fail) Steve Thompson <stevet810@gmail.com> - 2015-12-05 20:41 +0000
Re: OT: Usenet (Was: unicode is a fail) Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 13:18 -0800
Re: unicode is a fail Udyant Wig <udyantw@gmail.com> - 2015-12-06 10:21 +0530
OT: Facebook (was Re: unicode is a fail) Jorgen Grahn <grahn+nntp@snipabacken.se> - 2015-12-06 08:51 +0000
Re: OT: Facebook (was Re: unicode is a fail) raltbos@xs4all.nl (Richard Bos) - 2015-12-06 13:45 +0000
| From | Malcolm McLean <malcolm.mclean5@btinternet.com> |
|---|---|
| Date | 2015-12-02 19:13 -0800 |
| Message-ID | <f68cba27-d18e-4334-afaf-45daf0aaad34@googlegroups.com> |
| In reply to | #77685 |
On Thursday, December 3, 2015 at 12:45:31 AM UTC, Bart wrote:
> On 02/12/2015 23:24, Steve Thompson wrote:
> > On Wed, Dec 02, 2015 at 08:01:52AM -0800, fir wrote:
> >
> > In the long term, US-ASCII will be restricted to select niches, such
> > as extremely tiny embedded processors. Or simulators of legacy
> > hardware, etc. There is no reason to think that a text encoding
> > scheme that cannot represent arbitrary language symbols will survive
> > much into the future.
>
> An encoding scheme that can only describe English?
>
> (Or, with a few dozen additions to fill those spare 128 slots, French,
> Italian, Spanish, German, maybe even pin-yin.)
>
> You're right, that is no use at all...

You use escapes. Until UTF-8 became popular, and even now, you'll see Greek characters encoded in html as "&theta;" and so on. It's easier if the text is essentially English with just one or two embedded Greek symbols, although it's not a sensible method for encoding flowing Greek text.

Then there is a need for special fonts. Bar codes are usually created using fonts, for example.

Another interesting use of computers is to try to decode the Voynich manuscript. It's written in a script which has been discovered nowhere else, but most of it can be divided into what seem to be pretty clearly characters, with of course a few difficulties, and the nagging suspicion that maybe a character-based analysis has got the total wrong end of the stick.
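The escape approach described above can be sketched in C. This is only an illustration, not anything posted in the thread: the helper name `html_escape_codepoint` is hypothetical, and it emits numeric character references (such as `&#x3B8;` for theta) rather than named entities, because the numeric form needs no lookup table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write one code point as ASCII-only HTML text.
   ASCII values pass through unchanged; anything else becomes a
   numeric character reference such as &#x3B8; (Greek theta).
   HTML-special characters such as & and < are not escaped here.
   Returns the number of characters written (excluding the NUL),
   or -1 if the buffer is too small. */
int html_escape_codepoint(unsigned long cp, char *out, size_t outsz)
{
    int n;

    assert(out != NULL);
    if (cp < 0x80)
        n = snprintf(out, outsz, "%c", (int)cp);
    else
        n = snprintf(out, outsz, "&#x%lX;", cp);
    return (n < 0 || (size_t)n >= outsz) ? -1 : n;
}
```

Calling it with `0x3B8` and a 16-byte buffer fills the buffer with `&#x3B8;`. As the post says, this stays readable for a symbol or two in English text but quickly becomes unwieldy for running Greek.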
| From | Steve Thompson <stevet810@gmail.com> |
|---|---|
| Date | 2015-12-03 07:00 +0000 |
| Message-ID | <2qyvC0.96Q.SQT8q@gmail.com> |
| In reply to | #77685 |
On Thu, Dec 03, 2015 at 12:45:10AM +0000, BartC wrote:
> On 02/12/2015 23:24, Steve Thompson wrote:
> > On Wed, Dec 02, 2015 at 08:01:52AM -0800, fir wrote:
> >
> > In the long term, US-ASCII will be restricted to select niches, such
> > as extremely tiny embedded processors. Or simulators of legacy
> > hardware, etc. There is no reason to think that a text encoding
> > scheme that cannot represent arbitrary language symbols will survive
> > much into the future.
>
> An encoding scheme that can only describe English?
>
> (Or, with a few dozen additions to fill those spare 128 slots, French,
> Italian, Spanish, German, maybe even pin-yin.)
>
> You're right, that is no use at all...

But which languages? With 128 characters you can't support all of them (but perhaps most). If you propose to keep ASCII and use code pages (as I think I understand the scheme) to determine which symbols occupy those 128 extra positions at any given time, you have merely punted the problem to another layer.

Now I am not saying you should not use a translation layer in your code, and then do what you will with the upper half of char, but I think you will still need to deal with UTF8 for language strings traversing your software. I'm not sure that the extra work is worth it, at least for my purposes, and as it stands I intend to bite the bullet and just deal with UTF8 wherever I need to interpret or display language text.

That doesn't mean I will necessarily enjoy it...

Regards,

Steve Thompson

--
"If I had a nickel for every time some idiot called me about a computer problem that turned out to be user error, I would be able to retire and spend the rest of my days cultivating clues in my backyard hillside garden." -- MysteryDog in 24hoursupport.helpdesk.
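A translation layer of the kind mentioned above might look roughly like the following C sketch. It assumes the code page is ISO 8859-1, chosen here only because each byte value equals its Unicode code point, so no table is needed; any other code page would need a 128-entry table for the upper half. The function names `utf8_encode` and `latin1_to_utf8` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Encode one Unicode code point as UTF-8.
   Returns the number of bytes written (1 to 4), or 0 if cp is a
   surrogate or lies beyond U+10FFFF. */
size_t utf8_encode(unsigned long cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                       /* surrogates are not characters */
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Assumed code page: ISO 8859-1, where byte value == code point. */
size_t latin1_to_utf8(unsigned char byte, unsigned char out[4])
{
    return utf8_encode(byte, out);
}
```

For instance, the Latin-1 byte 0xEF (the "ï" in "naïve") comes out as the two UTF-8 bytes 0xC3 0xAF. The sketch also shows where the problem is "punted" to: the caller still has to know which code page the incoming byte belongs to.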
| From | Malcolm McLean <malcolm.mclean5@btinternet.com> |
|---|---|
| Date | 2015-12-04 04:45 -0800 |
| Message-ID | <b9c423bf-38c2-4db2-bfe4-e07ad212a9d8@googlegroups.com> |
| In reply to | #77811 |
On Friday, December 4, 2015 at 12:34:15 PM UTC, Steve Thompson wrote:
>
> But which languages? With 128 characters you can't support all of
> them (but perhaps most). If you propose to keep ASCII and use code
> pages (as I think I understand the scheme) to determine which symbols
> occupy those 128 extra positions at any given time, you have merely
> punted the problem to another layer. Now I am not saying you should
> not use a translation layer in your code, and then do what you will
> with the upper half of char, but I think you will still need to deal
> with UTF8 for language strings traversing your software. I'm not sure
> that the extra work is worth it, at least for my purposes, and as it
> stands I intend to bite the bullet and just deal with UTF8 wherever I
> need to interpret or display language text.

If you've got a character mapped display with one byte per character, then you need a code page scheme.

Moving to 16 or 32 bits per cell isn't usually a good option, because you still need raster maps for the characters, and you won't be able to store tens of thousands of them. Also, you need changes to the low level display driver to support such a move.
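The storage argument above can be made concrete with a little arithmetic. The sketch below is an illustration under assumed numbers: 1-bit-per-pixel glyphs in 8x16 and 16x16 cells are assumptions chosen for the example, not figures from the post, and `font_bytes` is a hypothetical helper.

```c
#include <assert.h>

/* Bytes needed for a 1-bit-per-pixel raster font.
   Each glyph row is padded to a whole number of bytes. */
unsigned long font_bytes(unsigned long glyphs,
                         unsigned width_px, unsigned height_px)
{
    unsigned long row_bytes = (width_px + 7u) / 8u;

    assert(width_px > 0 && height_px > 0);
    return glyphs * row_bytes * height_px;
}
```

Under these assumptions a 256-glyph code page in 8x16 cells needs `font_bytes(256, 8, 16)` = 4096 bytes, while 65536 glyphs in 16x16 cells need `font_bytes(65536UL, 16, 16)` = 2097152 bytes (2 MiB), which is the kind of jump that makes wide cells unattractive on a small character-mapped display.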
| From | Steve Thompson <stevet810@gmail.com> |
|---|---|
| Date | 2015-12-04 18:04 +0000 |
| Message-ID | <7vwXAN.WTQ.UnoUE@gmail.com> |
| In reply to | #77814 |
On Fri, Dec 04, 2015 at 04:45:55AM -0800, Malcolm McLean wrote:
> On Friday, December 4, 2015 at 12:34:15 PM UTC, Steve Thompson wrote:
> >
> > But which languages? With 128 characters you can't support all of
> > them (but perhaps most). If you propose to keep ASCII and use code
> > pages (as I think I understand the scheme) to determine which symbols
> > occupy those 128 extra positions at any given time, you have merely
> > punted the problem to another layer. Now I am not saying you should
> > not use a translation layer in your code, and then do what you will
> > with the upper half of char, but I think you will still need to deal
> > with UTF8 for language strings traversing your software. I'm not sure
> > that the extra work is worth it, at least for my purposes, and as it
> > stands I intend to bite the bullet and just deal with UTF8 wherever I
> > need to interpret or display language text.
> >
> If you've got a character mapped display with one byte per character,
> then you need a code page scheme.
> Moving to 16 or 32 bits per cell isn't usually a good option, because
> you still need raster maps for the characters, and you won't be able to
> store tens of thousands of them. Also, you need changes to the low level
> display driver to support such a move.

Drivers and such are a specialized problem domain. No one even halfway sane will suggest that you should use UTF8, etc. to represent text in that case. I thought this discussion was mainly concerned with the application program domain. If I were writing software to display text on a microcontroller LCD I'd do something different too.

Regards,

Steve Thompson

--
"If I had a nickel for every time some idiot called me about a computer problem that turned out to be user error, I would be able to retire and spend the rest of my days cultivating clues in my backyard hillside garden." -- MysteryDog in 24hoursupport.helpdesk.
| From | BartC <bc@freeuk.com> |
|---|---|
| Date | 2015-12-04 13:22 +0000 |
| Message-ID | <n3s3tj$8qe$1@dont-email.me> |
| In reply to | #77811 |
On 03/12/2015 07:00, Steve Thompson wrote:
> On Thu, Dec 03, 2015 at 12:45:10AM +0000, BartC wrote:
>> On 02/12/2015 23:24, Steve Thompson wrote:
>>> On Wed, Dec 02, 2015 at 08:01:52AM -0800, fir wrote:
>>
>>> In the long term, US-ASCII will be restricted to select niches, such
>>> as extremely tiny embedded processors. Or simulators of legacy
>>> hardware, etc. There is no reason to think that a text encoding
>>> scheme that cannot represent arbitrary language symbols will survive
>>> much into the future.
>>
>> An encoding scheme that can only describe English?
>>
>> (Or, with a few dozen additions to fill those spare 128 slots, French,
>> Italian, Spanish, German, maybe even pin-yin.)
>>
>> You're right, that is no use at all...
>
> But which languages? With 128 characters you can't support all of
> them (but perhaps most). If you propose to keep ASCII and use code
> pages (as I think I understand the scheme) to determine which symbols
> occupy those 128 extra positions at any given time, you have merely
> punted the problem to another layer.

What is the problem? That any computer in any country should always be able to deal with 99.99% of the World's alphabets that it doesn't care about? (I normally need about 100 characters which is 0.01% of the 1 million in Unicode.)

I suspect that most people are mainly interested in their own language!

So that is something about Unicode I'm not comfortable with. Our nice tidy little alphabet (perhaps one of the reasons the West has been ahead technologically) is swamped by these huge character sets from around the world, which still don't like being marshalled into neat little units.

Fortunately the same ideas were not applied to encodings such as Morse code, semaphore or telex, and we don't have Scrabble sets with a million different blocks to make it international. While dictionaries and books in general still tend to deal with one (or sometimes two) languages at a time. Not 6000.

I think some things should be kept local (like the telephone numbers in a country, road designations, and car registrations, although no doubt people will want to unify all those as well).

As it is, Unicode is unwieldy. I don't think many would disagree.

--
Bartc
| From | Malcolm McLean <malcolm.mclean5@btinternet.com> |
|---|---|
| Date | 2015-12-04 07:35 -0800 |
| Message-ID | <f2fe527f-2fd0-47dc-a9e2-41f832b357f0@googlegroups.com> |
| In reply to | #77820 |
On Friday, December 4, 2015 at 1:22:23 PM UTC, Bart wrote:
>
> What is the problem? That any computer in any country should always be
> able to deal with 99.99% of the World's alphabets that it doesn't care
> about? (I normally need about 100 characters which is 0.01% of the 1
> million in Unicode.)

I've just downloaded a new operating system for this computer (an Apple mini). The image file is 6GB. There's room there for at least one font for every major language. As it happens Chinese is no good to me, Hebrew is. But statistically 20% of us should be Chinese, so the other way round is much more common. But it shouldn't matter. Anyone can sit at my computer, open a document, and if he can read that language, he can read the information.

> I suspect that most people are mainly interested in their own language!

Mainly, yes. I read far more English than I do Hebrew. But I still want Hebrew on occasions - if one of Rick's posts leads to an off-topic question that depends crucially on a word used in scripture, I want to be able to check.

> So that is something about Unicode I'm not comfortable with. Our nice
> tidy little alphabet (perhaps one of the reasons the West has been ahead
> technologically) is swamped by these huge character sets from around the
> world, which still don't like being marshalled into neat little units.
>
> Fortunately the same ideas were not applied to encodings such as Morse
> code, semaphore or telex, and we don't have Scrabble sets with a million
> different blocks to make it international. While dictionaries and books
> in general still tend to deal with one (or sometimes two) languages at a
> time. Not 6000.
>
> I think some things should be kept local (like the telephone numbers in
> a country, road designations, and car registrations, although no doubt
> people will want to unify all those as well).
>
> As it is, Unicode is unwieldy. I don't think many would disagree.

The tongues of men were sundered at Babel.
It's a curse and not a blessing. But we have to deal with it.
| From | Steve Thompson <stevet810@gmail.com> |
|---|---|
| Date | 2015-12-04 19:17 +0000 |
| Message-ID | <y51yVe.p8Y.TmUhC@gmail.com> |
| In reply to | #77820 |
On Fri, Dec 04, 2015 at 01:22:04PM +0000, BartC wrote:
> On 03/12/2015 07:00, Steve Thompson wrote:
> > On Thu, Dec 03, 2015 at 12:45:10AM +0000, BartC wrote:
> > > On 02/12/2015 23:24, Steve Thompson wrote:
> > > > On Wed, Dec 02, 2015 at 08:01:52AM -0800, fir wrote:
> > >
> > > > In the long term, US-ASCII will be restricted to select niches, such
> > > > as extremely tiny embedded processors. Or simulators of legacy
> > > > hardware, etc. There is no reason to think that a text encoding
> > > > scheme that cannot represent arbitrary language symbols will survive
> > > > much into the future.
> > >
> > > An encoding scheme that can only describe English?
> > >
> > > (Or, with a few dozen additions to fill those spare 128 slots, French,
> > > Italian, Spanish, German, maybe even pin-yin.)
> > >
> > > You're right, that is no use at all...
> >
> > But which languages? With 128 characters you can't support all of
> > them (but perhaps most). If you propose to keep ASCII and use code
> > pages (as I think I understand the scheme) to determine which symbols
> > occupy those 128 extra positions at any given time, you have merely
> > punted the problem to another layer.
>
> What is the problem? That any computer in any country should always be
> able to deal with 99.99% of the World's alphabets that it doesn't care
> about? (I normally need about 100 characters which is 0.01% of the 1
> million in Unicode.)
>
> I suspect that most people are mainly interested in their own language!

I don't know about most people, but I am often annoyed that my keyboard does not show a means of generating accented characters. I do not use foreign-language words and phrases often, but when I do it is more than trivially annoying to use a character picker. Which is why I usually write 'naive' instead of 'naïve', etc.

> So that is something about Unicode I'm not comfortable with. Our nice
> tidy little alphabet (perhaps one of the reasons the West has been ahead
> technologically) is swamped by these huge character sets from around the
> world, which still don't like being marshalled into neat little units.

The West? Are you forgetting that Europe is also part of "the West"? The technological lead of the West is another matter, and I am sorry if you are inconvenienced by the catch-up game underway in other parts of the world. Greek, APL, formal logic, mathematics, etc. are all sufficiently pervasive that their symbols merit inclusion in any reasonable general-use character set, and on that basis any fixation on English is bound to be terribly short-sighted.

> Fortunately the same ideas were not applied to encodings such as Morse
> code, semaphore or telex, and we don't have Scrabble sets with a million
> different blocks to make it international. While dictionaries and books
> in general still tend to deal with one (or sometimes two) languages at a
> time. Not 6000.

Again, which languages? Software I use would be prudent to include the capacity to render English, French, German, Swedish (Scandinavian languages generally), Greek, Latin, as well as the characters appropriate to mathematics symbols and so on. The preference of others is bound to be different. My requirements might be met with a 16-bit encoding while an Asian speaker will have a substantially different preference. A single encoding scheme for everyone at least avoids the ghettoization of any one language demographic.

> I think some things should be kept local (like the telephone numbers in
> a country, road designations, and car registrations, although no doubt
> people will want to unify all those as well).

Multi-lingual speakers are not likely to cooperate.

> As it is, Unicode is unwieldy. I don't think many would disagree.

I do not disagree, and if something better arises I am sure many, many programmers will sigh a little sigh of relief. Until then...

Regards,

Steve Thompson

--
"If I had a nickel for every time some idiot called me about a computer problem that turned out to be user error, I would be able to retire and spend the rest of my days cultivating clues in my backyard hillside garden." -- MysteryDog in 24hoursupport.helpdesk.
| From | supercat@casperkitty.com |
|---|---|
| Date | 2015-12-04 11:49 -0800 |
| Message-ID | <9391fc54-b5f8-42b7-93ab-f170fb2d51eb@googlegroups.com> |
| In reply to | #77847 |
On Friday, December 4, 2015 at 1:32:22 PM UTC-6, Steve Thompson wrote:
> I don't know about most people, but I am often annoyed that my
> keyboard does not show a means of generating accented characters. I
> do not use foreign-language words and phrases often, but when I do it
> is more than trivially annoying to use a character picker. Which is
> why I usually write 'naive' instead of 'naïve', etc.

What annoys me is that the US-International keyboard layout turns common ASCII characters into dead-keys. The 1984 Macintosh had a very nice keyboard layout which used option+grave, option+apostrophe, option+shift+apostrophe, etc. as dead keys while leaving grave, apostrophe, and quote marks alone; I have no idea why MS didn't do likewise.

On my own machine I've fixed the problem using a free keyboard-layout utility published by MS. For some bizarre reason, though, it's not possible to simply tell Windows "Here's a text file holding my keyboard layout", nor can one simply use the utility to take the text file and install it into the system. Instead, one must use the utility to generate an application which will in turn install the new layout into the system. I have no idea why things are so complicated. Still, programming in C is a lot nicer when things like quote marks work than when they don't.
| From | Stephen Sprunk <stephen@sprunk.org> |
|---|---|
| Date | 2015-12-04 15:39 -0600 |
| Message-ID | <n3t126$46q$1@dont-email.me> |
| In reply to | #77848 |
On 04-Dec-15 13:49, supercat@casperkitty.com wrote:
> Steve Thompson wrote:
>> I don't know about most people, but I am often annoyed that my
>> keyboard does not show a means of generating accented characters.
>> I do not use foreign-language words and phrases often, but when I
>> do it is more than trivially annoying to use a character picker.
>> Which is why I usually write 'naive' instead of 'naïve', etc.
>
> What annoys me is that the US-International keyboard layout turns
> common ASCII characters into dead-keys. The 1984 Macintosh had a
> very nice keyboard layout which used option+grave, option+apostrophe,
> option+shift+apostrophe, etc. as dead keys while leaving grave,
> apostrophe, and quote marks alone; I have no idea why MS didn't do
> likewise.

MS had to follow what the hardware vendors did, whereas Apple could create their own standard.

I've worked around the problem by having a hotkey that toggles between US and US Intl layouts, but I'd much prefer to "fix" the latter so that AltGr-" et al were dead keys but plain " et al weren't. I could leave such a layout active 24x7; I don't mind the loss of RightAlt to AltGr, but I do mind "Oh!" turning into Öh!" when I don't expect it.

> On my own machine I've fixed the problem using a free
> keyboard-layout utility published by MS.

I've never managed to get it to load on my system, and it's completely unsupported by MS, so no help there. I've tried freeware alternatives, but none of them seem to allow me to mess with dead keys in particular.

> For some weird bizarre reason, though, it's not possible to simply
> tell Windows "Here's a text file holding my keyboard layout",

That would go completely against MS's design philosophy.

S

--
Stephen Sprunk         "God does not play dice."  --Albert Einstein
CCIE #3723         "God is an inveterate gambler, and He throws the
K5SSS        dice at every possible opportunity." --Stephen Hawking
[toc] | [prev] | [next] | [standalone]
| From | supercat@casperkitty.com |
|---|---|
| Date | 2015-12-04 14:19 -0800 |
| Message-ID | <ca45dcd6-31e2-4306-8804-aead50e6ffe9@googlegroups.com> |
| In reply to | #77853 |
On Friday, December 4, 2015 at 3:39:45 PM UTC-6, Stephen Sprunk wrote: > MS had to follow what the hardware vendors did, whereas Apple could > create their own standard. In what way did hardware vendors force MS to regard ' " ` ~ ^ as dead keys when not used with AltGr? > > On my own machine I've fixed the problem using a free > > keyboard-layout utility published by MS. > > I've never managed to get it to load on my system, and it's completely > unsupported by MS, so no help there. I've tried freeware alternatives, > but none of them seem to all me to mess with dead keys in particular. Sorry it doesn't work for you. > > For some weird bizarre reason, though, it's not possible to simply > > tell Windows "Here's a text file holding my keyboard layout", > > That would go completely against MS's design philosophy. MS doesn't require dancing through hoops for fonts or even device drivers. It's possible to go into a device manager and tell windows "I have this INF file I want you to install." INF files aren't the most legible things in the world, but they can still be better than an opaque executable.
[toc] | [prev] | [next] | [standalone]
| From | Stephen Sprunk <stephen@sprunk.org> |
|---|---|
| Date | 2015-12-06 12:57 -0600 |
| Message-ID | <n4209o$606$1@dont-email.me> |
| In reply to | #77855 |
On 04-Dec-15 16:19, supercat@casperkitty.com wrote: > Stephen Sprunk wrote: >> MS had to follow what the hardware vendors did, whereas Apple >> could create their own standard. > > In what way did hardware vendors force MS to regard ' " ` ~ ^ as > dead keys when not used with AltGr? Keycaps, essentially: https://en.wikipedia.org/wiki/File:KB_US-International.svg If those keys were normal without AltGr and dead with AltGr, the keycaps would need to be replaced--and then would have been incorrect when the user exited Windows and ran some other DOS program. And 30 years later, they're still stuck with that decision. Since Apple makes their own hardware, they didn't have to worry as much about such issues--and they care a lot less about backward compatibility than Microsoft does in the first place. >>> For some weird bizarre reason, though, it's not possible to >>> simply tell Windows "Here's a text file holding my keyboard >>> layout", >> >> That would go completely against MS's design philosophy. > > MS doesn't require dancing through hoops for fonts or even device > drivers. It's possible to go into a device manager and tell windows > "I have this INF file I want you to install." INF files aren't the > most legible things in the world, but they can still be better than > an opaque executable. It's rare that drivers are _just_ an INF file; usually there are lots of VXDs and/or DLLs, and the INF lists what needs to be installed. In this specific case, I'm not sure why an INF or similar text format wasn't used. It seems MS treats keyboard layouts as drivers (with all the attendant mess) rather than a unified driver that is infinitely configurable, but often the obvious solution has non-obvious defects. S -- Stephen Sprunk, CCIE #3723, K5SSS "God does not play dice." --Albert Einstein "God is an inveterate gambler, and He throws the dice at every possible opportunity." --Stephen Hawking
[toc] | [prev] | [next] | [standalone]
| From | supercat@casperkitty.com |
|---|---|
| Date | 2015-12-06 15:47 -0800 |
| Message-ID | <eb97e662-ea1d-46af-8b53-6da02968b663@googlegroups.com> |
| In reply to | #77989 |
On Sunday, December 6, 2015 at 12:57:23 PM UTC-6, Stephen Sprunk wrote: > On 04-Dec-15 16:19, supercat wrote: > Keycaps, essentially: > https://en.wikipedia.org/wiki/File:KB_US-International.svg What's the problem? Have one keyboard layout where the dead keys work as they do now, for people who are used to that behavior, and one where red keys are those which, *when typed with Alt+GR*, act as dead keys. The quote/apostrophe key legends would be a little quirky from that regard, but one could say that the blue legends are simply there as a reminder of what accents the key will produce. On the other hand, if Windows had an applet similar to the Macintosh "Key Caps" desk accessory that shipped in System 1.0 the keyboard legends would be largely irrelevant anyway. Even without any legends on the special keys, learning that altGr-apostrophe followed by a vowel puts an aigu over it, etc. would be a lot easier than learning all the alt-number codes for all the different accented characters.
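For anyone who wants the dead-key idea above in concrete terms (an accent key followed by a letter yields a single accented character), here is a rough C sketch. It is only an illustration: the table contents and the names `compose_entry`, `compose_table` and `compose_dead_key` are invented for this example and say nothing about how Windows or Mac OS actually store their layouts.

```c
#include <stddef.h>

/* Hypothetical dead-key table for illustration only:
   (dead key, base letter) -> Unicode code point. */
struct compose_entry {
    char dead;          /* accent key pressed first, e.g. AltGr+apostrophe */
    char base;          /* letter typed afterwards */
    unsigned long cp;   /* resulting code point */
};

static const struct compose_entry compose_table[] = {
    { '\'', 'a', 0x00E1 }, /* a with acute */
    { '\'', 'e', 0x00E9 }, /* e with acute */
    { '`',  'a', 0x00E0 }, /* a with grave */
    { '`',  'e', 0x00E8 }, /* e with grave */
    { '"',  'o', 0x00F6 }, /* o with diaeresis */
    { '"',  'O', 0x00D6 }, /* O with diaeresis */
    { '~',  'n', 0x00F1 }, /* n with tilde */
    { '^',  'e', 0x00EA }, /* e with circumflex */
};

/* Returns the composed code point, or 0 if the pair has no entry.
   A real layout would then typically emit the accent and the letter
   as two separate characters. */
unsigned long compose_dead_key(char dead, char base)
{
    size_t i;
    for (i = 0; i < sizeof compose_table / sizeof compose_table[0]; i++) {
        if (compose_table[i].dead == dead && compose_table[i].base == base)
            return compose_table[i].cp;
    }
    return 0;
}
```

With a table like this, whether plain apostrophe or only AltGr+apostrophe triggers the lookup is purely a question of which key events are routed to `compose_dead_key`; the table itself does not change.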
[toc] | [prev] | [next] | [standalone]
| From | Steve Thompson <stevet810@gmail.com> |
|---|---|
| Date | 2015-12-05 01:13 +0000 |
| Message-ID | <taz0Kc.lc7.GQBOn@gmail.com> |
| In reply to | #77848 |
On Fri, Dec 04, 2015 at 11:49:00AM -0800, supercat@casperkitty.com wrote: > On Friday, December 4, 2015 at 1:32:22 PM UTC-6, Steve Thompson wrote: > > I don't know about most people, but I am often annoyed that my > > On my own machine I've fixed the problem using a free keyboard-layout > utility published by MS. For some weird bizarre reason, though, it's not > possible to simply tell Windows "Here's a text file holding my keyboard > layout", nor can one simply use the utility to take the text file and > install it into the system. Instead, one must use the utility to generate > an application which will in turn install the new layout into the system. > I have no idea why things are so complicated. Still, programming in C is > a lot nicer when things like quote marks work than when they don't. On Linux there is the loadkeys(1) utility which will install a key map from a file, and I have previously (long ago) used it to set up function-key macros, but it is a pain in the ass and admittedly not terribly prominent on my sysadmin to-do list. I have a hard enough time remembering all of my customized editor keystrokes. OTOH, I am currently using gnome-terminal and it happily accepts unicode numbers (similar to the DOS feature, some CTRL-SHIFT-ALT numberpad arrangement which escapes me at the moment) and I might be moderately content with a crib sheet for those occasions where I need those extra symbols. Regards, Steve Thompson -- "If I had a nickel for every time some idiot called me about a computer problem that turned out to be user error, I would be able to retire and spend the rest of my days cultivating clues in my backyard hillside garden." -- MysteryDog in 24hoursupport.helpdesk.
[toc] | [prev] | [next] | [standalone]
| From | Ben Bacarisse <ben.usenet@bsb.me.uk> |
|---|---|
| Date | 2015-12-05 01:59 +0000 |
| Message-ID | <87twnx3bax.fsf@bsb.me.uk> |
| In reply to | #77868 |
Steve Thompson <stevet810@gmail.com> writes: > On Fri, Dec 04, 2015 at 11:49:00AM -0800, supercat@casperkitty.com wrote: >> On Friday, December 4, 2015 at 1:32:22 PM UTC-6, Steve Thompson wrote: >> > I don't know about most people, but I am often annoyed that my >> >> On my own machine I've fixed the problem using a free keyboard-layout >> utility published by MS. For some weird bizarre reason, though, it's not >> possible to simply tell Windows "Here's a text file holding my keyboard >> layout", nor can one simply use the utility to take the text file and >> install it into the system. Instead, one must use the utility to generate >> an application which will in turn install the new layout into the system. >> I have no idea why things are so complicated. Still, programming in C is >> a lot nicer when things like quote marks work than when they don't. > > On Linux there is the loadkeys(1) utility which will install a key map > from a file, and I have previously (long ago) used it to set up > function-key macros, but it is a pain in the ass and admittedly not > terribly prominent on my sysadmin to-do list. I have a hard enough > time remembering all of my customized editor keystrokes. > > OTOH, I am currently using gnome-terminal and it happily accepts > unicode numbers (similar to the DOS feature, some CTRL-SHIFT-ALT > numberpad arrangement which escapes me at the moment) It's Shift+Ctrl+u and then hex digits. It works in almost all programs on my system (including Emacs, though Emacs has its own, often simpler methods). > and I might be > moderately content with a crib sheet for those occasions where I need > those extra symbols. The only key map I modify is to make Insert into Compose so I can get é, ç, ¾ and so on with memorable keys (Compose ' e, Compose , c, Compose 3 4 and so on). -- Ben.
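Since this is comp.lang.c: what a terminal has to do with the hex digits typed after Shift+Ctrl+u can be sketched in a few lines of C, assuming the result is to be emitted as UTF-8. This is only a sketch of the general technique, not what gnome-terminal actually runs; the function names `utf8_encode` and `utf8_from_hex` are made up for the example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Encode one code point as UTF-8 into out, which must hold at least
   5 bytes (4 data bytes plus a terminating NUL).
   Returns the number of data bytes written, or 0 if cp is a surrogate
   or lies beyond U+10FFFF. */
size_t utf8_encode(unsigned long cp, char *out)
{
    size_t n;

    if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
        return 0;

    if (cp < 0x80) {
        out[0] = (char)cp;
        n = 1;
    } else if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        n = 2;
    } else if (cp < 0x10000) {
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        n = 3;
    } else {
        out[0] = (char)(0xF0 | (cp >> 18));
        out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (char)(0x80 | (cp & 0x3F));
        n = 4;
    }
    out[n] = '\0';
    return n;
}

/* Parse a string of hex digits (as typed after Shift+Ctrl+u) and
   encode the resulting code point. Returns 0 on malformed input. */
size_t utf8_from_hex(const char *hex, char *out)
{
    char *end;
    unsigned long cp = strtoul(hex, &end, 16);

    if (end == hex || *end != '\0')
        return 0;
    return utf8_encode(cp, out);
}
```

So typing `e9` would produce the two bytes C3 A9, which a UTF-8 terminal displays as é.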
[toc] | [prev] | [next] | [standalone]
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2015-12-05 17:17 +0100 |
| Message-ID | <n3v2ib$196$1@dont-email.me> |
| In reply to | #77871 |
On 05/12/15 02:59, Ben Bacarisse wrote: > Steve Thompson <stevet810@gmail.com> writes: > >> On Fri, Dec 04, 2015 at 11:49:00AM -0800, supercat@casperkitty.com wrote: >>> On Friday, December 4, 2015 at 1:32:22 PM UTC-6, Steve Thompson wrote: >>>> I don't know about most people, but I am often annoyed that my >>> >>> On my own machine I've fixed the problem using a free keyboard-layout >>> utility published by MS. For some weird bizarre reason, though, it's not >>> possible to simply tell Windows "Here's a text file holding my keyboard >>> layout", nor can one simply use the utility to take the text file and >>> install it into the system. Instead, one must use the utility to generate >>> an application which will in turn install the new layout into the system. >>> I have no idea why things are so complicated. Still, programming in C is >>> a lot nicer when things like quote marks work than when they don't. >> >> On Linux there is the loadkeys(1) utility which will install a key map >> from a file, and I have previously (long ago) used it to set up >> function-key macros, but it is a pain in the ass and admittedly not >> terribly prominent on my sysadmin to-do list. I have a hard enough >> time remembering all of my customized editor keystrokes. >> >> OTOH, I am currently using gnome-terminal and it happily accepts >> unicode numbers (similar to the DOS feature, some CTRL-SHIFT-ALT >> numberpad arrangement which escapes me at the moment) > > It's Shift+Ctrl+u and then hex digits. It works in almost all programs > on my system (including Emacs, though Emacs has it's own, often simpler > methods). That's completely new to me. I think I would still be inclined to use "character map" for anything that could not be done with the dead keys, alt-gr combinations, or compose key on my keyboard. > >> and I might be >> moderately content with a crib sheet for those occasions where I need >> those extra symbols. 
> > The only key map I modify is to make Insert into Compose so I can get é, > ç, ¾ and so on with memorable keys (Compose ' e, Compose , c, Compose 3 > 4 and so on). > I use "scroll lock" for my compose key. But most non-ASCII characters that I need are accessible more directly on my keyboard - I don't know if that's because it is a Norwegian layout rather than an English layout. (Obviously the Norwegian letters åøæ are more easily accessible, but I am thinking of things like µ π ² ° § ½ ¼ ×.)
[toc] | [prev] | [next] | [standalone]
| From | Steve Thompson <stevet810@gmail.com> |
|---|---|
| Date | 2015-12-06 06:28 +0000 |
| Message-ID | <3Nz2vb.7Mm.SIGAI@gmail.com> |
| In reply to | #77871 |
On Sat, Dec 05, 2015 at 01:59:34AM +0000, Ben Bacarisse wrote: > Steve Thompson <stevet810@gmail.com> writes: > > > On Fri, Dec 04, 2015 at 11:49:00AM -0800, supercat@casperkitty.com wrote: > >> On Friday, December 4, 2015 at 1:32:22 PM UTC-6, Steve Thompson wrote: > >> > I don't know about most people, but I am often annoyed that my > >> > >> On my own machine I've fixed the problem using a free keyboard-layout > >> utility published by MS. For some weird bizarre reason, though, it's not > >> possible to simply tell Windows "Here's a text file holding my keyboard > >> layout", nor can one simply use the utility to take the text file and > >> install it into the system. Instead, one must use the utility to generate > >> an application which will in turn install the new layout into the system. > >> I have no idea why things are so complicated. Still, programming in C is > >> a lot nicer when things like quote marks work than when they don't. > > > > On Linux there is the loadkeys(1) utility which will install a key map > > from a file, and I have previously (long ago) used it to set up > > function-key macros, but it is a pain in the ass and admittedly not > > terribly prominent on my sysadmin to-do list. I have a hard enough > > time remembering all of my customized editor keystrokes. > > > > OTOH, I am currently using gnome-terminal and it happily accepts > > unicode numbers (similar to the DOS feature, some CTRL-SHIFT-ALT > > numberpad arrangement which escapes me at the moment) > > It's Shift+Ctrl+u and then hex digits. It works in almost all programs > on my system (including Emacs, though Emacs has it's own, often simpler > methods). That's the one. I use a much, much smaller clone of Emacs called jove which lacks a built-in programming language and does not support unicode directly. > > and I might be > > moderately content with a crib sheet for those occasions where I need > > those extra symbols. 
> > The only key map I modify is to make Insert into Compose so I can get é, > ç, ¾ and so on with memorable keys (Compose ' e, Compose , c, Compose 3 > 4 and so on). My laptop has a "Windows" key which I might repurpose but it is not much of a priority at the moment. Regards, Steve Thompson -- "If I had a nickel for every time some idiot called me about a computer problem that turned out to be user error, I would be able to retire and spend the rest of my days cultivating clues in my backyard hillside garden." -- MysteryDog in 24hoursupport.helpdesk.
[toc] | [prev] | [next] | [standalone]
| From | BartC <bc@freeuk.com> |
|---|---|
| Date | 2015-12-04 23:46 +0000 |
| Message-ID | <n3t8h6$3ip$1@dont-email.me> |
| In reply to | #77847 |
On 04/12/2015 19:17, Steve Thompson wrote: > On Fri, Dec 04, 2015 at 01:22:04PM +0000, BartC wrote: >> So that is something about Unicode I'm not comfortable with. Our nice >> tidy little alphabet (perhaps one of the reasons the West has been ahead >> technologically) is swamped by these huge character sets from around the >> world, which still don't like being marshalled into neat little units. > > The West? Are you forgetting the Europe is also part of "the West"? No. But western Europe at least still uses small alphabets, and mostly they are based around A-Z. > The technological lead of the West is another matter, and I am sorry > if you are inconvenienced by the catch-up game underway in other parts > of the world. Greek, APL, formal logic, mathematics, etc. are all > sufficiently pervasive that their symbols merit inclusion in any > reasonable general-use character set, and on that basis any fixation > on English is bound to be terribly short-sighted. Fine, then we move to 16 bits, which had long been anticipated anyway, and gives us plenty of room for special symbols. But not if we have to throw in every single alphabet and writing system that anybody has ever heard of (and apparently plenty that no one has heard of!). (And then you have vast, sprawling 'alphabets' like Chinese which are words rather than the letters used to build the words.) It just sounds 'off'. It reminds me of those early 'text-mode' displays where, instead of having proper pixel-graphics, some character codes were set aside to display a limited range of pre-determined patterns. To be able to display any arbitrary pattern, you need pixel-addressable graphics. So we really want a more flexible way of specifying any character or symbol without just enumerating every single one we can think of. (Imagine you were in the position of creating a new font, with hundreds of thousands of characters to design! I've done that, but for only 100 characters.) > Again which languages?
> Software I use would be prudent to include the > capacity to render English, French, German, Swedish (Scandinavian > language generally), Greek, Latin, What's special about Latin? > as well as the characters > appropriate to mathematics symbols and so on. And mathematics really requires control over layout. You will probably end up representing formulae in some sort of mark-up language anyway, or you will be writing them using a special editor that might store content in some binary format; whether it uses Unicode is then irrelevant. (Actually I've tried using the correct mathematical symbols within programming language syntax, such as × for multiply and ² for squared (y = x²). But it looked too gimmicky, as well as being fiddly to type in.) -- Bartc
[toc] | [prev] | [next] | [standalone]
| From | Steve Thompson <stevet810@gmail.com> |
|---|---|
| Date | 2015-12-05 01:04 +0000 |
| Message-ID | <wFe7nL.cjz.nHu02@gmail.com> |
| In reply to | #77857 |
On Fri, Dec 04, 2015 at 11:46:52PM +0000, BartC wrote: > On 04/12/2015 19:17, Steve Thompson wrote: > >On Fri, Dec 04, 2015 at 01:22:04PM +0000, BartC wrote: > > >>So that is something about Unicode I'm not comfortable with. Our nice > >>tidy little alphabet (perhaps one of the reasons the West has been ahead > >>technologically) is swamped by these huge character sets from around the > >>world, which still don't like being marshalled into neat little units. > > > >The West? Are you forgetting the Europe is also part of "the West"? > > No. But western Europe at least still uses small alphabets, and mostly > they are based around A-Z. Nitpick. Once the major European languages are included from Spanish to Finnish and everything in-between, how many code points are left? > >The technological lead of the West is another matter, and I am sorry > >if you are inconvenienced by the catch-up game underway in other parts > >of the world. Greek, APL, formal logic, mathematics, etc. are all > >sufficiently pervasive that their symbols merit inclusion in any > >reasonable general-use character set, and on that basis any fixation > >on English is bound to be terribly short-sighted. > > Fine, then we move to 16 bits, which had long been anticipated anyway, > and gives us plenty of room for special symbols. But not if we have to > throw in every single alphabet and writing system that anybody has ever > heard of (and apparently plenty that no one has heard of!). I rather suspect the Anthropologists will scream bloody murder if Egyptian hieroglyphics, Linear B, and all the rest are excluded. > (And then you have vast, sprawling 'alphabets' like Chinese which are > words rather than the letters used to build the words.) So go tell the Chinese (and Japanese, and Thais, and ...) that they should man-up and use a Western alphabet. Such schemes exist, after all. > It just sounds 'off'. 
It reminds me of those early 'text-mode' displays > where, instead of having proper pixel-graphics, some character codes > were set aside to display a limited range of pre-determined patterns. > > To be able to display any arbitrary pattern, you need pixel-addressable > graphics. > > So we really want a more flexible of specifying any character or symbol > without just enumerating every single one can think of. > > (Imagine you were in the position of creating a new font, with a > hundreds of thousands of to design! I've done that, but for only 100 > characters.) The font weenies will probably figure something out. This is not my concern. Publishers have already invested in the languages they print. > >Again which languages? Software I use would be prudent to include the > >capacity to render English, French, German, Swedish (Scandinavian > >language generally), Greek, Latin, > > What's special about Latin? Bad example. Perhaps Russian is a better choice; I hear it is a great language for cursing, comrade. And ignoring prose for the moment, should people's very names not be representable in their canonical form? > as well as the characters > >appropriate to mathematics symbols and so on. > > And mathematics really requires control over layout. You will probably > end up representing formulae in some sort of mark-up language anyway, or > you will be writing them using a special editor that might store content > in some binary format; whether it uses Unicode is then irrelevant. > > (Actually I've tried using the correct mathematical symbols within > programming language syntax, such as × for multiply and ² for squared (y > = x²). But it looked too gimmicky, as well as being fiddly to type in.) Mathematics is a good example, nonetheless. Then we have physics, all the unit symbols (degrees Centigrade, ohms, Angstroms, and on and on and on), and, and and and. 
Without complete coverage standards bodies and software houses are put in the position of picking and choosing the winners and losers. Formula markup is a problem as well, but distinct from representing glyphs, and if you're going to start down that road we can include diagrams and graphs as well as a supplemental requirement for representing certain classes of idea. The general problem at hand is the representation of written communications, which arguably includes non-textual forms like napkin scribbles and the like. Unicode doesn't do anything to help represent freehand drawing or cave paintings, but so what. The line must be drawn somewhere, and I think it is unreasonable to exclude NZ Maori script merely because so few people actually use it. Regards, Steve Thompson -- "If I had a nickel for every time some idiot called me about a computer problem that turned out to be user error, I would be able to retire and spend the rest of my days cultivating clues in my backyard hillside garden." -- MysteryDog in 24hoursupport.helpdesk.
[toc] | [prev] | [next] | [standalone]
| From | Malcolm McLean <malcolm.mclean5@btinternet.com> |
|---|---|
| Date | 2015-12-05 03:21 -0800 |
| Message-ID | <622998c5-112a-4334-a76d-05f563b3fb26@googlegroups.com> |
| In reply to | #77867 |
On Saturday, December 5, 2015 at 1:32:42 AM UTC, Steve Thompson wrote: > > I rather suspect the Anthropologists will scream bloody murder if > Egyptian hieroglyphics, Linear B, and all the rest are excluded. > Not really. You've got marginal, poorly documented scripts, and if you happen to be working with one you don't really expect to be able to fire up a word processor and type in the symbols. However there are plenty of spare code points, and supporting oddballs is rather fun. However you need to be able to support every language that a "consumer", which I would define as someone who is neither interested in programming nor in the language itself except as a way of expressing something, might want to have. > > > (Imagine you were in the position of creating a new font, with a > > hundreds of thousands of to design! I've done that, but for only 100 > > characters.) > > The font weenies will probably figure something out. This is not my > concern. Publishers have already invested in the languages they > print. > That's a serious issue. The free software foundation is trying to put together a full unicode font, but it's a massive undertaking, when I looked at it for Baby X it wasn't yet completed. Then even if you allow the user to load a unicode-keyed font (the route Baby X takes), you still haven't supported every language because of the layout rules. Where does the boundary between character representation and layout markup lie? Eg is x squared eks, two, and marked up as superscript, or is it eks, superscript_ two. What about root x + 1? Is that root, eks, plus, one, or openroot, eks, plus, one, closeroot ? The question is whether something that is largely functional but a bit buggy when you start stressing it with adventures into pointed Hebrew and the like is better or worse than something which works perfectly but is more limited. 
Most professionals prefer the latter, business life is all about presenting an image, not about offering the customer functionality to meet his needs.
[toc] | [prev] | [next] | [standalone]
| From | Stephen Sprunk <stephen@sprunk.org> |
|---|---|
| Date | 2015-12-05 13:03 -0600 |
| Message-ID | <n3vc93$939$1@dont-email.me> |
| In reply to | #77878 |
On 05-Dec-15 05:21, Malcolm McLean wrote: > Steve Thompson wrote: >> I rather suspect the Anthropologists will scream bloody murder if >> Egyptian hieroglyphics, Linear B, and all the rest are excluded. >> > Not really. You've got marginal, poorly documented scripts, and if > you happen to be working with one you don't really expect to be able > to fire up a word processor and type in the symbols. However there > are plenty of spare code points, and supporting oddballs is rather > fun. However you need to be able to support every language that a > "consumer", which I would define as someone who is neither > interested in programming nor in the language itself except as a way > of expressing something, might want to have. Indeed, and going from UCS-2 to UCS-4 gave us so much code space that there's no good reason _not_ to assign a few code points to scripts like Klingon. OTOH, despite being a fictional language, there _are_ more Klingon speakers now than many natural languages have left, so perhaps that isn't as silly as it seems. Of the ~6000 languages today, only _half_ are being taught to children (probably the greatest ethnocide in history), and it seems like every day there's a story about how the "last living speaker" of another one has died--and anthropologists and linguists are scrambling to capture as much data about the remaining ones as they can while they can. On the plus side, at this rate we'll be able to rebuild the Tower of Babel within a few centuries. >>> (Imagine you were in the position of creating a new font, with a >>> hundreds of thousands of to design! I've done that, but for only >>> 100 characters.) >> >> The font weenies will probably figure something out. This is not >> my concern. Publishers have already invested in the languages >> they print. >> > That's a serious issue. The free software foundation is trying to put > together a full unicode font, but it's a massive undertaking, when I > looked at it for Baby X it wasn't yet completed.
Bah. Just do what other modern GUIs do: if the selected font doesn't have the glyph you need, check all the other installed fonts. You need several dozen script-specific fonts to cover every assigned code point, but it's a lot easier than trying to build one enormous font. > The question is whether something that is largely functional but a > bit buggy when you start stressing it with adventures into pointed > Hebrew and the like is better or worse than something which works > perfectly but is more limited. Most professionals prefer the latter, > business life is all about presenting an image, not about offering > the customer functionality to meet his needs. In some problem domains, no answer is better than a wrong answer, and in others, it's the opposite. And sometimes, getting a wrong answer quickly is better than getting a correct one slowly. Text layout, though, seems to be one of the areas where folks (all of them, not just businesses) do demand perfection, and I'd agree that's at least partly about image, but it's also about how it's such a simple problem for humans that anyone who gets it wrong looks like an idiot, even though it's actually a difficult problem for computers. S -- Stephen Sprunk, CCIE #3723, K5SSS "God does not play dice." --Albert Einstein "God is an inveterate gambler, and He throws the dice at every possible opportunity." --Stephen Hawking
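The font-fallback approach described above ("if the selected font doesn't have the glyph you need, check all the other installed fonts") might be sketched in C roughly as follows. This is a simplified illustration under stated assumptions: `struct font`, `struct cp_range`, `font_has_glyph` and `pick_font` are hypothetical names, and a font's coverage is modelled as a plain list of code point ranges rather than a real font file's character map.

```c
#include <stddef.h>

/* Hypothetical coverage model: each font lists the inclusive
   code point ranges for which it has glyphs. */
struct cp_range {
    unsigned long first;
    unsigned long last;
};

struct font {
    const char *name;
    const struct cp_range *ranges;
    size_t nranges;
};

/* Returns 1 if the font covers cp, 0 otherwise. */
int font_has_glyph(const struct font *f, unsigned long cp)
{
    size_t i;
    for (i = 0; i < f->nranges; i++) {
        if (cp >= f->ranges[i].first && cp <= f->ranges[i].last)
            return 1;
    }
    return 0;
}

/* Try the user's selected font first, then every installed font in
   order. Returns NULL if no font covers cp, in which case a GUI would
   usually draw a placeholder box. */
const struct font *pick_font(const struct font *selected,
                             const struct font *installed,
                             size_t ninstalled,
                             unsigned long cp)
{
    size_t i;

    if (selected != NULL && font_has_glyph(selected, cp))
        return selected;
    for (i = 0; i < ninstalled; i++) {
        if (font_has_glyph(&installed[i], cp))
            return &installed[i];
    }
    return NULL;
}
```

Note that this only decides which font supplies a glyph for one code point; it does nothing about the layout rules (pointed Hebrew and the like) raised in the quoted text.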
[toc] | [prev] | [next] | [standalone]