Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
| From | Steve Thompson <stevet810@gmail.com> |
|---|---|
| Newsgroups | comp.lang.c |
| Subject | Re: unicode is a fail |
| Date | 2015-12-06 07:34 +0000 |
| Organization | Friends of the Galactic Collective |
| Message-ID | <g7DsLI.43F.mQuAF@gmail.com> (permalink) |
| References | (4 earlier) <n3s3tj$8qe$1@dont-email.me> <y51yVe.p8Y.TmUhC@gmail.com> <n3t8h6$3ip$1@dont-email.me> <wFe7nL.cjz.nHu02@gmail.com> <n3uiop$p98$1@dont-email.me> |
On Sat, Dec 05, 2015 at 11:47:45AM +0000, BartC wrote:
> On 05/12/2015 01:04, Steve Thompson wrote:
> >On Fri, Dec 04, 2015 at 11:46:52PM +0000, BartC wrote:
>
> >>Fine, then we move to 16 bits, which had long been anticipated anyway,
> >>and gives us plenty of room for special symbols. But not if we have to
> >>throw in every single alphabet and writing system that anybody has ever
> >>heard of (and apparently plenty that no one has heard of!).
> >
> >I rather suspect the Anthropologists will scream bloody murder if
> >Egyptian hieroglyphics, Linear B, and all the rest are excluded.
>
> They probably wouldn't notice. Whatever software they use to enter and
> display the characters would still work if a different encoding scheme
> was used.
>
> Or many might prefer just using mark-up to describe it:
> {snake}{bird}{water}.
It seems to me that the code positions for those two languages are
already assigned.
> >>(And then you have vast, sprawling 'alphabets' like Chinese which are
> >>words rather than the letters used to build the words.)
> >
> >So go tell the Chinese (and Japanese, and Thais, and ...) that they
> >should man-up and use a Western alphabet. Such schemes exist, after
> >all.
>
> No, they can use the same alphabets, but they don't put them all into
> one giant melting pot with every other.
>
> Now, I can now longer write what had been trivial string handling
> routines such as capitalise, toupper, reverse, compare, left, leftn,
> etc etc. All are very well defined in ASCII, but would no longer be
> guaranteed to work with Unicode because most of the alphabets are so weird.
I'm not sure what to say. As others have pointed out (or suggested)
the complexity of language conventions is a product of undirected
evolution throughout history. It may be a mess, but nevertheless it
has to be dealt with.
Sorting in particular is a problem if one requires case insensitivity.
I suppose the only solution is a good set of per-language tables which
can be put in arrays for quick access. The combining characters are
another problem.
>From the "unicode" man-page on my system:
Implementation Levels
As not all systems are expected to support advanced mechanisms like
combining characters, ISO 10646-1 specifies the following three
implementation levels of UCS:
Level 1 Combining characters and Hangul Jamo (a variant encoding of
the Korean script, where a Hangul syllable glyph is coded as a
triplet or pair of vovel/consonant codes) are not supported.
Level 2 In addition to level 1, combining characters are now
allowed for some languages where they are essential (e.g., Thai,
Lao, Hebrew, Arabic, Devanagari, Malayalam).
Level 3 All UCS characters are supported.
The Unicode 3.0 Standard published by the Unicode Consortium
contains exactly the UCS Basic Multilingual Plane at implementation
level 3, as described in ISO 10646-1:2000. Unicode 3.1 added the
supplemental planes of ISO 10646-2. The Unicode standard and
technical reports published by the Unicode Consortium provide much
additional information on the semantics and recommended usages of
various characters. They provide guidelines and algorithms for
editing, sorting, comparing, normalizing, converting and displaying
Unicode strings.
I wonder what their algorithm hints are. Unfortunately something I
just don't have time to treat in depth at the moment.
Regards,
Steve Thompson
--
"If I had a nickel for every time some idiot called me about a
computer problem that turned out to be user error, I would be able to
retire and spend the rest of my days cultivating clues in my backyard
hillside garden." -- MysteryDog in 24hoursupport.helpdesk.
Back to comp.lang.c | Previous | Next — Previous in thread | Next in thread | Find similar | Unroll thread
unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-02 08:01 -0800
Re: unicode is a fail me <self@example.org> - 2015-12-02 16:12 +0000
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-02 09:09 -0800
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-02 08:18 -0800
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-02 09:07 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 11:21 -0600
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-02 09:40 -0800
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 11:22 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 15:59 -0600
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 16:25 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 19:47 -0600
Re: unicode is a fail supercat@casperkitty.com - 2015-12-02 14:38 -0800
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 16:26 -0800
Re: unicode is a fail Tim Rentsch <txr@alumni.caltech.edu> - 2015-12-09 11:33 -0800
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-09 12:21 -0800
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 11:28 +0100
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 08:50 -0600
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 16:38 +0100
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 10:01 -0600
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-03 09:46 -0800
Re: unicode is a fail raltbos@xs4all.nl (Richard Bos) - 2015-12-04 12:39 +0000
Re: unicode is a fail supercat@casperkitty.com - 2015-12-03 08:26 -0800
Re: unicode is a fail glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-03 18:42 +0000
Re: unicode is a fail supercat@casperkitty.com - 2015-12-03 17:14 -0800
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 19:02 -0800
Re: unicode is a fail glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-04 06:35 +0000
Re: unicode is a fail David Thompson <dave.thompson2@verizon.net> - 2015-12-28 05:11 -0500
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 10:24 -0600
Re: unicode is a fail Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-03 22:37 +0000
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-04 11:32 +0100
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 11:10 -0600
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-02 09:24 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 13:10 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-02 19:45 +0000
Re: unicode is a fail Ian Collins <ian-news@hotmail.com> - 2015-12-03 09:08 +1300
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 14:10 -0600
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 11:27 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 15:21 -0600
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 15:18 -0800
Re: unicode is a fail raltbos@xs4all.nl (Richard Bos) - 2015-12-04 12:45 +0000
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 09:43 -0800
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-02 11:40 -0800
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-02 12:19 -0800
Re: unicode is a fail Nobody <nobody@nowhere.invalid> - 2015-12-02 21:23 +0000
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 10:12 +0100
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 02:13 -0800
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 14:11 +0100
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 05:17 -0800
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 15:33 +0100
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 07:05 -0800
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 16:42 +0100
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-03 07:58 -0800
Re: unicode is a fail Richard Heathfield <rjh@cpax.org.uk> - 2015-12-03 10:38 +0000
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-03 14:17 +0100
Re: unicode is a fail raltbos@xs4all.nl (Richard Bos) - 2015-12-04 12:54 +0000
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-04 14:25 +0100
Re: unicode is a fail Richard Heathfield <rjh@cpax.org.uk> - 2015-12-04 13:46 +0000
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-02 23:24 +0000
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-03 00:45 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-02 20:59 -0600
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-02 19:13 -0800
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-03 07:00 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-04 04:45 -0800
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-04 18:04 +0000
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-04 13:22 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-04 07:35 -0800
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-04 19:17 +0000
Re: unicode is a fail supercat@casperkitty.com - 2015-12-04 11:49 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 15:39 -0600
Re: unicode is a fail supercat@casperkitty.com - 2015-12-04 14:19 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-06 12:57 -0600
Re: unicode is a fail supercat@casperkitty.com - 2015-12-06 15:47 -0800
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-05 01:13 +0000
Re: unicode is a fail Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-05 01:59 +0000
Re: unicode is a fail David Brown <david.brown@hesbynett.no> - 2015-12-05 17:17 +0100
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-06 06:28 +0000
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-04 23:46 +0000
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-05 01:04 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 03:21 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-05 13:03 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-05 11:47 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 04:40 -0800
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-05 13:26 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-05 13:35 -0600
Re: unicode is a fail glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-06 02:23 +0000
Re: unicode is a fail Udyant Wig <udyantw@gmail.com> - 2015-12-06 16:09 +0530
Re: unicode is a fail Xavier <zaz.colmant@free.fr> - 2015-12-05 15:45 +0100
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 07:42 -0800
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-05 16:32 -0800
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 18:11 -0800
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-06 02:19 +0000
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-06 13:09 +0000
Re: unicode is a fail Martin Shobe <martin.shobe@yahoo.com> - 2015-12-06 18:38 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-07 01:55 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-06 19:14 -0800
Re: unicode is a fail Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-07 13:53 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-07 06:31 -0800
Re: unicode is a fail Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-07 21:22 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-07 15:34 -0600
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-07 16:36 -0800
Re: unicode is a fail Lowell Gilbert <lgusenet@be-well.ilk.org> - 2015-12-08 11:40 -0500
Re: unicode is a fail Ben Bacarisse <ben.usenet@bsb.me.uk> - 2015-12-08 17:18 +0000
Re: unicode is a fail "Osmium" <r124c4u102@comcast.net> - 2015-12-09 08:36 -0600
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-09 10:06 -0600
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-09 09:35 -0800
Re: unicode is a fail supercat@casperkitty.com - 2015-12-09 10:07 -0800
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-09 12:04 -0800
Re: unicode is a fail supercat@casperkitty.com - 2015-12-09 12:35 -0800
Re: unicode is a fail glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-09 23:46 +0000
Re: unicode is a fail supercat@casperkitty.com - 2015-12-09 16:15 -0800
Re: unicode is a fail glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2015-12-10 03:49 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-09 18:12 -0600
Re: unicode is a fail James Kuyper <jameskuyper@verizon.net> - 2015-12-09 13:12 -0500
Re: unicode is a fail Keith Thompson <kst-u@mib.org> - 2015-12-09 12:12 -0800
Re: unicode is a fail raltbos@xs4all.nl (Richard Bos) - 2015-12-10 20:48 +0000
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-09 23:44 +0000
Re: unicode is a fail Robert Wessel <robertwessel2@yahoo.com> - 2015-12-10 01:13 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-10 10:39 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-10 03:33 -0800
Re: unicode is a fail supercat@casperkitty.com - 2015-12-10 06:07 -0800
Re: unicode is a fail "Osmium" <r124c4u102@comcast.net> - 2015-12-10 08:21 -0600
Re: unicode is a fail Robert Wessel <robertwessel2@yahoo.com> - 2015-12-10 00:59 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-07 14:33 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-06 22:45 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-07 12:38 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-07 13:55 -0600
Re: unicode is a fail BartC <bc@freeuk.com> - 2015-12-07 21:14 +0000
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-07 16:50 -0600
Re: unicode is a fail Robert Wessel <robertwessel2@yahoo.com> - 2015-12-07 02:38 -0600
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-06 07:34 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-06 00:24 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-04 19:49 -0600
Re: unicode is a fail Richard Heathfield <rjh@cpax.org.uk> - 2015-12-05 21:32 +0000
Re: unicode is a fail Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 13:50 -0800
Re: unicode is a fail Richard Heathfield <rjh@cpax.org.uk> - 2015-12-05 22:15 +0000
Re: unicode is a fail James Kuyper <jameskuyper@verizon.net> - 2015-12-05 17:27 -0500
Re: unicode is a fail Richard Heathfield <rjh@cpax.org.uk> - 2015-12-05 23:06 +0000
Re: unicode is a fail James Kuyper <jameskuyper@verizon.net> - 2015-12-05 18:29 -0500
Re: unicode is a fail Richard Heathfield <rjh@cpax.org.uk> - 2015-12-05 23:50 +0000
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-06 06:38 +0000
Re: unicode is a fail raltbos@xs4all.nl (Richard Bos) - 2015-12-06 13:33 +0000
Re: unicode is a fail James Kuyper <jameskuyper@verizon.net> - 2015-12-05 16:51 -0500
Re: unicode is a fail Ian Collins <ian-news@hotmail.com> - 2015-12-06 10:59 +1300
Re: unicode is a fail Ian Collins <ian-news@hotmail.com> - 2015-12-06 11:00 +1300
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-06 06:31 +0000
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-02 17:48 -0800
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-03 01:20 -0800
Re: unicode is a fail fir <profesor.fir@gmail.com> - 2015-12-03 02:02 -0800
Re: unicode is a fail Stephen Sprunk <stephen@sprunk.org> - 2015-12-03 09:43 -0600
Re: unicode is a fail raltbos@xs4all.nl (Richard Bos) - 2015-12-04 12:55 +0000
Re: unicode is a fail Steve Thompson <stevet810@gmail.com> - 2015-12-04 18:29 +0000
Re: unicode is a fail Jorgen Grahn <grahn+nntp@snipabacken.se> - 2015-12-05 16:42 +0000
Re: unicode is a fail Jorgen Grahn <grahn+nntp@snipabacken.se> - 2015-12-05 10:06 +0000
OT: Usenet (Was: unicode is a fail) Steve Thompson <stevet810@gmail.com> - 2015-12-05 20:41 +0000
Re: OT: Usenet (Was: unicode is a fail) Malcolm McLean <malcolm.mclean5@btinternet.com> - 2015-12-05 13:18 -0800
Re: unicode is a fail Udyant Wig <udyantw@gmail.com> - 2015-12-06 10:21 +0530
OT: Facebook (was Re: unicode is a fail) Jorgen Grahn <grahn+nntp@snipabacken.se> - 2015-12-06 08:51 +0000
Re: OT: Facebook (was Re: unicode is a fail) raltbos@xs4all.nl (Richard Bos) - 2015-12-06 13:45 +0000
csiph-web