Path: csiph.com!eternal-september.org!feeder.eternal-september.org!mx02.eternal-september.org!.POSTED!not-for-mail
From: Keith Thompson <kst-u@mib.org>
Newsgroups: comp.lang.c
Subject: Re: unicode is a fail
Date: Sat, 05 Dec 2015 16:32:22 -0800
Organization: None to speak of
Lines: 47
Message-ID: <lnwpsspgbt.fsf@kst-u.example.com>
References: <fbcae10f-7fc6-4a1e-90d7-ea4925016e47@googlegroups.com> <Lm9Wic.bdh.YZJEv@gmail.com> <n3o36b$ud0$1@dont-email.me> <2qyvC0.96Q.SQT8q@gmail.com> <n3s3tj$8qe$1@dont-email.me> <y51yVe.p8Y.TmUhC@gmail.com> <n3t8h6$3ip$1@dont-email.me> <wFe7nL.cjz.nHu02@gmail.com> <n3uiop$p98$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: mx02.eternal-september.org; posting-host="945944de09706c9b4e29b53c9d2efdc2"; logging-data="21517"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX189ywIfVty5p6WJct7eI9Zj"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux)
Cancel-Lock: sha1:PzowSFGkyu9doQwr7jJA1GoY3uw= sha1:rICNLZl1FOoJE7JMvG+sZfA+0Fg=
Xref: csiph.com comp.lang.c:77929

BartC <bc@freeuk.com> writes:
> On 05/12/2015 01:04, Steve Thompson wrote:
>> On Fri, Dec 04, 2015 at 11:46:52PM +0000, BartC wrote:
[...]
>>> (And then you have vast, sprawling 'alphabets' like Chinese which are
>>> words rather than the letters used to build the words.)
>>
>> So go tell the Chinese (and Japanese, and Thais, and ...) that they
>> should man-up and use a Western alphabet.  Such schemes exist, after
>> all.
>
> No, they can use the same alphabets, but they don't put them all into 
> one giant melting pot with every other.

So you want users of Asian writing systems to use their own separate
character set encodings, incompatible with the encodings used in
Western countries.

Because that way it's more convenient for you.

Sorry, but the decision has already been made.  Unicode combines
most of the world's character sets into a single standard, and that's
not going to change.  Complain all you like (preferably elsewhere);
it's not going to make any difference.

No doubt you have some ideas for how HTML web pages can include
both ASCII-encoded tag names and Chinese characters.  Which means
there has to be a way to combine Latin and Chinese characters in
a single document anyway.

> Now, I can no longer write what had been trivial string handling
> routines such as capitalise, toupper, reverse, compare, left, leftn, 
> etc etc. All are very well defined in ASCII, but would no longer be 
> guaranteed to work with Unicode because most of the alphabets are so weird.

Too bad.  The "giant melting pot" you worry about already exists, and is
used for most text transmitted over the Internet.

If you want to write software that only deals with ASCII, you're
absolutely free to do so, and you can do as much trivial string
handling as you like.
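
As a rough sketch of what that might look like in C: the names
is_ascii, ascii_upper, and ascii_reverse below are made up for
illustration, not taken from any library.  The assumption is that the
execution character set is ASCII-compatible and that any byte above
0x7F means "not my problem", so the routines refuse the input rather
than silently mangle a multibyte encoding such as UTF-8.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if every byte of s is 7-bit ASCII. */
bool is_ascii(const char *s)
{
    for (; *s != '\0'; s++) {
        if ((unsigned char)*s > 0x7F)
            return false;
    }
    return true;
}

/* Uppercase s in place.  Returns false and leaves s unchanged
   if s contains any non-ASCII byte.  Assumes 'a'..'z' are
   contiguous, which holds for ASCII. */
bool ascii_upper(char *s)
{
    if (!is_ascii(s))
        return false;
    for (; *s != '\0'; s++) {
        if (*s >= 'a' && *s <= 'z')
            *s = (char)(*s - 'a' + 'A');
    }
    return true;
}

/* Reverse s in place, byte by byte.  Returns false and leaves s
   unchanged if s contains any non-ASCII byte, since reversing the
   bytes of a multibyte sequence would corrupt it. */
bool ascii_reverse(char *s)
{
    if (!is_ascii(s))
        return false;
    size_t len = strlen(s);
    for (size_t i = 0; i < len / 2; i++) {
        char tmp = s[i];
        s[i] = s[len - 1 - i];
        s[len - 1 - i] = tmp;
    }
    return true;
}
```

Called on a modifiable array such as char buf[] = "hello", both
functions return true and modify buf in place; called on a string
holding UTF-8 bytes for a non-ASCII character, they return false and
leave it alone.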

-- 
Keith Thompson (The_Other_Keith) kst-u@mib.org  <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"