From: Ken Wesson <kwesson@gmail.com>
Subject: Re: Why No Supplemental Characters In Character Literals?
Newsgroups: comp.lang.java.programmer
Message-ID: <4d4cc253$1@news.x-privat.org>
Date: 5 Feb 2011 04:21:55 +0100

On Fri, 04 Feb 2011 17:02:37 -0800, Mike Schilling wrote:

> "Arne Vajhøj" <arne@vajhoej.dk> wrote in message
> news:4d4ca055$0$23765$14726298@news.sunsite.dk...
>> On 04-02-2011 19:37, Mike Schilling wrote:
>>> "Joshua Cranmer" <Pidgeot18@verizon.invalid> wrote in message
>>> news:iii493$nn8$1@news.eternal-september.org...
>>>> On 02/04/2011 05:26 PM, Lawrence D'Oliveiro wrote:
>>>>> In message <iigcva$90q$1@news.eternal-september.org>, Mike Schilling
>>>>> wrote:
>>>>>
>>>>>> Yes, it does (contain 16 bits.)
>>>>>
>>>>> Yeah, I didn’t realize it was spelled out that way in the original
>>>>> language spec. What a short-sighted decision.
>>>>
>>>> It would have been stupider to have not specified a guaranteed size
>>>> for char. Take C (+ POSIX), where the definitions of sizes are very
>>>> loosely defined, and you very quickly get non-portable code. Yes, you
>>>> can in theory change the size of, say, time_t independently of other
>>>> types, but it doesn't do you much good if half the C code assumes
>>>> sizeof(time_t) == sizeof(int). Pinning down the sizes of the types
>>>> was a _very good_ move on Java's part.
>>>>
>>>>> Why was there a need to define the size of a character at all? Even
>>>>> in the early days of the unification of Unicode and ISO-10646, there
>>>>> was already provision for UCS-4. Did they really think that could
>>>>> safely be ignored?
>>>>
>>>> Knowing the results of other properly Unicode-aware code in the first
>>>> days of Unicode, I believe that Unicode quite heavily gave an
>>>> impression of "Unicode == 16 bit". Java is not the only major
>>>> platform to be bitten by now-Unicode-is-32-bits... the Windows
>>>> platform has 16-bit characters embedded into it.
>>>
>>> .NET, which in several cases took advantage of following Java to
>>> correct some of its mistakes (e.g. signed bytes), didn’t fix this one.
>>
>> Which is a bit surprising since high code points were introduced when
>> .NET came around.
>>
>> But they probably had a compatibility issue with p/Invoke and Win32
>> API, COM interop, C++ mixed mode etc. that all had to work with
>> existing Win32 model of 16 bit wchars.
> 
> Or, relentless micro-optimizers that they are, Microsoft wasn't willing
> to bite off the size/performance issues.

Relentless micro-optimizers of what, their cashflow?
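
For anyone following the 16-bit point upthread, here is a minimal sketch of
what it means in practice. The class name SupplementaryDemo, its helper
methods, and the choice of U+1D11E as the sample code point are my own
illustrative assumptions, not anything from the earlier posts. It only uses
java.lang.Character and String methods that have been there since Java 5.

```java
class SupplementaryDemo {
    // U+1D11E MUSICAL SYMBOL G CLEF: a supplementary code point (above U+FFFF),
    // picked here purely as an example.
    static final int G_CLEF = 0x1D11E;

    // Number of 16-bit char units needed to represent one code point.
    static int unitsFor(int codePoint) {
        return Character.charCount(codePoint);
    }

    // Build a String from a single code point; a supplementary code point
    // is stored as a surrogate pair of two chars.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String s = fromCodePoint(G_CLEF);
        System.out.println(Character.SIZE);                          // 16
        System.out.println(unitsFor('A'));                           // 1
        System.out.println(s.length());                              // 2
        System.out.println(s.codePointCount(0, s.length()));         // 1
        System.out.println(Character.isHighSurrogate(s.charAt(0)));  // true
        System.out.println(Integer.toHexString(s.codePointAt(0)));   // 1d11e
    }
}
```

A single char can hold only one half of that pair, which is why such a
code point cannot be written as one character literal and has to live in
a String or an int instead.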