From: "Mike Schilling"
Newsgroups: comp.lang.java.programmer
Subject: Re: Why No Supplemental Characters In Character Literals?
Date: Fri, 4 Feb 2011 16:37:53 -0800

"Joshua Cranmer" wrote in message news:iii493$nn8$1@news.eternal-september.org...
> On 02/04/2011 05:26 PM, Lawrence D'Oliveiro wrote:
>> In message, Mike Schilling wrote:
>>
>>> Yes, it does (contain 16 bits.)
>>
>> Yeah, I didn’t realize it was spelled out that way in the original
>> language spec. What a short-sighted decision.
>
> It would have been stupider to have not specified a guaranteed size for
> char. Take C (+ POSIX), where the definitions of sizes are very loosely
> defined, and you very quickly get non-portable code. Yes, you can in
> theory change the size of, say, time_t independently of other types, but
> it doesn't do you much good if half the C code assumes sizeof(time_t) ==
> sizeof(int). Pinning down the sizes of the types was a _very good_ move on
> Java's part.
>
>> Why was there a need to define the size of a character at all?
>> Even in the early days of the unification of Unicode and ISO-10646,
>> there was already provision for UCS-4. Did they really think that could
>> safely be ignored?
>
> Knowing the results of other properly Unicode-aware code in the first days
> of Unicode, I believe that Unicode quite heavily gave an impression of
> "Unicode == 16 bit". Java is not the only major platform to be bitten by
> now-Unicode-is-32-bits... the Windows platform has 16-bit characters
> embedded into it.

.NET, which in several cases took advantage of following Java to correct
some of its mistakes (e.g. signed bytes), didn’t fix this one.
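
For anyone who wants to see what the fixed 16-bit char means in practice, here is a rough sketch (not from the thread; the class and method names are made up for illustration). It assumes Java 5 or later, where the code point methods exist, and uses U+1D11E (MUSICAL SYMBOL G CLEF) as an arbitrary character outside the 16-bit range. It also shows the signed-byte behaviour mentioned above.

```java
// Hypothetical demo class; nothing here comes from the original posts.
class CharWidthDemo {
    // U+1D11E MUSICAL SYMBOL G CLEF, a supplementary character (above U+FFFF).
    static final int G_CLEF = 0x1D11E;

    // Number of 16-bit char units needed to hold one code point.
    static int unitsFor(int codePoint) {
        return Character.toChars(codePoint).length;
    }

    // Java's byte is signed, so mask with 0xFF to read it as 0..255.
    static int unsignedValue(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        String clef = new String(Character.toChars(G_CLEF));

        System.out.println(Character.SIZE);                       // bits in a char
        System.out.println(clef.length());                        // char units
        System.out.println(clef.codePointCount(0, clef.length())); // code points
        System.out.println(Character.isHighSurrogate(clef.charAt(0)));
        System.out.println(Character.isLowSurrogate(clef.charAt(1)));

        byte b = (byte) 0xFF;
        System.out.println(b);                // signed view
        System.out.println(unsignedValue(b)); // unsigned view
    }
}
```

The one code point occupies two char values (a surrogate pair), which is why a single character literal, being one char, cannot hold it.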