
Re: Why No Supplemental Characters In Character Literals?

Date	2011-02-04 18:10 -0500
From	Arne Vajhøj <arne@vajhoej.dk>
Newsgroups	comp.lang.java.programmer
Subject	Re: Why No Supplemental Characters In Character Literals?
References	<iig4k2$sus$1@lust.ihug.co.nz> <ejeok6d6v98ju1tpqt5cq3vhko813q4def@4ax.com>
Message-ID	<4d4c8761$0$23753$14726298@news.sunsite.dk>
Organization	SunSITE.dk - Supporting Open source


On 04-02-2011 13:26, Roedy Green wrote:
> On Fri, 04 Feb 2011 18:59:30 +1300, Lawrence D'Oliveiro
> <ldo@geek-central.gen.new_zealand>  wrote, quoted or indirectly quoted
> someone who said :
>> Why was it decreed in the language spec that characters beyond U+FFFF are
>> not allowed in character literals, when they are allowed everywhere else (in
>> string literals, in the program text, in character and string values etc)?
>
> because they did not exist at the time Java was invented.  extended
> literals were tacked on to the 16-bit internal scheme in a somewhat
> half-hearted way. to go to full 32-bit internally would gobble RAM
> hugely.
>
> Java does not have 32-bit String literals, like C-style code points,
> e.g. \U0001d504. Note the capital U vs the usual \ud504. I wrote the
> SurrogatePair applet (see
> http://mindprod.com/applet/surrogatepair.html)
> to convert C-style code points to arcane surrogate pairs to let you
> use 32-bit Unicode glyphs in your programs.
>
> Personally, I don’t see the point of any great rush to support 32-bit
> Unicode. The new symbols will be rarely used. Consider what’s there.
> The only ones I would conceivably use are musical symbols and
> Mathematical Alphanumeric symbols (especially the German black letters
> so favoured in real analysis). The rest I can’t imagine ever using
> unless I took up a career in anthropology, i.e. linear B syllabary (I
> have not a clue what it is), linear B ideograms (Looks like symbols
> for categorising cave petroglyphs), Aegean Numbers (counting with
> stones and sticks), Old Italic (looks like Phoenician), Gothic
> (medieval script), Ugaritic (cuneiform), Deseret (Mormon), Shavian
> (George Bernard Shaw’s phonetic script), Osmanya (Somalian), Cypriot
> syllabary, Byzantine music symbols (looks like Arabic), Musical
> Symbols, Tai Xuan Jing Symbols (truncated I-Ching), CJK
> extensions(Chinese Japanese Korean) and tags (letters with blank
> “price tags”).

Most Western people never use them.

But that does not mean much, as we got our stuff in the low code points.

The relevant question is whether Chinese/Japanese/Korean users need the
code points at or above 64K, i.e. those beyond U+FFFF.
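
For reference, a minimal sketch of why such a code point cannot fit in a
character literal: a Java char is a single 16-bit UTF-16 code unit, so
anything beyond U+FFFF takes two of them (a surrogate pair). The class
name SupplementaryDemo and the helper toEscapes are made up for this
illustration; Character.toChars and String.codePointCount are standard
library calls (available since Java 5). U+1D504 is the code point from
the quoted text.

```java
// Hypothetical demo class; the name and the helper are illustrative only.
class SupplementaryDemo {

    // Renders the UTF-16 code units of a code point as \\uXXXX escapes.
    static String toEscapes(int codePoint) {
        StringBuilder sb = new StringBuilder();
        // Character.toChars returns one char for BMP code points
        // and two chars (high + low surrogate) for supplementary ones.
        for (char c : Character.toChars(codePoint)) {
            sb.append(String.format("\\u%04X", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int fraktur = 0x1D504; // MATHEMATICAL FRAKTUR CAPITAL A
        String s = new String(Character.toChars(fraktur));

        System.out.println(toEscapes(fraktur));              // \uD835\uDD04
        System.out.println(s.length());                      // 2 (code units)
        System.out.println(s.codePointCount(0, s.length())); // 1 (code point)
    }
}
```

So the string literal "\uD835\uDD04" holds that one character, while no
single char value (and hence no character literal) can.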

Arne


