Re: Efficient unicode string implementation was: Re: Why No Supplemental Characters In Character Literals?

From Joshua Cranmer <Pidgeot18@verizon.invalid>
Newsgroups comp.lang.java.programmer
Subject Re: Efficient unicode string implementation was: Re: Why No Supplemental Characters In Character Literals?
Date 2011-02-04 17:28 -0500
Organization A noiseless patient Spider
Message-ID <iihuij$1ar$1@news.eternal-september.org> (permalink)
References (2 earlier) <iig84e$uqu$1@lust.ihug.co.nz> <4d4c2019$0$23753$14726298@news.sunsite.dk> <iihbuo$cqo$1@news.eternal-september.org> <iihhdo$emc$1@news.eternal-september.org> <alpine.DEB.1.10.1102042036190.11442@urchin.earth.li>

On 02/04/2011 04:30 PM, Tom Anderson wrote:
> A question to the house, then: has anyone ever invented a data structure
> for strings which allows space-efficient storage for strings in
> different scripts, but also allows time-efficient implementation of the
> common string operations?

I think the real answer is that maybe we need to rethink traditional
string APIs. In particular, we have the issue of diacritics, since "A
[combining diacritic `]" is basically 1 character stored in 3, 4, or 8
bytes, depending on the storage format (UTF-8, UTF-16, or UTF-32).
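
To make the byte counts concrete, here is a minimal sketch, assuming the
diacritic in question is U+0300 (COMBINING GRAVE ACCENT). The class and
method names are made up for illustration, and the UTF-32 size is
computed as 4 bytes per code point rather than through an encoder.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: sizes of "A" + U+0300 in three storage formats.
class CombiningByteCounts {
    // Encoded size in UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Encoded size in UTF-16; the BE variant is used so no byte-order
    // mark is added to the count.
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    // UTF-32 always uses 4 bytes per code point, so count code points.
    static int utf32Bytes(String s) {
        return s.codePointCount(0, s.length()) * 4;
    }

    public static void main(String[] args) {
        String s = "A\u0300"; // one user-perceived character, two code points
        System.out.println("UTF-8:  " + utf8Bytes(s));   // 3
        System.out.println("UTF-16: " + utf16Bytes(s));  // 4
        System.out.println("UTF-32: " + utf32Bytes(s));  // 8
        System.out.println("length: " + s.length());     // 2 chars
    }
}
```

Note that String.length() reports 2 here, so even the existing API does
not agree with the reader's idea of "1 character".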

I would be surprised if there weren't already some studies on the impact 
of using UTF-8 based strings in UTF-16/-32-ish contexts.

-- 
Beware of bugs in the above code; I have only proved it correct, not 
tried it. -- Donald E. Knuth
