Re: UTF-8 and strings

From ruben safir <ruben@mrbrklyn.com>
Newsgroups comp.lang.c++
Subject Re: UTF-8 and strings
Date 2011-06-13 04:46 +0000
Organization PANIX Public Access Internet and UNIX, NYC
Message-ID <it44o2$ioc$6@reader1.panix.com>
References (1 earlier) <iso623$otf$2@dont-email.me> <319effb5-bf19-4c21-86e8-e4f1f4350dfc@p21g2000yqh.googlegroups.com> <isq5ne$4d6$1@dont-email.me> <isqii0$bam$1@dont-email.me> <0aef53d1-a8ce-49aa-b931-2fc3453e1bb6@f2g2000yqh.googlegroups.com>



On Sun, 12 Jun 2011 20:57:38 -0700, John M. Dlugosz wrote:

> On Jun 9, 8:41 am, "MikeP" <mp011...@some.org> wrote:
> 
> 
>> But if you KNOW that all you need is what's in the BMP, why not exploit
>> that, right?
> 
> Sure, the project is specified to be nationalized into 7 languages, and
> they all happen to be serviced by the Latin-1 character set.  So you
> decide to use 8-bit chars and assume the Windows program is running on a
> system that uses code page 1252 as the default for a process.
> 
> Then one day the boss comes in and says that the next version will be
> marketed to China as well.
> 
> It is my experience that software projects only get more complex over
> time.  Plan for it, unless you are planning to be unsuccessful.
> 
> —John



Strangely enough, this is a specific problem for a specific kind of app,
like a word processor.  IT people, even in China, learn rudimentary
English because scanf doesn't translate into Chinese wide chars.  If
computer science had originally been devised and discovered in China, it
would have been a bummer for an 8-bit processor with limited memory and
a limit of 256 chars.

 



