Re: hex dump w/ or w/out utf-8 chars

Newsgroups comp.lang.python
Date 2013-07-11 06:18 -0700
References <a35609c1-e56f-4180-8176-4405264da0a2@googlegroups.com> <7ef8c0e7-7f7c-4a22-89a9-50f62c4a8064@googlegroups.com> <mailman.4391.1373305945.3114.python-list@python.org>
Message-ID <a3a4aa9b-3a5c-42cd-9a04-4c02f962b71e@googlegroups.com>
Subject Re: hex dump w/ or w/out utf-8 chars
From wxjmfauth@gmail.com


On Monday, 8 July 2013 19:52:17 UTC+2, Chris Angelico wrote:
> On Tue, Jul 9, 2013 at 3:31 AM,  <ferdy.blatsco@gmail.com> wrote:
> > Unfortunately (as I probably told you before) I will never move to
> > Python 3...  Guido should not always listen only to gurus like him...
> > I don't like Python as before...starting from OOP and ending with codecs
> > like utf-8. Regarding OOP, much appreciated especially by experts, he
> > could use Python 2 for hiding the complexities of OOP (improving, as an
> > effect, object's code hiding) moving classes and objects to
> > imported methods, leaving in this way the programming style to the
> > well known old style: sequential programming and functions.
> > About utf-8... the same solution: keep utf-8 but for the non experts, add
> > methods to convert to solutions which use the range 128-255 of only one
> > byte (I do not give a damn about Chinese and "similia"!...)
> > I know that is a lost battle (in Italian "una battaglia persa")!
>
> Well, there won't be a Python 2.8, so you really should consider
> moving at some point. Python 3.3 is already way better than 2.7 in
> many ways, 3.4 will improve on 3.3, and the future is pretty clear.
> But nobody's forcing you, and 2.7.x will continue to get
> bugfix/security releases for a while. (Personally, I'd be happy if
> everyone moved off the 2.3/2.4 releases. It's not too hard supporting
> 2.6+ or 2.7+.)
>
> The thing is, you're thinking about UTF-8, but you should be thinking
> about Unicode. I recommend you read these articles:
>
> http://www.joelonsoftware.com/articles/Unicode.html
> http://unspecified.wordpress.com/2012/04/19/the-importance-of-language-level-abstract-unicode-strings/
>
> So long as you are thinking about different groups of characters as
> different, and wanting a solution that maps characters down into the
> <256 range, you will never be able to cleanly internationalize. With
> Python 3.3+, you can ignore the differences between ASCII, BMP, and
> SMP characters; they're all just "characters". Everything works
> perfectly with Unicode.

-----------

Just to stick with this funny character ẞ, a ucs-2 char
in the Flexible String Representation nomenclature.

It seems to me that, when one needs more than ten bytes
to encode it, 

>>> import sys
>>> sys.getsizeof('a')
26
>>> sys.getsizeof('ẞ')
40

this is far from perfection.
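
One way to see how much of that figure is fixed per-object overhead and how much is paid per character is sketched below. It assumes CPython 3.3 or later, where sys.getsizeof reports the whole str object including its header; per_char_cost is a hypothetical helper, not something from the thread, and the absolute totals depend on the build (32-bit vs 64-bit) and the Python version.

```python
import sys

def per_char_cost(ch, n=1000):
    # Hypothetical helper: grow a string from n to 2*n copies of ch and
    # divide the size difference by n, so the fixed object header cancels
    # out and only the per-character storage remains.
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) / n

# An ASCII character, a BMP character (U+1E9E) and a non-BMP character.
for ch in ('a', '\u1e9e', '\U0001f600'):
    total = sys.getsizeof(ch)
    print('U+%04X  one-char object: %d bytes, each extra char: %.1f bytes'
          % (ord(ch), total, per_char_cost(ch)))
```

Under that assumption the per-character figures should come out as 1, 2 and 4 bytes, while the one-character totals are dominated by the header.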

BTW, for a modern language, hasn't ucs2 been considered
obsolete for many, many years?

jmf

