


Re: "More About Unicode in Python 2 and 3"

Date 2014-01-06 07:10 -0800
From Ethan Furman <ethan@stoneleaf.us>
Subject Re: "More About Unicode in Python 2 and 3"
References <lablra$1mc$2@ger.gmane.org> <52C9FD02.3080109@stoneleaf.us> <CAGGBd_qBA0OBELxgzERO4Tfs6quK7oYq8v_2idA=K2ycoiO6Dg@mail.gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.5022.1389022306.18130.python-list@python.org>



On 01/05/2014 06:37 PM, Dan Stromberg wrote:
>
> The argument seems to be "3.x doesn't work the way I'm accustomed to,
> so I'm not going to use it, and I'm going to shout about it until
> others agree with me."

The argument is that a very important, if small, subset of data manipulation became very painful in Py3.  Not impossible,
and not difficult, but painful, because the mental model and the contortions needed to get things to work don't sync up
anymore.  Painful because Python is, at heart, a simple and elegant language, but with the use case of ASCII embedded in
binary data that elegance went right out the window.

On 01/05/2014 06:55 PM, Chris Angelico wrote:
>
> It can't be both things. It's either bytes or it's text.

Of course it can be:

0000000: 0372 0106 0000 0000 6100 1d00 0000 0000  .r......a.......
0000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000020: 4e41 4d45 0000 0000 0000 0043 0100 0000  NAME.......C....
0000030: 1900 0000 0000 0000 0000 0000 0000 0000  ................
0000040: 4147 4500 0000 0000 0000 004e 1a00 0000  AGE........N....
0000050: 0300 0000 0000 0000 0000 0000 0000 0000  ................
0000060: 0d1a 0a                                  ...

And there we are, mixed bytes and ASCII data.  As I said earlier, my example is minimal, but still very frustrating in
that normal operations no longer work.  Incidentally, if you were thinking that NAME and AGE were part of the ASCII
text, you'd be wrong -- the field names are also encoded, as are the Character and Memo fields.
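
For readers who want to poke at the dump above, here is a minimal sketch of pulling it apart in Python 3. It assumes the bytes follow a dBase III-style table header layout (a 32-byte file header, 32-byte field descriptors, and a 0x0D terminator); the post does not name the format, so treat that layout as an assumption. The names DUMP, HEADER, FIELD and parse_header are made up for illustration, and the default "ascii" codec for field names is only a placeholder, since the post says the names are stored in an encoding of their own.

```python
import struct

# The 99 bytes from the hex dump in the post, re-entered by hand.
DUMP = bytes.fromhex(
    "0372 0106 0000 0000 6100 1d00 0000 0000"
    "0000 0000 0000 0000 0000 0000 0000 0000"
    "4e41 4d45 0000 0000 0000 0043 0100 0000"
    "1900 0000 0000 0000 0000 0000 0000 0000"
    "4147 4500 0000 0000 0000 004e 1a00 0000"
    "0300 0000 0000 0000 0000 0000 0000 0000"
    "0d1a 0a"
)

# Assumed layout (dBase III style), little-endian, no padding:
# version, yy, mm, dd, record count, header length, record length, 20 reserved
HEADER = struct.Struct("<BBBBIHH20x")
# 11-byte NUL-padded name, 1-byte type code, 4 skipped, length, decimals, 14 reserved
FIELD = struct.Struct("<11sc4xBB14x")


def parse_header(raw, encoding="ascii"):
    """Split the binary header into numbers (from bytes) and names (as text)."""
    version, yy, mm, dd, nrecords, header_len, record_len = HEADER.unpack_from(raw, 0)
    fields = []
    offset = HEADER.size
    # raw[offset] would be an int in Python 3, so slice to compare bytes to bytes.
    while offset + FIELD.size <= header_len and raw[offset:offset + 1] != b"\r":
        name, ftype, length, decimals = FIELD.unpack_from(raw, offset)
        # The name is text embedded in binary: strip the NUL padding, then decode.
        name = name.split(b"\x00", 1)[0].decode(encoding)
        fields.append((name, ftype.decode("ascii"), length, decimals))
        offset += FIELD.size
    return {
        "version": version,
        "updated": (1900 + yy, mm, dd),  # assumes the year is stored as an offset from 1900
        "records": nrecords,
        "header_len": header_len,
        "record_len": record_len,
        "fields": fields,
    }


info = parse_header(DUMP)
print(info["fields"])  # [('NAME', 'C', 25, 0), ('AGE', 'N', 3, 0)]
```

Under that assumed layout the numbers are self-consistent: the header length of 97 is 32 + 2 * 32 + 1, and the record length of 29 is one flag byte plus the 25 and 3 byte fields. The slice-instead-of-index step and the explicit decode calls are the kind of extra motion the post is complaining about.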

--
~Ethan~



