
Re: "More About Unicode in Python 2 and 3"

Date Mon, 06 Jan 2014 07:10:56 -0800
From Ethan Furman <ethan@stoneleaf.us>
Subject Re: "More About Unicode in Python 2 and 3"
Newsgroups comp.lang.python
Message-ID <mailman.5022.1389022306.18130.python-list@python.org> (permalink)


On 01/05/2014 06:37 PM, Dan Stromberg wrote:
>
> The argument seems to be "3.x doesn't work the way I'm accustomed to,
> so I'm not going to use it, and I'm going to shout about it until
> others agree with me."

The argument is that a very important, if small, subset of data manipulation becomes very painful in Py3.  Not impossible,
and not difficult, but painful because the mental model and the contortions needed to get things to work don't sync up
anymore.  Painful because Python is, at heart, a simple and elegant language, but with the use-case of embedded ascii in
binary data that elegance went right out the window.

On 01/05/2014 06:55 PM, Chris Angelico wrote:
>
> It can't be both things. It's either bytes or it's text.

Of course it can be:

0000000: 0372 0106 0000 0000 6100 1d00 0000 0000  .r......a.......
0000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000020: 4e41 4d45 0000 0000 0000 0043 0100 0000  NAME.......C....
0000030: 1900 0000 0000 0000 0000 0000 0000 0000  ................
0000040: 4147 4500 0000 0000 0000 004e 1a00 0000  AGE........N....
0000050: 0300 0000 0000 0000 0000 0000 0000 0000  ................
0000060: 0d1a 0a                                  ...

And there we are, mixed bytes and ascii data.  As I said earlier, my example is minimal, but still very frustrating in 
that normal operations no longer work.  Incidentally, if you were thinking that NAME and AGE were part of the ascii 
text, you'd be wrong -- the field names are also encoded, as are the Character and Memo fields.
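
To make the dump concrete, here is a rough Python 3 sketch that pulls it apart with the struct module.  The layout is an assumption on my part: it reads the bytes as a dBase III-style table header (a 32-byte file header, then 32-byte field descriptors, ended by 0x0d), which the post never states outright.  The names RAW and parse_fields and the "ascii" default codec are made up for illustration.

```python
import struct

# The 99 bytes of the hex dump above, transcribed by hand.
RAW = bytes.fromhex(
    "0372010600000000" "61001d0000000000"
    "0000000000000000" "0000000000000000"
    "4e414d4500000000" "0000004301000000"
    "1900000000000000" "0000000000000000"
    "4147450000000000" "0000004e1a000000"
    "0300000000000000" "0000000000000000"
    "0d1a0a"
)

# Assumed layout of the first 12 bytes (little-endian, no padding):
# version, year-1900, month, day, record count, header length, record length.
version, yy, mm, dd, nrecords, header_len, record_len = struct.unpack_from(
    "<4BIHH", RAW, 0
)


def parse_fields(raw, codec="ascii"):
    """Return (name, type, length) for each assumed 32-byte field descriptor."""
    fields = []
    offset = 32                       # descriptors start after the file header
    while raw[offset] != 0x0D:        # indexing bytes gives an int in Python 3
        chunk = raw[offset:offset + 32]
        # The name is encoded text padded with NULs, so it has to be decoded.
        name = chunk[:11].rstrip(b"\x00").decode(codec)
        ftype = chr(chunk[11])        # single type byte, e.g. 'C' or 'N'
        length = chunk[16]            # field width, kept as a plain number
        fields.append((name, ftype, length))
        offset += 32
    return fields


fields = parse_fields(RAW)
```

Under those assumptions the header reads as version 3, dated 2014-01-06, with no records, a 97-byte header and 29-byte records, and the two descriptors come out as NAME (type C, width 25) and AGE (type N, width 3).  It also shows where the friction is: RAW[43] is the integer 67 rather than "C", and only the slice RAW[43:44] compares equal to b"C".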

--
~Ethan~
