


Re: "More About Unicode in Python 2 and 3"

Date Sun, 5 Jan 2014 18:37:40 -0800
Subject Re: "More About Unicode in Python 2 and 3"
From Dan Stromberg <drsalists@gmail.com>
To Ethan Furman <ethan@stoneleaf.us>
Newsgroups comp.lang.python
Message-ID <mailman.4999.1388975867.18130.python-list@python.org>



On Sun, Jan 5, 2014 at 4:46 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
> While I don't agree with his assessment of Python 3 in total, I definitely
> feel his pain with regards to bytestrings in Py3 -- because they don't
> exist.  'bytes' /looks/ like a bytestring, but really it's just a bunch of
> integers:
>
> --> b'abc'
> b'abc'
> --> b'abc'[1]
> 98
>
> Maybe for 3.5 somebody *cough* will make a bytestring type for those of us
> who have to support the lower-level protocols...

I don't see anything wrong with the new bytes type, including the
example above.  I wrote a backup program that used bytes or str (3.x
or 2.x respectively), and both worked fine for that.  I had to
code around a limited number of surprises, but they weren't
substantive problems; they were just differences.
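
For anyone following along, here is a minimal sketch of the difference
in question, assuming Python 3.x.  The variable names are only for
illustration.  Indexing gives an int, but slicing still gives bytes,
which is usually the easy way to code around it:

```python
data = b'abc'

# Indexing a bytes object in 3.x yields an int, as in the quoted example.
first = data[1]

# Slicing yields bytes, so a one-element slice behaves much like a
# one-character 2.x str.
one = data[1:2]

# Iterating also yields ints; bytes([n]) turns each one back into a
# length-1 bytes object.
pieces = [bytes([n]) for n in data]

print(first, one, pieces)
```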

The argument seems to be "3.x doesn't work the way I'm accustomed to,
so I'm not going to use it, and I'm going to shout about it until
others agree with me."  And yes, I read Armin's article - it was
pretty long....

Also, I never once wrote a program to use 2.x's unicode type.  I
always used str.  It was important to make str handle unicode, to get
people (like me!) to actually use unicode.
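
A small sketch of what that looks like in 3.x, where str is text and
bytes is what goes on the wire.  The sample string and the choice of
UTF-8 are assumptions made for the example:

```python
text = 'caf\u00e9'              # str: four Unicode code points
raw = text.encode('utf-8')      # bytes: the e-acute takes two bytes in UTF-8

# The lengths differ because one counts characters and the other counts bytes.
text_len = len(text)
raw_len = len(raw)

# Decoding with the same codec gets the original text back.
round_trip = raw.decode('utf-8')
```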

Two modules helped me quite a bit with backshift, the backup program I
mentioned:
http://stromberg.dnsalias.org/~dstromberg/backshift/documentation/html/python2x3-module.html
http://stromberg.dnsalias.org/~dstromberg/backshift/documentation/html/bufsock-module.html

python2x3 is tiny, and similar in spirit to the popular six module.
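
To give a flavor of what such a compatibility layer does, here is a
rough sketch.  These helper names are hypothetical and are not taken
from python2x3 or six; they only show the sort of small function that
hides a 2.x/3.x difference:

```python
def byte_at(data, index):
    """Return the byte at index as a length-1 bytes object.

    Slicing (rather than indexing) gives the same kind of result on
    2.x and 3.x, whereas data[index] is a str on 2.x and an int on 3.x.
    """
    return data[index:index + 1]


def to_bytes(value, encoding='utf-8'):
    """Encode text to bytes; pass bytes through unchanged."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


print(byte_at(b'abc', 1))
print(to_bytes('abc'))
```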

bufsock is something I wrote years ago that enables consistent I/O on
sockets, files or file descriptors, on either 2.x or 3.x.
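
The general idea can be sketched as below.  This is not bufsock's API;
read_exactly is a hypothetical helper, written on the assumption that
the source is either a socket.socket or a binary file-like object with
a read() method:

```python
import io
import socket


def read_exactly(source, length):
    """Read exactly length bytes from a socket or a binary file-like object."""
    # Sockets expose recv(); file-like objects expose read().  Either one
    # may return fewer bytes than requested, so loop until done.
    reader = source.recv if isinstance(source, socket.socket) else source.read
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = reader(remaining)
        if not chunk:
            raise EOFError('stream ended with %d bytes still expected' % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)


print(read_exactly(io.BytesIO(b'hello world'), 5))
```

The caller gets the same short-read handling whether the bytes come
from a file or from the network.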

HTH


