From: Dan Stromberg
To: Ethan Furman
Cc: Python List
Newsgroups: comp.lang.python
Subject: Re: "More About Unicode in Python 2 and 3"
Date: Sun, 5 Jan 2014 18:37:40 -0800
In-Reply-To: <52C9FD02.3080109@stoneleaf.us>
References: <52C9FD02.3080109@stoneleaf.us>

On Sun, Jan 5, 2014 at 4:46 PM, Ethan Furman wrote:
> While I don't agree with his assessment of Python 3 in total, I definitely
> feel his pain with regards to bytestrings in Py3 -- because they don't
> exist.  'bytes' /looks/ like a bytestring, but really it's just a bunch of
> integers:
>
> --> b'abc'
> b'abc'
> --> b'abc'[1]
> 98
>
> Maybe for 3.5 somebody *cough* will make a bytestring type for those of us
> who have to support the lower-level protocols...

I don't see anything wrong with the new bytes type, including the
example above.  I wrote a backup program that used bytes or str (3.x
or 2.x respectively), and both worked fine for that.  I had to code
around a small number of surprises, but they weren't substantive
problems; they were just differences.

The argument seems to be "3.x doesn't work the way I'm accustomed to,
so I'm not going to use it, and I'm going to shout about it until
others agree with me."
And yes, I read Armin's article - it was pretty long....

Also, I never once wrote a program to use 2.x's unicode type.  I always
used str.  It was important to make str handle unicode, to get people
(like me!) to actually use unicode.

Two modules helped me quite a bit with backshift, the backup program I
mentioned:
http://stromberg.dnsalias.org/~dstromberg/backshift/documentation/html/python2x3-module.html
http://stromberg.dnsalias.org/~dstromberg/backshift/documentation/html/bufsock-module.html

python2x3 is tiny, and similar in spirit to the popular six module.

bufsock is something I wrote years ago that enables consistent I/O on
sockets, files or file descriptors; 2.x or 3.x.

HTH
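
P.S. To make the "differences, not problems" point concrete, here is a
minimal sketch of the indexing difference from the quoted example and one
way to code around it.  This is not code from backshift or python2x3; the
helper names byte_at and int_at are made up for illustration, and the
sketch assumes non-negative indexes.

```python
def byte_at(data, index):
    """Return the byte at `index` as a length-1 bytes object.

    Hypothetical helper.  Slicing yields bytes on 3.x and str on 2.x
    (where str is the byte type), whereas plain indexing yields an int
    on 3.x.  Assumes index >= 0.
    """
    return data[index:index + 1]


def int_at(data, index):
    """Return the integer value of the byte at `index`.

    Hypothetical helper.  On 3.x, indexing already gives an int; on 2.x
    it gives a length-1 str, which ord() converts.
    """
    value = data[index]
    return value if isinstance(value, int) else ord(value)


data = b'abc'
print(byte_at(data, 1))   # a one-byte string, on either version
print(int_at(data, 1))    # 98, on either version
```

With something like this in one place, the rest of a program can stay
the same on 2.x and 3.x, which is roughly the job a small compatibility
module does.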