From: Chris Angelico
Newsgroups: comp.lang.python
To: python-list@python.org
Subject: Re: Python 2 or 3
Date: Sun, 4 Dec 2011 14:02:15 +1100

On Sun, Dec 4, 2011 at 1:52 PM, Terry Reedy wrote:
> For anyone working with unicode instead of ascii...

Which, frankly, should be everyone. You can't get away with assuming
that a character is a byte any more; even if you stick to the US,
you're going to run into some non-ASCII symbols sooner or later. Of
course, you can work with UTF-8, which means that anything that fits
into 7-bit ASCII will be represented as itself; but you still need to
be aware of the difference between 'bytes' and 'str' in Python 3 (or
between 'str' and 'unicode' in Python 2).

ChrisA
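
A minimal Python 3 sketch of the bytes/str distinction described in the post. The variable names and sample strings are illustrative assumptions, not part of the original message.

```python
# Python 3: 'str' holds Unicode code points, 'bytes' holds encoded octets.
ascii_text = "hello"
euro_text = "5 \u20ac"  # "5", a space, and the euro sign (U+20AC, not ASCII)

ascii_bytes = ascii_text.encode("utf-8")
euro_bytes = euro_text.encode("utf-8")

# Anything that fits into 7-bit ASCII is represented as itself in UTF-8.
same_length_for_ascii = len(ascii_bytes) == len(ascii_text)

# A non-ASCII character takes more than one byte, so a character
# is not a byte: 3 characters here, but 5 bytes.
char_count = len(euro_text)
byte_count = len(euro_bytes)

# Decoding with the same codec recovers the original text.
round_trip = euro_bytes.decode("utf-8")

# Python 3 refuses to mix the two types implicitly.
try:
    euro_text + euro_bytes
    mixing_raises = False
except TypeError:
    mixing_raises = True
```

In Python 2 the same split exists between 'unicode' (text) and 'str' (bytes), but mixing the two is coerced implicitly through the ASCII codec, which is why such bugs tend to surface only once non-ASCII data arrives.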