From: Ian Kelly
Date: Thu, 5 Jun 2014 11:16:04 -0600
Subject: Re: Python 3.2 has some deadly infection
Newsgroups: comp.lang.python

On Thu, Jun 5, 2014 at 10:17 AM, Robin Becker wrote:
> in python 2 str and unicode were much more comparable. On balance I think
> just reversing them ie str --> bytes and unicode --> str was probably the
> right thing to do if the default conversions had been turned off. However
> making bytes a crippled thing was wrong.

How should e.g. bytes.upper() be implemented then? The correct
behavior is entirely dependent on the encoding. Python 2 just assumes
ASCII, which at best will correctly upper-case some subset of the
string and leave the rest unchanged, and at worst could corrupt the
string entirely.

There are some things that were dropped that should not have been, but
my impression is that those are being worked on, for example %
formatting in PEP 461.
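
To make the bytes.upper() point concrete, here is a minimal sketch, assuming Python 3, where bytes.upper() only converts the ASCII letters a-z. The helper name upper_via_bytes and the sample strings are made up for illustration; they are not part of any library.

```python
# Illustrative sketch: upper-casing raw bytes as if they were ASCII,
# then decoding them again with the encoding they were written in.

def upper_via_bytes(text, encoding):
    """Encode text, upper-case the raw bytes, decode with the same encoding."""
    return text.encode(encoding).upper().decode(encoding)

# Pure ASCII data: the byte-level operation gives the expected answer.
ascii_result = upper_via_bytes("hello", "ascii")

# Latin-1: U+00E9 (e with acute) is the single byte 0xE9, which is outside
# ASCII a-z, so it is left unchanged while the other letters are converted.
latin1_result = upper_via_bytes("caf\u00e9", "latin-1")

# UTF-16-LE: U+0161 (s with caron) is the byte pair 61 01. The 0x61 byte
# looks like ASCII 'a' and becomes 0x41, so the pair now decodes as
# U+0141 (L with stroke), a different character altogether.
utf16_result = upper_via_bytes("\u0161", "utf-16-le")

# For comparison, upper-casing the decoded text handles the accent.
text_result = "caf\u00e9".upper()

print(ascii(ascii_result), ascii(latin1_result),
      ascii(utf16_result), ascii(text_result))
```

The Latin-1 case is the "some subset gets upper-cased" outcome and the UTF-16-LE case is the "corrupt the string" outcome; only upper-casing the decoded str gets the accented letter right.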