From: Tim Delaney
To: Wolfgang Maier
Cc: Python-List
Newsgroups: comp.lang.python
Date: Mon, 2 Jun 2014 19:02:16 +1000
Subject: Re: Python 3.2 has some deadly infection
References: <538a8f48$0$29978$c3e8da3$5496439d@news.astraweb.com> <538bcfff$0$29978$c3e8da3$5496439d@news.astraweb.com>

On 2 June 2014 17:45, Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> wrote:

> Tim Delaney writes:
> >
> > For some purposes, there needs to be a way to treat an arbitrary stream of
> > bytes as an arbitrary stream of 8-bit characters. iso-latin-1 is a
> > convenient way to do that.
> >
>
> For that purpose, Python3 has the bytes() type. Read the data as is, then
> decode it to a string once you figured out its encoding.
>

I know that, you know that. Convincing other people of that is the
difficulty.

I probably should have mentioned it, but in my case it's not even Python
(it's Java). It's exactly the same principle - an assumption was made that
has become entrenched due to the fear of breakage. If they'd been forced to
think about encodings up-front, it shouldn't have been an issue, which was
the point I was trying to make.

In Java, it's much worse. At least with Python you can perform string-like
operations on bytes.
In Java you have to convert it to characters before you can really
do anything with it, so people just use the default encoding all the time -
especially if they want the convenience of line-by-line reading using
BufferedReader ...

Tim Delaney
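
For readers following the thread, here is a minimal sketch of the Python side of the argument: string-like operations applied directly to bytes, the lossless iso-latin-1 round trip, and an explicit decode once the encoding is known. The sample data in `raw` and all variable names are made up for illustration and are not from the original discussion.

```python
# Hypothetical sample data: one line encoded as Latin-1 and one as UTF-8,
# standing in for a byte stream whose encoding is not known up front.
raw = "caf\u00e9\n".encode("latin-1") + "na\u00efve\n".encode("utf-8")

# String-like operations work directly on bytes; no decoding is needed.
lines = raw.splitlines()
starts_with_caf = lines[0].startswith(b"caf")
upper_ascii = lines[0].upper()  # only ASCII letters are changed

# iso-latin-1 maps each byte 0-255 to the code point with the same value,
# so decoding and re-encoding gives back the original bytes unchanged.
as_text = raw.decode("latin-1")
round_trip = as_text.encode("latin-1")

# Once the real encoding of a piece of data is known, decode it explicitly.
second_line = lines[1].decode("utf-8")
```

The latin-1 round trip preserves the bytes but not their meaning: the characters in `as_text` are only correct for data that really was Latin-1, which is why an explicit decode with the real encoding is still needed.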