From: Wolfgang Maier
Subject: Re: Python 3.2 has some deadly infection
Date: Mon, 2 Jun 2014 07:45:56 +0000 (UTC)
Newsgroups: comp.lang.python

Tim Delaney writes:

> I also should have been more clear that *in the particular situation I was talking about* iso-latin-1 as default would be the right thing to do, not in the general case. Quite often we won't know the correct encoding until we've executed a command via ssh - iso-latin-1 will allow us to extract the info we need (which will generally be 7-bit ASCII) without the possibility of an invalid encoding. Sure we may get mojibake, but that's better than the alternative when we don't yet know the correct encoding.
>
> Latin-1 is one of those legacy encodings which needs to die, not to be entrenched as the default. My terminal uses UTF-8 by default (as it should), and if I use the terminal to input "δжç", Python ought to see what I input, not Latin-1 mojibake.
>
> For some purposes, there needs to be a way to treat an arbitrary stream of bytes as an arbitrary stream of 8-bit characters. iso-latin-1 is a convenient way to do that.

For that purpose, Python 3 has the bytes type. Read the data as is, then decode it to a string once you have figured out its encoding.

Wolfgang
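
A minimal sketch of the "read as bytes, decode later" approach, under some loud assumptions: the sample data, the LANG=... line and the helper name charset_from_lang are all made up for illustration and are not from the thread. The idea is that the ASCII part can be inspected with bytes operations alone, and the text is decoded only once the encoding is known.

```python
import codecs

# Hypothetical raw output captured from a remote command. The second line is
# UTF-8 encoded, but the code below does not assume that in advance.
raw = b"LANG=en_US.UTF-8\n" + "δжç".encode("utf-8") + b"\n"


def charset_from_lang(data, fallback="ascii"):
    """Find a LANG=xx_YY.charset line using bytes operations only.

    Hypothetical helper: returns the normalized codec name, or `fallback`
    if no usable charset is found.
    """
    for line in data.splitlines():
        if line.startswith(b"LANG=") and b"." in line:
            # The charset name itself is plain ASCII, so this decode is safe.
            name = line.rsplit(b".", 1)[1].decode("ascii", errors="replace")
            try:
                return codecs.lookup(name).name
            except LookupError:
                break
    return fallback


# Step 1: work on the undecoded bytes to learn the encoding.
encoding = charset_from_lang(raw)

# Step 2: decode to str only now that the encoding is known.
text = raw.decode(encoding, errors="replace")
payload = text.splitlines()[1]
```

Nothing is decoded with a guessed encoding here, so no mojibake is produced along the way; the cost is that the early parsing has to use bytes literals such as b"LANG=".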