From: eryk sun
Newsgroups: comp.lang.python
Subject: Re: Should stdlib files contain 'narrow non breaking space' U+202F?
Date: Fri, 18 Dec 2015 04:35:37 -0600
References: <5673d713$0$1612$c3e8da3$5496439d@news.astraweb.com>

On Fri, Dec 18, 2015 at 3:51 AM, Steven D'Aprano wrote:
> On Fri, 18 Dec 2015 11:02 am, Mark Lawrence wrote:
>
>> A lot of it is down to Windows, as the actual complaint is:-
>>
>> six.print_(source)
>
> Looks like a bug in six to me.
>
> See, without Unicode comments in the std lib, you never would have found
> that bug.

I think Mark said he's piping the output. In this case it's not looking
at the current console/terminal encoding. Instead it defaults to the
platform's preferred encoding. On Windows that's the system ANSI
encoding, such as codepage 1252. You can set PYTHONIOENCODING=UTF-8 to
override this for stdin, stdout, and stderr.
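
A rough way to see the effect of PYTHONIOENCODING on piped output is to run
a child interpreter with its stdout piped and the variable set explicitly.
This is only a sketch: run_piped and CHILD are made-up names for
illustration, and cp1252 is passed by hand to stand in for a Windows ANSI
codepage, so the same behavior can be reproduced on any platform.

```python
import os
import subprocess
import sys

# Child script: report the encoding chosen for the piped stdout, then try
# to write U+202F (NARROW NO-BREAK SPACE) through it.
CHILD = (
    "import sys\n"
    "sys.stdout.write(sys.stdout.encoding + '\\n')\n"
    "sys.stdout.write('a\\u202fb\\n')\n"
)


def run_piped(io_encoding):
    """Run CHILD with stdout and stderr piped and PYTHONIOENCODING set.

    Hypothetical helper for this example only.
    """
    env = dict(os.environ)
    env["PYTHONIOENCODING"] = io_encoding
    return subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )


if __name__ == "__main__":
    for name in ("cp1252", "UTF-8"):
        result = run_piped(name)
        print(name, "-> exit code", result.returncode)
```

With cp1252 the child should fail with a UnicodeEncodeError, since U+202F
has no mapping in that codepage. With UTF-8 it should exit cleanly and emit
the character as the bytes E2 80 AF.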