| Started by | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| First post | 2015-12-18 00:02 +0000 |
| Last post | 2015-12-18 04:35 -0600 |
| Articles | 3 — 3 participants |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: Should stdlib files contain 'narrow non breaking space' U+202F? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-12-18 00:02 +0000
Re: Should stdlib files contain 'narrow non breaking space' U+202F? Steven D'Aprano <steve@pearwood.info> - 2015-12-18 20:51 +1100
Re: Should stdlib files contain 'narrow non breaking space' U+202F? eryk sun <eryksun@gmail.com> - 2015-12-18 04:35 -0600
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2015-12-18 00:02 +0000 |
| Subject | Re: Should stdlib files contain 'narrow non breaking space' U+202F? |
| Message-ID | <mailman.41.1450396996.30845.python-list@python.org> |
On 17/12/2015 23:18, Chris Angelico wrote:
> On Fri, Dec 18, 2015 at 10:05 AM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
>> The culprit character is hidden between "Issue #" and "20540" at line 400 of
>> C:\Python35\Lib\multiprocessing\connection.py.
>> https://bugs.python.org/issue20540 and
>> https://hg.python.org/cpython/rev/125c24f47f3c refers.
>>
>> I'm asking as I've just spent 30 minutes tracking down why my debug code
>> would bomb when running on 3.5, but not 2.7 or 3.2 through 3.4.
>
> I'm curious as to why this character should bomb your code at all -
> it's in a comment. Is it that your program was expecting ASCII, or is
> it something about that particular character?
>
I'm playing with ASTs and using the stdlib as test data. I was trying
to avoid going down this particular route, but...
A lot of it is down to Windows, as the actual complaint is:-
six.print_(source)
File "C:\Python35\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u202f' in
position 407: character maps to <undefined>
And as usual I've answered my own question. The cp1252 codec is used even
though my console is set to code page 65001, *BUT* I'm piping the output to
a file as it's so much faster. Run without the pipe, the code takes five
minutes but everything runs to completion.
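The failure in the traceback above can be reproduced without a console or a pipe. The following is a minimal sketch: the comment text in `line` is an illustrative stand-in for the real line in connection.py, and `try_encode` is a hypothetical helper, not part of six or the stdlib.

```python
# U+202F NARROW NO-BREAK SPACE, the character reported in the traceback.
NNBSP = "\u202f"

# Illustrative stand-in for the stdlib comment; the real line is not quoted here.
line = "# Issue #" + NNBSP + "20540"


def try_encode(text, encoding):
    """Return the encoded bytes, or the exception if encoding fails."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc


# cp1252 has no mapping for U+202F, so this yields a UnicodeEncodeError.
result_cp1252 = try_encode(line, "cp1252")

# UTF-8 can encode every code point, so this yields bytes.
result_utf8 = try_encode(line, "utf-8")

print(type(result_cp1252).__name__)
print(result_utf8)
```

Any code point outside the cp1252 table would fail the same way; nothing here is specific to this one character.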
I suppose the original question still holds, but I for one certainly
won't be losing any sleep over it. Talking of which, good night all :)
--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.
Mark Lawrence
| From | Steven D'Aprano <steve@pearwood.info> |
|---|---|
| Date | 2015-12-18 20:51 +1100 |
| Message-ID | <5673d713$0$1612$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #100576 |
On Fri, 18 Dec 2015 11:02 am, Mark Lawrence wrote:
> A lot of it is down to Windows, as the actual complaint is:-
>
> six.print_(source)
Looks like a bug in six to me.
See, without Unicode comments in the std lib, you never would have found
that bug.
--
Steven
| From | eryk sun <eryksun@gmail.com> |
|---|---|
| Date | 2015-12-18 04:35 -0600 |
| Message-ID | <mailman.53.1450434985.30845.python-list@python.org> |
| In reply to | #100593 |
On Fri, Dec 18, 2015 at 3:51 AM, Steven D'Aprano <steve@pearwood.info> wrote:
> On Fri, 18 Dec 2015 11:02 am, Mark Lawrence wrote:
>
>> A lot of it is down to Windows, as the actual complaint is:-
>>
>> six.print_(source)
>
> Looks like a bug in six to me.
>
> See, without Unicode comments in the std lib, you never would have found
> that bug.
I think Mark said he's piping the output. In this case it's not looking
at the current console/terminal encoding. Instead it defaults to the
platform's preferred encoding. On Windows that's the system ANSI
encoding, such as codepage 1252. You can set PYTHONIOENCODING=UTF-8 to
override this for stdin, stdout, and stderr.
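The effect of the stream encoding described above can be sketched with an in-memory stream standing in for redirected stdout. This is a rough illustration under that assumption: `write_as` is a hypothetical helper, and the `io.BytesIO` buffer only approximates what happens when output is piped to a file.

```python
import io


def write_as(text, encoding, errors="strict"):
    """Write text through a text stream with the given encoding; return the bytes."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors)
    stream.write(text)
    stream.flush()
    data = raw.getvalue()
    stream.detach()  # leave the buffer alone when the wrapper is discarded
    return data


sample = "Issue #\u202f20540"

# Roughly what PYTHONIOENCODING=UTF-8 gives a redirected stdout.
utf8_bytes = write_as(sample, "utf-8")

# With an ANSI code page, a lenient error handler avoids the exception
# by writing an escape sequence in place of the character.
escaped_bytes = write_as(sample, "cp1252", errors="backslashreplace")

print(utf8_bytes)
print(escaped_bytes)
```

With the default `errors="strict"` and `cp1252`, the same call raises the `UnicodeEncodeError` from the original report. The environment variable has to be set before the interpreter starts, for example in the shell that runs the redirected command.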