Re: Unicode, stdout, and stderr

From Peter Otten <__peter__@web.de>
Subject Re: Unicode, stdout, and stderr
Date Tue, 22 Jul 2014 11:09:37 +0200
Newsgroups comp.lang.python
Message-ID <mailman.12175.1406020191.18130.python-list@python.org>



Frank Millman wrote:

> 
> "Peter Otten" <__peter__@web.de> wrote in message
> news:lql3am$2q7$1@ger.gmane.org...
>> Frank Millman wrote:
>>
>>> Hi all
>>>
>>> This is not important, but I would appreciate it if someone could
>>> explain the following, run from cmd.exe on Windows Server 2003 -
>>>
>>> C:\>python
>>> Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:38:22) [MSC v.1600 32 bit (Intel)] on win32
>>> Type "help", "copyright", "credits" or "license" for more information.
>>> >>> x = '\u2119'
>>> >>> x  # this uses stderr
>>> '\u2119'
>>
>> No, both print to stdout, but just
>>
>> >>> x
>>
>> is passed to the display hook of the interactive interpreter. This applies
>> repr() and then tries to print the result. If this fails it makes
>> another effort, roughly (the actual code is written in C)
>>
>> sys.stdout.buffer.write(repr(x).encode(
>>    sys.stdout.encoding, "backslashreplace"))
>>
>>
> 
> Thanks, Peter. Very interesting.
> 
> Out of interest, does the same thing happen when writing to sys.stderr?

If you are asking about the fallback mechanism, that is specific to 
sys.displayhook in the interactive interpreter. 
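
To make that fallback concrete, here is a rough pure-Python sketch of what
the display hook does. The names fallback_display and demo_fallback are made
up for illustration, the real hook is written in C and also updates
builtins._, and I use an in-memory cp437 stream in place of a real console
so the result does not depend on your terminal:

```python
import io
import sys


def fallback_display(value, stream=None):
    """Hypothetical helper: mimic the display hook's two-step output."""
    stream = sys.stdout if stream is None else stream
    if value is None:
        return
    text = repr(value)
    try:
        # First attempt: a normal write using the stream's own error handler.
        stream.write(text)
    except UnicodeEncodeError:
        # Second attempt: escape whatever the encoding cannot represent
        # and write the resulting bytes to the underlying binary buffer.
        data = text.encode(stream.encoding, "backslashreplace")
        stream.buffer.write(data)
    stream.write("\n")


def demo_fallback():
    """Run the helper against a strict cp437 stream and return the bytes."""
    raw = io.BytesIO()
    # newline="\n" keeps the output identical on every platform.
    out = io.TextIOWrapper(raw, encoding="cp437", errors="strict",
                           newline="\n")
    fallback_display("\u2119", out)
    out.flush()
    return raw.getvalue()
```

Calling demo_fallback() gives the escaped repr as bytes, which matches what
the interactive session shows for a bare x.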

But stdout and stderr do handle errors differently:

>>> import sys
>>> sys.stdout.errors
'strict'
>>> sys.stderr.errors
'backslashreplace'

So a code point written to stdout that cannot be encoded with stdout.encoding
raises a UnicodeEncodeError, while a code point written to stderr that cannot
be encoded with stderr.encoding is replaced by a backslash escape.
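
You can see the two behaviours side by side without touching the real
streams. This is only a sketch: write_with and demo_handlers are names I made
up, and the cp437 encoding is an assumption standing in for whatever your
console uses.

```python
import io


def write_with(errors, text="\u2119", encoding="cp437"):
    """Write text to an in-memory stream using the given error handler."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors)
    stream.write(text)
    stream.flush()
    return raw.getvalue()


def demo_handlers():
    """Return what 'strict' and 'backslashreplace' do with U+2119."""
    try:
        write_with("strict")
        strict_result = "no error"
    except UnicodeEncodeError:
        strict_result = "UnicodeEncodeError"
    return strict_result, write_with("backslashreplace")
```

The first element of the result reports the exception raised under 'strict',
the second holds the escaped bytes produced under 'backslashreplace'.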

You can also make stdout more forgiving by reopening it with a different
error handler:

>>> import sys
>>> print("\u2119")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.4/encodings/cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2119' in position 0: character maps to <undefined>
>>> sys.stdout = open(1, mode="w", errors="xmlcharrefreplace",
...                   encoding=sys.stdout.encoding, closefd=False)
>>> print("\u2119")
&#8473;
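
If you are on Python 3.7 or later (not the 3.4 used in this thread), text
streams also have a reconfigure() method, which should let you change the
error handler without reopening the file descriptor. A minimal sketch,
again with an in-memory cp437 stream as a stand-in for the console and a
made-up function name:

```python
import io


def demo_reconfigure():
    """Switch a strict stream to xmlcharrefreplace, then write U+2119."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="cp437", errors="strict")
    # TextIOWrapper.reconfigure() exists from Python 3.7 onwards.
    stream.reconfigure(errors="xmlcharrefreplace")
    stream.write("\u2119")
    stream.flush()
    return raw.getvalue()
```

On a real console the equivalent would be
sys.stdout.reconfigure(errors="xmlcharrefreplace"), assuming sys.stdout is
still the usual text wrapper and has not been replaced by something else.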
