From: Thomas Heller
Newsgroups: comp.lang.python
Subject: Re: Unicode
Date: Fri, 15 Mar 2013 12:43:49 +0100

On 15.03.2013 11:58, Steven D'Aprano wrote:
> On Fri, 15 Mar 2013 11:46:36 +0100, Thomas Heller wrote:

[Windows: Problems with unicode output to console]

> You can isolate the error by noting that the second one only raises an
> exception when you try to print it. That suggests that the problem is
> that it contains a character which is not defined in your terminal's
> codepage. So let's inspect the strings more carefully:
>
>
> py> a = u"µm"
> py> b = u"\u03bcm"
> py> a == b
> False
> py> ord(a[0]), ord(b[0])
> (181, 956)
> py> import unicodedata
> py> unicodedata.name(a[0])
> 'MICRO SIGN'
> py> unicodedata.name(b[0])
> 'GREEK SMALL LETTER MU'
>
> Does codepage 850 include Greek Small Letter Mu? The evidence suggests it
> does not.
>
> If you can, you should set the terminal's encoding to UTF-8. That will
> avoid this sort of problem.

Thanks for the clarification. For the archives: Setting the console
codepage to 65001 and the font to Lucida Console helps.

Thomas
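
For anyone who wants to check the codepage 850 question directly rather than infer it from the exception, here is a minimal sketch. It assumes Python 3 and the standard "cp850" codec; the helper name `encodable` is made up for this post and is not part of any library.

```python
import unicodedata


def encodable(text, codec):
    """Return True if every character in text can be encoded with codec."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


a = "\u00b5m"  # MICRO SIGN followed by "m"
b = "\u03bcm"  # GREEK SMALL LETTER MU followed by "m"

# Report, for each string, the name of its first character and whether
# the whole string fits into cp850 and into UTF-8.
for s in (a, b):
    print(unicodedata.name(s[0]), encodable(s, "cp850"), encodable(s, "utf-8"))
```

This should show that the Micro Sign string fits into cp850 while the Greek mu string does not, and that both fit into UTF-8, which is consistent with switching the console to codepage 65001.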