
Right solution to unicode error?

Started by: Anders <aschneiderman@asha.org>
First post: 2012-11-07 14:17 -0800
Last post: 2012-11-08 21:30 -0600
Articles: 3 on this page of 23; 9 participants



Contents

  Right solution to unicode error? Anders <aschneiderman@asha.org> - 2012-11-07 14:17 -0800
    RE: Right solution to unicode error? "Prasad, Ramit" <ramit.prasad@jpmorgan.com> - 2012-11-07 23:07 +0000
    Re: Right solution to unicode error? Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2012-11-07 23:27 +0000
    Re: Right solution to unicode error? Andrew Berg <bahamutzero8825@gmail.com> - 2012-11-07 17:51 -0600
    Re: Right solution to unicode error? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-11-07 23:53 +0000
      Re: Right solution to unicode error? Hans Mulder <hansmu@xs4all.nl> - 2012-11-08 12:40 +0100
    Re: Right solution to unicode error? Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2012-11-08 00:44 +0000
    Re: Right solution to unicode error? wxjmfauth@gmail.com - 2012-11-08 03:01 -0800
    RE: Right solution to unicode error? Anders Schneiderman <ASchneiderman@asha.org> - 2012-11-08 09:00 -0500
    Re: Right solution to unicode error? Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2012-11-08 14:06 +0000
      Re: Right solution to unicode error? wxjmfauth@gmail.com - 2012-11-08 07:05 -0800
        Re: Right solution to unicode error? Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2012-11-08 18:32 +0000
          Re: Right solution to unicode error? wxjmfauth@gmail.com - 2012-11-08 11:30 -0800
          Re: Right solution to unicode error? wxjmfauth@gmail.com - 2012-11-08 11:30 -0800
        Re: Right solution to unicode error? Ian Kelly <ian.g.kelly@gmail.com> - 2012-11-08 11:48 -0700
          Re: Right solution to unicode error? wxjmfauth@gmail.com - 2012-11-08 11:54 -0800
            Re: Right solution to unicode error? Ian Kelly <ian.g.kelly@gmail.com> - 2012-11-08 13:41 -0700
              Re: Right solution to unicode error? wxjmfauth@gmail.com - 2012-11-09 02:06 -0800
            RE: Right solution to unicode error? "Prasad, Ramit" <ramit.prasad@jpmorgan.com> - 2012-11-08 20:54 +0000
            Re: Right solution to unicode error? Ian Kelly <ian.g.kelly@gmail.com> - 2012-11-08 14:07 -0700
            Re: Right solution to unicode error? Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2012-11-08 21:37 +0000
          Re: Right solution to unicode error? wxjmfauth@gmail.com - 2012-11-08 11:54 -0800
    Re: Right solution to unicode error? Andrew Berg <bahamutzero8825@gmail.com> - 2012-11-08 21:30 -0600



#32983

From: Oscar Benjamin <oscar.j.benjamin@gmail.com>
Date: 2012-11-08 21:37 +0000
Message-ID: <mailman.3468.1352410669.27098.python-list@python.org>
In reply to: #32976

On 8 November 2012 19:54,  <wxjmfauth@gmail.com> wrote:
> On Thursday, 8 November 2012 19:49:24 UTC+1, Ian wrote:
>> On Thu, Nov 8, 2012 at 11:32 AM, Oscar Benjamin
>>
>> <oscar.j.benjamin@gmail.com> wrote:
>>
>> > If I want the other characters to work I need to change the code page:
>>
>> >
>>
>> > O:\>chcp 65001
>>
>> > Active code page: 65001
>>
>> >
>>
>> > O:\>Q:\tools\Python33\python -c "import sys;
>>
>> I find that I also need to change the font.  With the default font,
>>
>> printing '\u2013' gives me:
>>
>> –
>>
>>
>>
>> The only alternative font option I have in Windows XP is Lucida
>>
>> Console, which at least works correctly, although it seems to be
>>
>> lacking a lot of glyphs.
>
> The font has nothing to do with this.
> You are "simply" encoding your "unicode" wrongly.
>
>>>> '\u2013'
> '–'
>>>> '\u2013'.encode('utf-8')
> b'\xe2\x80\x93'
>>>> '\u2013'.encode('utf-8').decode('cp1252')
> '–'

You have correctly identified that the displayed characters are the
result of accidentally interpreting utf-8 bytes as if they were cp1252
or similar. However, it is not Ian or Python that is confusing the
encoding. It is cmd.exe that is confusing the encoding in a
font-dependent way. I also had to change the font as Ian describes
though I did it some time ago and forgot to mention it here.
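
A minimal sketch of that point, assuming Python 3: the bytes Python emits
for a string are fixed by the codec and do not depend on how the console
later draws them. The helper name write_utf8 is made up for illustration,
and io.BytesIO stands in for sys.stdout.buffer so the bytes can be
inspected.

```python
import io
import sys


def write_utf8(text, stream=None):
    """Encode text as UTF-8 and write the raw bytes to a binary stream.

    write_utf8 is a hypothetical helper, not part of any library. With no
    stream given it falls back to sys.stdout.buffer, as in the quoted
    one-liner.
    """
    if stream is None:
        stream = sys.stdout.buffer
    data = text.encode("utf-8")
    stream.write(data)
    return data


# Capture the output in memory instead of sending it to a real console.
captured = io.BytesIO()
write_utf8("\u2013\n", captured)
raw = captured.getvalue()  # the three UTF-8 bytes of EN DASH plus a newline
```

Whether those three bytes then show up as one dash or as three unrelated
characters is decided on the console side, after Python has finished
writing.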

jmf, can you please trim the text you quote removing the parts you are
not responding to and then any remaining blank lines that were
inserted by your reader/editor?


Oscar



#32977

From: wxjmfauth@gmail.com
Date: 2012-11-08 11:54 -0800
Message-ID: <mailman.3462.1352404465.27098.python-list@python.org>
In reply to: #32972

On Thursday, 8 November 2012 19:49:24 UTC+1, Ian wrote:
> On Thu, Nov 8, 2012 at 11:32 AM, Oscar Benjamin
> 
> <oscar.j.benjamin@gmail.com> wrote:
> 
> > If I want the other characters to work I need to change the code page:
> 
> >
> 
> > O:\>chcp 65001
> 
> > Active code page: 65001
> 
> >
> 
> > O:\>Q:\tools\Python33\python -c "import sys;
> 
> > sys.stdout.buffer.write('\u03b1\n'.encode('utf-8'))"
> 
> > α
> 
> >
> 
> > O:\>Q:\tools\Python33\python -c "import sys;
> 
> > sys.stdout.buffer.write('\u03b1\n'.encode(sys.stdout.encoding))"
> 
> > α
> 
> 
> 
> I find that I also need to change the font.  With the default font,
> 
> printing '\u2013' gives me:
> 
> 
> 
> –
> 
> 
> 
> The only alternative font option I have in Windows XP is Lucida
> 
> Console, which at least works correctly, although it seems to be
> 
> lacking a lot of glyphs.

--------

The font has nothing to do with this.
You are "simply" encoding your "unicode" wrongly.

>>> '\u2013'
'–'
>>> '\u2013'.encode('utf-8')
b'\xe2\x80\x93'
>>> '\u2013'.encode('utf-8').decode('cp1252')
'–'
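
The same round trip as a small script, as a sketch assuming Python 3. It
also runs the mix-up in reverse, which is one way to check that a garbled
string really is UTF-8 that was read as cp1252. The variable names are only
illustrative.

```python
original = "\u2013"                    # EN DASH
utf8_bytes = original.encode("utf-8")  # the bytes that actually get written

# Reading those bytes with the wrong codec yields three characters, not one.
misread = utf8_bytes.decode("cp1252")

# Undo the mistake: back to the same bytes, then decode them as UTF-8.
repaired = misread.encode("cp1252").decode("utf-8")
```

The reverse step only works while every byte involved has a cp1252 mapping;
a few byte values (0x81, 0x8d, 0x8f, 0x90, 0x9d) are undefined in Python's
cp1252 codec and would raise an error instead.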

jmf



#32993

From: Andrew Berg <bahamutzero8825@gmail.com>
Date: 2012-11-08 21:30 -0600
Message-ID: <mailman.3472.1352431857.27098.python-list@python.org>
In reply to: #32908

On 2012.11.08 08:06, Oscar Benjamin wrote:
> It would be a lot better though if it just worked straight away
> without me needing to set the code page (like the terminal in every
> other OS I use).
The crude equivalent of a .bashrc/.zshrc/whatever shell startup script for
cmd is a string value (REG_SZ) named AutoRun under
HKCU\Software\Microsoft\Command Processor. Set it to whatever command(s)
you want to run whenever the shell starts. Mine has a value of
'@chcp 65001>nul'. I actually run zsh when practical (gotta love Cygwin)
and I have an equivalent command in my .zshrc.
Getting Unicode to work in a Windows console is a hassle, but it /can/
work. CPython does have a bug that makes it annoying at times, though:
http://bugs.python.org/issue1602
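
For anyone who would rather script the registry change than click through
regedit, here is a rough sketch using the standard-library winreg module.
The function name set_cmd_autorun is made up for illustration, the key
path and command are the ones given above, and the call replaces any
AutoRun value that is already there, so check first if you have one.

```python
import sys

KEY_PATH = r"Software\Microsoft\Command Processor"
VALUE_NAME = "AutoRun"
COMMAND = "@chcp 65001>nul"  # switch the code page quietly at shell start


def set_cmd_autorun(command=COMMAND):
    """Store command as cmd's per-user AutoRun string (hypothetical helper).

    Overwrites any existing AutoRun value under HKEY_CURRENT_USER.
    """
    if sys.platform != "win32":
        raise OSError("the cmd AutoRun registry value only exists on Windows")
    import winreg  # Windows-only standard-library module

    # CreateKey opens the key, creating it first if it is missing.
    with winreg.CreateKey(winreg.HKEY_CURRENT_USER, KEY_PATH) as key:
        winreg.SetValueEx(key, VALUE_NAME, 0, winreg.REG_SZ, command)
```

Calling set_cmd_autorun() once on the Windows machine is enough; newly
started cmd sessions should then pick up the command.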
-- 
CPython 3.3.0 | Windows NT 6.1.7601.17835


