Re: UnicodeEncodeError in compile

From Terry Reedy <tjreedy@udel.edu>
Subject Re: UnicodeEncodeError in compile
Date 2012-01-10 19:56 -0500
References <9043309.329.1326169476466.JavaMail.geo-discussion-forums@yqhi24> <mailman.4584.1326182952.27778.python-list@python.org> <mailman.4585.1326192839.27778.python-list@python.org> <e8448df4-76f6-4444-a785-53a1103d3f39@a11g2000vbz.googlegroups.com> <3c9fd9e7-6a0e-40cc-a048-1a82e477c013@p4g2000vbt.googlegroups.com>
Newsgroups comp.lang.python
Message-ID <mailman.4618.1326243425.27778.python-list@python.org>

On 1/10/2012 8:43 AM, jmfauth wrote:
> D:\>c:\python32\python.exe
> Python 3.2.2 (default, Sep  4 2011, 09:51:08) [MSC v.1500 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
>>>> '\u5de5'.encode('utf-8')
> b'\xe5\xb7\xa5'
>>>> '\u5de5'.encode('mbcs')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> UnicodeEncodeError: 'mbcs' codec can't encode characters in position 0--1: invalid character

> D:\>c:\python27\python.exe
> Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
>>>> u'\u5de5'.encode('utf-8')
> '\xe5\xb7\xa5'
>>>> u'\u5de5'.encode('mbcs')
> '?'

mbcs encodes according to the current code page. Only a code page that
contains the character, such as a Chinese one, can encode it. So the
UnicodeEncodeError in 3.2 is correct, and 2.7 has a bug: it behaves as if
errors='replace' were in effect when it is supposedly doing
errors='strict'. The Py3 fix was done in
http://bugs.python.org/issue850997
2.7 was intentionally left alone because of back-compatibility
considerations. (None of this addresses the OP's question.)
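
For anyone who wants to see the strict/replace difference without a
Windows box: mbcs only exists on Windows, so the sketch below substitutes
fixed codecs for "the current code page". That substitution is my
assumption, not something from the session quoted above: cp1252 stands in
for a Western code page and gbk for a Chinese one. The helper name
try_encode is made up for illustration. Python 3 only.

```python
# Sketch only: cp1252 and gbk are stand-ins (an assumption) for whatever
# code page 'mbcs' would pick up on a given Windows machine.
CHAR = '\u5de5'


def try_encode(text, codec, errors='strict'):
    """Return the encoded bytes, or None if the codec cannot encode text."""
    try:
        return text.encode(codec, errors)
    except UnicodeEncodeError:
        return None


# Western code page, strict: the character is not in cp1252, so encoding fails.
strict_western = try_encode(CHAR, 'cp1252')

# Western code page, replace: the unencodable character is swapped for '?'.
replaced_western = try_encode(CHAR, 'cp1252', 'replace')

# Chinese code page, strict: the character is in gbk, so encoding succeeds.
chinese = try_encode(CHAR, 'gbk')

print('cp1252 strict :', strict_western)
print('cp1252 replace:', replaced_western)
print('gbk strict    :', chinese)
```

The second case is what 2.7's mbcs codec effectively did even when strict
was requested, which is where the '?' in the 2.7 session comes from.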

-- 
Terry Jan Reedy
