


Re: Unicode failure

From Oscar Benjamin <oscar.j.benjamin@gmail.com>
Newsgroups comp.lang.python
Subject Re: Unicode failure
Date 2015-12-07 10:48 +0000
Message-ID <mailman.12.1449485333.12405.python-list@python.org>
References <mailman.205.1449268365.14615.python-list@python.org> <ye39y.824840$FM6.212312@fx42.am4>



On Sun, 6 Dec 2015 at 23:11 Quivis <quivis@domain.invalid> wrote:

> On Fri, 04 Dec 2015 13:07:38 -0500, D'Arcy J.M. Cain wrote:
>
> > I thought that going to Python 3.4 would solve my Unicode issues but it
> > seems I still don't understand this stuff.  Here is my script.
> >
> > #! /usr/bin/python3
> > # -*- coding: UTF-8 -*-
> > import sys
> > print(sys.getdefaultencoding())
> > print(u"\N{TRADE MARK SIGN}")
> >
> > And here is my output.
> >
> > utf-8
> > Traceback (most recent call last):
> >   File "./g", line 5, in <module>
> >     print(u"\N{TRADE MARK SIGN}")
> > UnicodeEncodeError: 'ascii' codec can't encode character '\u2122' in
> > position 0: ordinal not in range(128)
>
> Hmmmm, interesting:
>
> Python 2.7.3 (default, Jun 22 2015, 19:43:34)
> [GCC 4.6.3] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import sys
> >>> print sys.getdefaultencoding()
> ascii
> >>> print u'\N{TRADE MARK SIGN}'
> ™
>
>
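sys.getdefaultencoding() is not the relevant setting here. It reports the
default used for implicit conversions between byte strings and unicode
(always utf-8 on Python 3, normally ascii on Python 2) and says nothing
about the terminal. What matters for print is the encoding associated with
stdout, which is sys.stdout.encoding and is normally taken from the locale.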

$ python2.7 -c 'import sys; print(sys.stdout.encoding); print(u"\u2122")'
UTF-8
™

$ LANG=C python2.7 -c 'import sys; print(sys.stdout.encoding); print(u"\u2122")'
ANSI_X3.4-1968
Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2122' in
position 0: ordinal not in range(128)
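
To make the distinction concrete, here is a minimal Python 3 sketch. It is
not from the original script: the helper names describe_encodings and
write_with_encoding are made up for illustration, and the in-memory stream
only stands in for a terminal whose locale selects the given encoding.

```python
import io
import locale
import sys


def describe_encodings():
    """Collect the three encodings that are easy to confuse."""
    return {
        # Default for str.encode()/bytes.decode(); always utf-8 on Python 3.
        "sys.getdefaultencoding()": sys.getdefaultencoding(),
        # Locale-derived default, e.g. for open() without encoding=.
        "locale.getpreferredencoding(False)": locale.getpreferredencoding(False),
        # What print() actually uses when writing to stdout.
        "sys.stdout.encoding": getattr(sys.stdout, "encoding", None),
    }


def write_with_encoding(text, encoding):
    """Print text to an in-memory text stream that uses the given encoding.

    Hypothetical stand-in for sys.stdout under a particular locale.
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, newline="\n")
    print(text, file=stream)
    stream.flush()
    return raw.getvalue()


if __name__ == "__main__":
    for name, value in describe_encodings().items():
        print(name, "=", value)

    # A UTF-8 stream accepts the trade mark sign.
    print(write_with_encoding("\u2122", "utf-8"))

    # An ASCII stream fails in the same way as the traceback above.
    try:
        write_with_encoding("\u2122", "ascii")
    except UnicodeEncodeError as exc:
        print("ascii stream failed:", exc)
```

If the locale itself cannot be changed, setting PYTHONIOENCODING=utf-8 in
the environment is one way to override the encoding Python picks for stdout.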

--
Oscar



