| Started by | "D'Arcy J.M. Cain" <darcy@VybeNetworks.com> |
|---|---|
| First post | 2015-12-04 13:07 -0500 |
| Last post | 2015-12-07 10:48 +0000 |
| Articles | 7 — 6 participants |
Unicode failure "D'Arcy J.M. Cain" <darcy@VybeNetworks.com> - 2015-12-04 13:07 -0500
Re: Unicode failure Dave Farrance <df@see.replyto.invalid> - 2015-12-06 09:06 +0000
Re: Unicode failure Dave Farrance <df@see.replyto.invalid> - 2015-12-06 09:16 +0000
Re: Unicode failure Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-12-06 09:34 +0000
Re: Unicode failure Random832 <random832@fastmail.com> - 2015-12-06 15:36 -0500
Re: Unicode failure Quivis <quivis@domain.invalid> - 2015-12-06 23:09 +0000
Re: Unicode failure Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2015-12-07 10:48 +0000
| From | "D'Arcy J.M. Cain" <darcy@VybeNetworks.com> |
|---|---|
| Date | 2015-12-04 13:07 -0500 |
| Subject | Unicode failure |
| Message-ID | <mailman.205.1449268365.14615.python-list@python.org> |
I thought that going to Python 3.4 would solve my Unicode issues but it
seems I still don't understand this stuff. Here is my script.
#! /usr/bin/python3
# -*- coding: UTF-8 -*-
import sys
print(sys.getdefaultencoding())
print(u"\N{TRADE MARK SIGN}")
And here is my output.
utf-8
Traceback (most recent call last):
File "./g", line 5, in <module>
print(u"\N{TRADE MARK SIGN}")
UnicodeEncodeError: 'ascii' codec can't encode character '\u2122' in
position 0: ordinal not in range(128)
What am I missing?
TIA.
--
D'Arcy J.M. Cain
Vybe Networks Inc.
http://www.VybeNetworks.com/
IM:darcy@Vex.Net VoIP: sip:darcy@VybeNetworks.com
| From | Dave Farrance <df@see.replyto.invalid> |
|---|---|
| Date | 2015-12-06 09:06 +0000 |
| Message-ID | <69u76b9spvoql5eeh4h2686pmhigfvmivv@4ax.com> |
| In reply to | #100013 |
"D'Arcy J.M. Cain" <darcy@VybeNetworks.com> wrote:
>...
>utf-8
>Traceback (most recent call last):
> File "./g", line 5, in <module>
> print(u"\N{TRADE MARK SIGN}")
>UnicodeEncodeError: 'ascii' codec can't encode character '\u2122' in
>position 0: ordinal not in range(128)
I *presume* that you're using Linux since you've got a hashbang, so...
You can *check* that it's the local environment that's the issue with
the *test* of setting the PYTHONIOENCODING environment variable. But if
that works, it tells you that you must then fix the underlying
environment's character encoding to get a permanent fix.
$ PYTHONIOENCODING=UTF-8 python3 -c 'print(u"\u00A9")'
©
$ PYTHONIOENCODING=ascii python3 -c 'print(u"\u00A9")'
Traceback (most recent call last):
File "<string>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character '\xa9' in
position 0: ordinal not in range(128)
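For anyone who would rather run the same check from inside Python, here is a minimal sketch that reports the settings discussed above. The function name describe_io_encoding and the exact set of keys are made up for illustration; the calls it uses (sys.stdout.encoding, locale.getpreferredencoding, os.environ) are standard library.

```python
import locale
import os
import sys


def describe_io_encoding():
    """Collect the settings that usually decide how print() encodes text.

    Hypothetical helper for illustration only.
    """
    return {
        # Encoding actually used by print(); may be None for odd streams.
        "stdout": getattr(sys.stdout, "encoding", None),
        # Encoding derived from the locale (LANG / LC_ALL / LC_CTYPE).
        "preferred": locale.getpreferredencoding(False),
        # Always 'utf-8' on Python 3; unrelated to the terminal.
        "default": sys.getdefaultencoding(),
        # Environment overrides; None means the variable is unset.
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "LANG": os.environ.get("LANG"),
    }


info = describe_io_encoding()
for key in sorted(info):
    print(key, "=", info[key])
```

If "stdout" comes back as ANSI_X3.4-1968 or ascii while "default" says utf-8, that matches the symptom in the original post.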
| From | Dave Farrance <df@see.replyto.invalid> |
|---|---|
| Date | 2015-12-06 09:16 +0000 |
| Message-ID | <r1v76b92hj164djoc8eqfvqefqhsh6egnh@4ax.com> |
| In reply to | #100052 |
I was taking it for granted that you knew how to set environment
variables, but just in case you don't: In the shell (are you using
BASH?), put this:

export PYTHONIOENCODING=UTF-8

...then run your script. Remember that this is *not* a permanent fix.
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2015-12-06 09:34 +0000 |
| Message-ID | <mailman.1.1449394477.2247.python-list@python.org> |
| In reply to | #100052 |
On 06/12/2015 09:06, Dave Farrance wrote:
> "D'Arcy J.M. Cain" <darcy@VybeNetworks.com> wrote:
>
>> ...
>> utf-8
>> Traceback (most recent call last):
>> File "./g", line 5, in <module>
>> print(u"\N{TRADE MARK SIGN}")
>> UnicodeEncodeError: 'ascii' codec can't encode character '\u2122' in
>> position 0: ordinal not in range(128)
>
> I *presume* that you're using Linux since you've got a hashbang, so...
>
Not really a good presumption as the hashbang has been used in Python
scripts on Windows ever since "PEP 397 -- Python launcher for Windows",
see https://www.python.org/dev/peps/pep-0397/
--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.
Mark Lawrence
| From | Random832 <random832@fastmail.com> |
|---|---|
| Date | 2015-12-06 15:36 -0500 |
| Message-ID | <mailman.4.1449434224.12405.python-list@python.org> |
| In reply to | #100052 |
Mark Lawrence <breamoreboy@yahoo.co.uk> writes:
> On 06/12/2015 09:06, Dave Farrance wrote:
>> "D'Arcy J.M. Cain" <darcy@VybeNetworks.com> wrote:
>>> utf-8
>>> Traceback (most recent call last):
>>> File "./g", line 5, in <module>
>>> print(u"\N{TRADE MARK SIGN}")
>>> UnicodeEncodeError: 'ascii' codec can't encode character '\u2122' in
>>> position 0: ordinal not in range(128)
>>
>> I *presume* that you're using Linux since you've got a hashbang, so...
>
> Not really a good presumption as the hashbang has been used in Python
> scripts on Windows ever since "PEP 397 -- Python launcher for
> Windows", see https://www.python.org/dev/peps/pep-0397/
However, on Windows it would typically be codepage 437, 850, or the
like, and the error message would call it a 'charmap' codec. The 'ascii'
codec error is associated with being in a UNIX environment with an unset
(or "C" or "POSIX") locale.
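The difference in the error message can be reproduced without touching the terminal at all, by encoding the character directly. This is only a sketch: codec_named_in_error is a hypothetical helper, and cp437 and cp850 are used here as stand-ins for a typical Windows console codepage.

```python
def codec_named_in_error(text, encoding):
    """Return the codec name reported by UnicodeEncodeError, or None on success.

    Hypothetical helper for illustration only.
    """
    try:
        text.encode(encoding)
    except UnicodeEncodeError as exc:
        # exc.encoding is the name that appears in "'...' codec can't encode".
        return exc.encoding
    return None


# Only ASCII is printed here, so this loop is safe even in a C/POSIX locale.
for name in ("ascii", "cp437", "cp850", "utf-8"):
    print(name, "->", codec_named_in_error("\u2122", name))
```

The codepage encodings are implemented as character maps, which is why the error names 'charmap' rather than the codepage itself.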
| From | Quivis <quivis@domain.invalid> |
|---|---|
| Date | 2015-12-06 23:09 +0000 |
| Message-ID | <ye39y.824840$FM6.212312@fx42.am4> |
| In reply to | #100013 |
On Fri, 04 Dec 2015 13:07:38 -0500, D'Arcy J.M. Cain wrote:
> I thought that going to Python 3.4 would solve my Unicode issues but it
> seems I still don't understand this stuff. Here is my script.
>
> #! /usr/bin/python3
> # -*- coding: UTF-8 -*-
> import sys
> print(sys.getdefaultencoding())
> print(u"\N{TRADE MARK SIGN}")
>
> And here is my output.
>
> utf-8
> Traceback (most recent call last):
> File "./g", line 5, in <module>
> print(u"\N{TRADE MARK SIGN}")
> UnicodeEncodeError: 'ascii' codec can't encode character '\u2122' in
> position 0: ordinal not in range(128)
Hmmmm, interesting:
Python 2.7.3 (default, Jun 22 2015, 19:43:34)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> print sys.getdefaultencoding()
ascii
>>> print u'\N{TRADE MARK SIGN}'
™
--
_____ __ __ __ __ __ __ __
(( )) || || || \\ // || ((
\\_/X| \\_// || \V/ || \_))
Omnia paratus *~*~*~*~*~*~*
| From | Oscar Benjamin <oscar.j.benjamin@gmail.com> |
|---|---|
| Date | 2015-12-07 10:48 +0000 |
| Message-ID | <mailman.12.1449485333.12405.python-list@python.org> |
| In reply to | #100073 |
On Sun, 6 Dec 2015 at 23:11 Quivis <quivis@domain.invalid> wrote:
> On Fri, 04 Dec 2015 13:07:38 -0500, D'Arcy J.M. Cain wrote:
>
> > I thought that going to Python 3.4 would solve my Unicode issues but it
> > seems I still don't understand this stuff. Here is my script.
> >
> > #! /usr/bin/python3
> > # -*- coding: UTF-8 -*-
> > import sys
> > print(sys.getdefaultencoding())
> > print(u"\N{TRADE MARK SIGN}")
> >
> > And here is my output.
> >
> > utf-8
> > Traceback (most recent call last):
> > File "./g", line 5, in <module>
> > print(u"\N{TRADE MARK SIGN}")
> > UnicodeEncodeError: 'ascii' codec can't encode character '\u2122' in
> > position 0: ordinal not in range(128)
>
> Hmmmm, interesting:
>
> Python 2.7.3 (default, Jun 22 2015, 19:43:34)
> [GCC 4.6.3] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import sys
> >>> print sys.getdefaultencoding()
> ascii
> >>> print u'\N{TRADE MARK SIGN}'
> ™
>
>
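sys.getdefaultencoding() returns the default encoding used for implicit
conversions between text and bytes (for example, str.encode() with no
argument). It has nothing to do with the terminal, and it is not the
default used by open() either; that comes from
locale.getpreferredencoding(). What matters here is the encoding
associated with stdout, which is sys.stdout.encoding.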
$ python2.7 -c 'import sys; print(sys.stdout.encoding); print(u"\u2122")'
UTF-8
™
$ LANG=C python2.7 -c 'import sys; print(sys.stdout.encoding); print(u"\u2122")'
ANSI_X3.4-1968
Traceback (most recent call last):
File "<string>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2122' in
position 0: ordinal not in range(128)
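The same effect can be shown in Python 3 without changing the locale, by printing to a text stream whose encoding is chosen explicitly. This is a sketch, not what the interpreter does internally: print_with_encoding is a hypothetical helper, and the in-memory io.BytesIO stands in for the real terminal.

```python
import io


def print_with_encoding(text, encoding):
    """Print text to an in-memory stream that uses the given encoding.

    Returns the bytes written, or the UnicodeEncodeError that was raised.
    Hypothetical helper for illustration only.
    """
    raw = io.BytesIO()
    # newline="\n" keeps the output identical on every platform.
    stream = io.TextIOWrapper(raw, encoding=encoding, newline="\n")
    try:
        print(text, file=stream)
        stream.flush()
        return raw.getvalue()
    except UnicodeEncodeError as exc:
        return exc
    finally:
        # Separate the wrapper from the buffer so cleanup cannot raise.
        stream.detach()


# repr() keeps the output ASCII-only, whatever the real stdout encoding is.
print(repr(print_with_encoding("\u2122", "utf-8")))
print(repr(print_with_encoding("\u2122", "ascii")))
```

The first call yields the UTF-8 bytes for the trade mark sign; the second yields the same kind of 'ascii' codec error as in the traceback above, because only the stream's encoding differs.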
--
Oscar