| References | <4df02e04$0$1779$a729d347@news.telepac.pt> |
|---|---|
| Date | 2011-06-08 20:00 -0700 |
| Subject | Re: the stupid encoding problem to stdout |
| From | Benjamin Kaplan <benjamin.kaplan@case.edu> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.40.1307588443.11593.python-list@python.org> |
2011/6/8 Sérgio Monteiro Basto <sergiomb@sapo.pt>:
> hi,
> cat test.py
> #!/usr/bin/env python
> #-*- coding: utf-8 -*-
> u = u'moçambique'
> print u.encode("utf-8")
> print u
>
> chmod +x test.py
> ./test.py
> moçambique
> moçambique
>
> ./test.py > output.txt
> Traceback (most recent call last):
>   File "./test.py", line 5, in <module>
>     print u
> UnicodeEncodeError: 'ascii' codec can't encode character
> u'\xe7' in position 2: ordinal not in range(128)
>
> in python 2.7
> how do I tell python to send the same thing to stdout and
> to the file output.txt?
>
> It doesn't seem logical that the behaviour changes when
> things are sent to a file.
>
> Thanks,
> Sérgio M. B.
That's not a terminal vs file thing. It's a "file that declares its
encoding" vs a "file that doesn't declare its encoding" thing. Your
terminal declares that it is UTF-8. So when you print a Unicode string
to your terminal, Python knows that it's supposed to turn it into
UTF-8. When you redirect the output to a file, that file doesn't
declare an encoding (sys.stdout.encoding is None). So rather than
guess which encoding you want, Python falls back to the lowest common
denominator: ASCII. If you want something to be in a particular
encoding, you have to encode it yourself.
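
To make that concrete, here is a small sketch of the decision as I
understand it. It is only an imitation, not the interpreter's actual
code: encode_like_print and FakeStream are names made up for this
example, and FakeStream merely stands in for sys.stdout with or
without a declared encoding.

```python
# -*- coding: utf-8 -*-


def encode_like_print(text, stream):
    """Use the stream's declared encoding, or fall back to ASCII
    when the stream declares none (illustrative helper)."""
    encoding = getattr(stream, "encoding", None) or "ascii"
    return text.encode(encoding)


class FakeStream(object):
    """Hypothetical stand-in for sys.stdout; it only carries an encoding."""

    def __init__(self, encoding):
        self.encoding = encoding


terminal = FakeStream("UTF-8")   # a terminal that declares UTF-8
redirected = FakeStream(None)    # a redirected file: nothing declared

word = u"mo\xe7ambique"
as_terminal = encode_like_print(word, terminal)   # encodes fine

try:
    encode_like_print(word, redirected)
    failed = False
except UnicodeEncodeError:
    # Same kind of error as in the traceback quoted above.
    failed = True
```

On a real Python 2.7 you can see the two cases directly by printing
sys.stdout.encoding to sys.stderr, once on the terminal and once with
stdout redirected to a file.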
You have a couple of choices on how to make it work:
1) Play dumb and always encode as UTF-8. This would look really weird
if someone tried running your program in a terminal with a CP-437
encoding (like cmd.exe on at least the US version of Windows), but it
would never crash.
2) Check sys.stdout.encoding. If it's None or ASCII, then encode your
unicode string with the unicode-escape codec, which substitutes an
escape sequence for every non-ASCII character.
3) Check to see if sys.stdout.isatty() and have different behavior for
terminals vs files. If you're on a terminal that doesn't declare its
encoding, encoding it as UTF-8 probably won't help. If you're writing
to a file, that might be what you want to do.
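
A rough sketch of how options 2 and 3 could be combined (option 1
amounts to calling text.encode("utf-8") unconditionally). The names
choose_bytes, emit and ASCII_NAMES are made up for illustration. The
sketch uses the unicode-escape codec, since string-escape only applies
to byte strings, and the "replace" error handler is my own assumption
to keep it from crashing. The getattr(stream, "buffer", stream) line
is also an assumption, so that the same code runs on Python 3, where
bytes have to go to sys.stdout.buffer.

```python
# -*- coding: utf-8 -*-
import sys

# Names under which a stream may report plain ASCII (not exhaustive).
ASCII_NAMES = ("ascii", "us-ascii", "ansi_x3.4-1968")


def choose_bytes(text, stream):
    """Return the bytes to write for `text`, based on what `stream` declares."""
    declared = getattr(stream, "encoding", None)
    if declared and declared.lower() not in ASCII_NAMES:
        # The stream declares a real encoding: honour it. "replace" is an
        # assumption so unencodable characters become "?" instead of raising.
        return text.encode(declared, "replace")
    if stream.isatty():
        # Option 2: a terminal with no useful encoding, so escape non-ASCII.
        # Note that this codec also escapes backslashes and newlines.
        return text.encode("unicode-escape")
    # Option 3: a file or pipe with no declared encoding, so assume UTF-8.
    return text.encode("utf-8")


def emit(text, stream=None):
    """Write `text` plus a newline to `stream` (default: sys.stdout) as bytes."""
    stream = sys.stdout if stream is None else stream
    raw = getattr(stream, "buffer", stream)  # Python 3 keeps bytes on .buffer
    raw.write(choose_bytes(text, stream) + b"\n")
    raw.flush()
```

With this, replacing "print u" by emit(u) in test.py should give the
same UTF-8 bytes on a UTF-8 terminal and in output.txt.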