Re: the stupid encoding problem to stdout

References (2 earlier) <4df137a7$0$30580$a729d347@news.telepac.pt> <pan.2011.06.09.21.46.15.672000@nowhere.com> <4df16f2e$0$30572$a729d347@news.telepac.pt> <8762oewjao.fsf@benfinney.id.au> <4df2340d$0$30577$a729d347@news.telepac.pt>
Date 2011-06-11 08:07 +1000
Subject Re: the stupid encoding problem to stdout
From Chris Angelico <rosuav@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.111.1307743627.11593.python-list@python.org>



2011/6/11 Sérgio Monteiro Basto <sergiomb@sapo.pt>:
> ok after thinking about this, this problem exists because Python wants
> to be smart with ttys

The *anomaly* (not problem) exists because Python has a way of being
told a target encoding. If two parties agree on an encoding, they can
send characters to each other. I had this discussion at work a while
ago; my boss was talking about being "binary-safe" (which really meant
"8-bit safe"), while I was saying that we should support, verify, and
demand properly-formed UTF-8. The main significance is that agreeing
on an encoding means we can change the encoding any time it's
convenient, without having to document that we've changed the data -
because we haven't. I can take the number "twelve thousand three
hundred and forty-five" and render that as a string of decimal digits
as "12345", or as hexadecimal digits as "3039", but I haven't changed
the number. If you know that I'm giving you a string of decimal
digits, and I give you "12345", you will get the same number at the
far side.
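
The number analogy above can be sketched in a few lines of Python. This is
only an illustration, assuming Python 3; the variable names and the sample
string are mine, not part of the original discussion. It shows that changing
the representation (decimal vs. hexadecimal digits, or UTF-8 vs. UTF-16
bytes) does not change the value, as long as both sides know which
representation is in use.

```python
# The same number, rendered two ways.
n = 12345
as_decimal = str(n)        # decimal digits
as_hex = format(n, "x")    # hexadecimal digits

# The receiver recovers the same number if it knows the base.
assert int(as_decimal, 10) == n
assert int(as_hex, 16) == n

# The same holds for text: different byte encodings, same characters.
text = "Sérgio"            # illustrative sample string
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")

assert utf8 != utf16                        # the bytes differ...
assert utf8.decode("utf-8") == text         # ...but the characters
assert utf16.decode("utf-16-le") == text    # are unchanged
```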

Python has agreed with stdout on an encoding (UTF-8, say) for the
characters it sends. Having made that agreement, Python and stdout can
happily communicate in characters, not bytes. You don't need to
explicitly encode your characters into bytes - and in fact, this would
be a very bad thing to do, because you don't know _what_ encoding
stdout is using. If it's expecting UTF-16, you'll get a whole lot of
rubbish if you send it UTF-8 bytes - but it'll look fine if you send it
Unicode characters and let Python do the encoding.
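
Here is a rough sketch of that point, assuming Python 3 and using an
in-memory stream as a stand-in for stdout (the helper `write_text` and the
sample message are hypothetical, made up for this illustration). The text
stream knows the agreed encoding, so the caller only ever hands it
characters. Encoding by hand with the wrong guess is what produces rubbish.

```python
import io

def write_text(message, encoding):
    """Write characters to a text stream that has agreed on `encoding`
    and return the bytes that actually went down the wire."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, newline="")
    stream.write(message)   # characters in; the stream does the encoding
    stream.flush()
    data = raw.getvalue()
    stream.detach()         # leave the underlying buffer alone
    return data

message = "Olá, Sérgio"     # illustrative sample text

# Same call, two different agreements: the caller never touches bytes.
utf8_bytes = write_text(message, "utf-8")
utf16_bytes = write_text(message, "utf-16-le")

# A receiver using the agreed encoding gets the original characters back.
assert utf8_bytes.decode("utf-8") == message
assert utf16_bytes.decode("utf-16-le") == message

# Encoding by hand as UTF-8 for a receiver that expects UTF-16: rubbish.
garbled = message.encode("utf-8").decode("utf-16-le", errors="replace")
assert garbled != message
```

With the real stdout, `sys.stdout.encoding` reports what has been agreed, and
a plain `print(message)` leaves the encoding step to Python.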

Chris Angelico



Thread

the stupid encoding problem to stdout Sérgio Monteiro Basto <sergiomb@sapo.pt> - 2011-06-09 03:18 +0100
  Re: the stupid encoding problem to stdout Ben Finney <ben+python@benfinney.id.au> - 2011-06-09 12:39 +1000
    Re: the stupid encoding problem to stdout Sérgio Monteiro Basto <sergiomb@sapo.pt> - 2011-06-09 22:16 +0100
      Re: the stupid encoding problem to stdout Ben Finney <ben+python@benfinney.id.au> - 2011-06-10 09:19 +1000
  Re: the stupid encoding problem to stdout Benjamin Kaplan <benjamin.kaplan@case.edu> - 2011-06-08 20:00 -0700
    Re: the stupid encoding problem to stdout Sérgio Monteiro Basto <sergiomb@sapo.pt> - 2011-06-09 22:14 +0100
      Re: the stupid encoding problem to stdout Nobody <nobody@nowhere.com> - 2011-06-09 22:46 +0100
        Re: the stupid encoding problem to stdout Terry Reedy <tjreedy@udel.edu> - 2011-06-09 20:14 -0400
        Re: the stupid encoding problem to stdout Sérgio Monteiro Basto <sergiomb@sapo.pt> - 2011-06-10 02:11 +0100
          Re: the stupid encoding problem to stdout Ben Finney <ben+python@benfinney.id.au> - 2011-06-10 11:45 +1000
            Re: the stupid encoding problem to stdout Sérgio Monteiro Basto <sergiomb@sapo.pt> - 2011-06-10 02:59 +0100
            Re: the stupid encoding problem to stdout Sérgio Monteiro Basto <sergiomb@sapo.pt> - 2011-06-10 16:11 +0100
              Re: the stupid encoding problem to stdout Ian Kelly <ian.g.kelly@gmail.com> - 2011-06-10 10:58 -0600
                Re: the stupid encoding problem to stdout Sérgio Monteiro Basto <sergiomb@sapo.pt> - 2011-06-13 15:15 +0100
                Re: the stupid encoding problem to stdout Chris Angelico <rosuav@gmail.com> - 2011-06-14 00:49 +1000
              Re: the stupid encoding problem to stdout Chris Angelico <rosuav@gmail.com> - 2011-06-11 08:07 +1000
      Re: the stupid encoding problem to stdout "Mark Tolonen" <metolone+gmane@gmail.com> - 2011-06-09 17:57 -0700
        Re: the stupid encoding problem to stdout Sérgio Monteiro Basto <sergiomb@sapo.pt> - 2011-06-10 02:17 +0100
  Re: the stupid encoding problem to stdout Laurent Claessens <moky.math@gmail.com> - 2011-06-10 07:47 +0200
