Re: trying to force stdout to utf-8 with errors='ignore' or 'replace'

From Adam Funk <a24061@ducksburg.com>
Newsgroups comp.lang.python
Subject Re: trying to force stdout to utf-8 with errors='ignore' or 'replace'
Date 2015-12-11 16:54 +0000
Organization $CABAL
Message-ID <3nbrjcxcaf.ln2@news.ducksburg.com>
References <k6nqjcx307.ln2@news.ducksburg.com> <mailman.131.1449833844.12405.python-list@python.org>

On 2015-12-11, Peter Otten wrote:

> Adam Funk wrote:

>> but with either or both of those, I get the dreaded
>> "UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position
>> 562: ordinal not in range(128)".  How can I force the output to be in
>> UTF-8 & silently suppress invalid characters?
>
> (I'm assuming you are using Python 2 and that main_body is a unicode 
> instance)

The short answer turned out to be 'switch to Python 3', which I think
is what I'll do from now on unless I absolutely need a library that
isn't available there.

(AFAICT, the email parser in 2.7 returns the body as a bytestring &
doesn't actually look at the charset in the Content-Type header, &
decoding the body myself with that charset just made it barf in
different places.)
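
For anyone finding this later, here is a minimal Python 3 sketch of the
approach, not the actual script from this thread. It assumes the message
arrives as raw bytes and that only the first text/plain part matters;
the helper names body_text and utf8_writer are made up for illustration.
The body is decoded with the charset declared in Content-Type (falling
back to UTF-8), and stdout is wrapped so that anything unencodable is
replaced rather than raising.

```python
import io
import sys
from email import message_from_bytes


def body_text(raw_message: bytes) -> str:
    """Return the first text/plain part as str, replacing undecodable bytes.

    Hypothetical helper: the name and the UTF-8 fallback are assumptions.
    """
    msg = message_from_bytes(raw_message)
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        # decode=True undoes base64/quoted-printable and returns bytes.
        payload = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "utf-8"
        try:
            return payload.decode(charset, errors="replace")
        except LookupError:
            # The declared charset is unknown to Python; fall back to UTF-8.
            return payload.decode("utf-8", errors="replace")
    return ""


def utf8_writer(binary_stream):
    """Wrap a binary stream so text is written as UTF-8 with errors='replace'."""
    return io.TextIOWrapper(binary_stream, encoding="utf-8", errors="replace")


# Typical use (left commented out so the sketch has no side effects):
#   out = utf8_writer(sys.stdout.buffer)
#   out.write(body_text(sys.stdin.buffer.read()))
#   out.flush()
# On Python 3.7+, sys.stdout.reconfigure(encoding="utf-8", errors="replace")
# does the same job without a separate wrapper.
```

Passing errors="ignore" instead of "replace" drops the offending
characters silently rather than substituting a replacement character.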


-- 
Science is what we understand well enough to explain to a computer.  
Art is everything else we do.                      --- Donald Knuth
