comp.lang.python, article #100087 (unrolled thread)

writing an email.message.Message in UTF-8

Started by	Adam Funk <a24061@ducksburg.com>
First post	2015-12-07 14:57 +0000
Last post	2015-12-08 09:35 +0000
Articles	6 — 3 participants

  writing an email.message.Message in UTF-8 Adam Funk <a24061@ducksburg.com> - 2015-12-07 14:57 +0000
    Re: writing an email.message.Message in UTF-8 Adam Funk <a24061@ducksburg.com> - 2015-12-07 15:21 +0000
    Re: writing an email.message.Message in UTF-8 Terry Reedy <tjreedy@udel.edu> - 2015-12-07 12:40 -0500
      Re: writing an email.message.Message in UTF-8 Adam Funk <a24061@ducksburg.com> - 2015-12-08 09:35 +0000
    Re: writing an email.message.Message in UTF-8 dieter <dieter@handshake.de> - 2015-12-08 08:50 +0100
      Re: writing an email.message.Message in UTF-8 Adam Funk <a24061@ducksburg.com> - 2015-12-08 09:35 +0000

#100087 — writing an email.message.Message in UTF-8

From	Adam Funk <a24061@ducksburg.com>
Date	2015-12-07 14:57 +0000
Subject	writing an email.message.Message in UTF-8
Message-ID	<nbjgjcxqsh.ln2@news.ducksburg.com>

I'm trying to write an instance of email.message.Message, whose body
contains unicode characters, to a UTF-8 file.  (Python 2.7.3 & 2.7.10
again.)

    reply = email.message.Message()
    reply.set_charset('utf-8')
    ... # set various headers
    reply.set_payload('\n'.join(body_lines) + '\n')
    ...
    outfile = codecs.open(outfilename, 'w', encoding='utf-8', errors='ignore')
    outfile.write(reply.as_string())
    outfile.close()

Then reply.as_string() barfs a UnicodeDecodeError.  I look in the
documentation, which says the generator is better.  So I replace the
outfile.write(...) line with the following:

    g = email.generator.Generator(outfile, mangle_from_=False)
    g.flatten(reply)

which still barfs a UnicodeDecodeError.  Looking closer at the first
error, I see that the exception was in g.flatten(...) already & thrown
up to reply.as_string().  How can I force the thing to do UTF-8
output?

Thanks.


-- 
      $2.95!
 PLATE O' SHRIMP
Luncheon Special

#100089

From	Adam Funk <a24061@ducksburg.com>
Date	2015-12-07 15:21 +0000
Message-ID	<4pkgjcxr0j.ln2@news.ducksburg.com>
In reply to	#100087

On 2015-12-07, Adam Funk wrote:

> I'm trying to write an instance of email.message.Message, whose body
> contains unicode characters, to a UTF-8 file.  (Python 2.7.3 & 2.7.10
> again.)
>
>     reply = email.message.Message()
>     reply.set_charset('utf-8')
>     ... # set various headers
>     reply.set_payload('\n'.join(body_lines) + '\n')

I've also tried changing that to
     reply.set_payload('\n'.join(body_lines) + '\n', 'utf-8')
but I get the same error on output.

>     ...
>     outfile = codecs.open(outfilename, 'w', encoding='utf-8', errors='ignore')
>     outfile.write(reply.as_string())
>     outfile.close()
>
> Then reply.as_string() barfs a UnicodeDecodeError.  I look in the
> documentation, which says the generator is better.  So I replace the
> outfile.write(...) line with the following:
>
>     g = email.generator.Generator(outfile, mangle_from_=False)
>     g.flatten(reply)
>
> which still barfs a UnicodeDecodeError.  Looking closer at the first
> error, I see that the exception was in g.flatten(...) already & thrown
> up to reply.as_string().  How can I force the thing to do UTF-8
> output?
>
> Thanks.
>
>


-- 
Cats don't have friends.  They have co-conspirators.
         http://www.gocomics.com/getfuzzy/2015/05/31

#100093

From	Terry Reedy <tjreedy@udel.edu>
Date	2015-12-07 12:40 -0500
Message-ID	<mailman.22.1449510105.12405.python-list@python.org>
In reply to	#100087

On 12/7/2015 9:57 AM, Adam Funk wrote:
> I'm trying to write an instance of email.message.Message, whose body
> contains unicode characters, to a UTF-8 file.  (Python 2.7.3 & 2.7.10
> again.)

The email package was rewritten for, I believe, 3.3.  I believe it 
should handle unicode email encoded as utf-8 more easily.

-- 
Terry Jan Reedy
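
[Editor's sketch, not part of Terry's post.] A minimal Python 3 illustration of the approach Terry points to, assuming Python 3.2 or later, where email.generator.BytesGenerator is available. The helper names build_reply and flatten_to_bytes, the header value, and the sample body lines are hypothetical; only the email package calls come from the standard library. The message is flattened to bytes, so the output file (here an in-memory buffer) must be opened in binary mode rather than through codecs.open.

```python
import email
import io
from email.generator import BytesGenerator
from email.message import Message


def build_reply(body_lines):
    """Build a text/plain message whose body is declared as UTF-8."""
    reply = Message()
    reply["Subject"] = "Re: test"  # hypothetical header, ASCII only
    # Passing the charset here sets Content-Type and a transfer encoding.
    reply.set_payload("\n".join(body_lines) + "\n", charset="utf-8")
    return reply


def flatten_to_bytes(msg):
    """Serialize the message to bytes; a file opened with 'wb' works the same way."""
    buf = io.BytesIO()
    BytesGenerator(buf, mangle_from_=False).flatten(msg)
    return buf.getvalue()


raw = flatten_to_bytes(build_reply(["naïve café", "plate o' shrimp: £2.95"]))

# Parse the bytes back to check that the body survives the round trip.
parsed = email.message_from_bytes(raw)
text = parsed.get_payload(decode=True).decode(parsed.get_content_charset())
```

The serialized form contains only ASCII, because the UTF-8 body is stored with a content transfer encoding; decoding on the reading side recovers the original text.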

#100144

From	Adam Funk <a24061@ducksburg.com>
Date	2015-12-08 09:35 +0000
Message-ID	<sskijcxt0v.ln2@news.ducksburg.com>
In reply to	#100093

On 2015-12-07, Terry Reedy wrote:

> On 12/7/2015 9:57 AM, Adam Funk wrote:
>> I'm trying to write an instance of email.message.Message, whose body
>> contains unicode characters, to a UTF-8 file.  (Python 2.7.3 & 2.7.10
>> again.)
>
> The email package was rewritten for, I believe, 3.3.  I believe it 
> should handle unicode email encoded as utf-8 more easily.

Actually it works in Python 3.2.3, & fortunately my program doesn't
depend on anything that isn't available for python 3 yet.  Thanks!


-- 
Most Americans are too civilized to hang skulls from baskets, having
been headhunters, of course, only as recently as Vietnam.
                                                  --- Kinky Friedman

#100139

From	dieter <dieter@handshake.de>
Date	2015-12-08 08:50 +0100
Message-ID	<mailman.52.1449561032.12405.python-list@python.org>
In reply to	#100087

Adam Funk <a24061@ducksburg.com> writes:

> I'm trying to write an instance of email.message.Message, whose body
> contains unicode characters, to a UTF-8 file.  (Python 2.7.3 & 2.7.10
> again.)
>
>     reply = email.message.Message()
>     reply.set_charset('utf-8')
>     ... # set various headers
>     reply.set_payload('\n'.join(body_lines) + '\n')
>     ...
>     outfile = codecs.open(outfilename, 'w', encoding='utf-8', errors='ignore')
>     outfile.write(reply.as_string())
>     outfile.close()
>
> Then reply.as_string() barfs a UnicodeDecodeError.  I look in the
> documentation, which says the generator is better.  So I replace the
> outfile.write(...) line with the following:
>
>     g = email.generator.Generator(outfile, mangle_from_=False)
>     g.flatten(reply)
>
> which still barfs a UnicodeDecodeError.  Looking closer at the first
> error, I see that the exception was in g.flatten(...) already & thrown
> up to reply.as_string().  How can I force the thing to do UTF-8
> output?

You could try replacing "reply.set_payload('\n'.join(body_lines) + '\n')"
by "reply.set_payload(('\n'.join(body_lines) + '\n').encode('utf-8'))",
i.e. you would not pass in a unicode payload but a UTF-8-encoded
"str" payload.
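
[Editor's sketch, not part of dieter's post.] The thread never establishes where the UnicodeDecodeError came from, so the following is only an illustration of the text/bytes distinction behind this suggestion, written for Python 3. The helper name to_utf8_bytes and the sample lines are hypothetical. It encodes the body explicitly and then shows that decoding those same bytes as ASCII raises UnicodeDecodeError, which is the conversion Python 2 attempted implicitly whenever a non-ASCII byte "str" was combined with a unicode object or written to a codecs.open file.

```python
def to_utf8_bytes(body_lines):
    """Join text lines and encode the result explicitly as UTF-8 bytes."""
    return ("\n".join(body_lines) + "\n").encode("utf-8")


payload = to_utf8_bytes(["café", "£2.95"])

# Decoding non-ASCII bytes with the ASCII codec fails.
try:
    payload.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# Decoding with the codec that produced the bytes succeeds.
roundtrip = payload.decode("utf-8")
```

If this is what happened in the original program, an encoded payload alone would not help as long as the result is still written through a text-mode codecs stream, which would be consistent with the follow-up report that the error persisted.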

#100143

From	Adam Funk <a24061@ducksburg.com>
Date	2015-12-08 09:35 +0000
Message-ID	<arkijcxt0v.ln2@news.ducksburg.com>
In reply to	#100139

On 2015-12-08, dieter wrote:

> Adam Funk <a24061@ducksburg.com> writes:
>
>> I'm trying to write an instance of email.message.Message, whose body
>> contains unicode characters, to a UTF-8 file.  (Python 2.7.3 & 2.7.10
>> again.)
>>
>>     reply = email.message.Message()
>>     reply.set_charset('utf-8')
>>     ... # set various headers
>>     reply.set_payload('\n'.join(body_lines) + '\n')
>>     ...
>>     outfile = codecs.open(outfilename, 'w', encoding='utf-8', errors='ignore')
>>     outfile.write(reply.as_string())
>>     outfile.close()
>>
>> Then reply.as_string() barfs a UnicodeDecodeError.  I look in the
>> documentation, which says the generator is better.  So I replace the
>> outfile.write(...) line with the following:
>>
>>     g = email.generator.Generator(outfile, mangle_from_=False)
>>     g.flatten(reply)
>>
>> which still barfs a UnicodeDecodeError.  Looking closer at the first
>> error, I see that the exception was in g.flatten(...) already & thrown
>> up to reply.as_string().  How can I force the thing to do UTF-8
>> output?
>
> You could try replacing "reply.set_payload('\n'.join(body_lines) + '\n')"
> by "reply.set_payload(('\n'.join(body_lines) + '\n').encode('utf-8'))",
> i.e. you would not pass in a unicode payload but a UTF-8-encoded
> "str" payload.

That didn't work (I got the same error) but switching to python 3.2
did.  Thanks, though.


-- 
A mathematical formula should never be "owned" by anybody! Mathematics
belongs to God.                                       --- Donald Knuth
