Re: Printing UTF-8 mail to terminal

From "Loris Bennett" <loris.bennett@fu-berlin.de>
Newsgroups comp.lang.python
Subject Re: Printing UTF-8 mail to terminal
Date 2024-11-01 10:10 +0100
Organization FUB-IT, Freie Universität Berlin
Message-ID <875xp7nwus.fsf@zedat.fu-berlin.de>
References <878qu49tii.fsf@zedat.fu-berlin.de> <ZyPtsLSme7IJ-q4j@cskk.homeip.net> <mailman.63.1730408232.4695.python-list@python.org> <87msijo2cd.fsf@zedat.fu-berlin.de>


"Loris Bennett" <loris.bennett@fu-berlin.de> writes:

> Cameron Simpson <cs@cskk.id.au> writes:
>
>> On 31Oct2024 16:33, Loris Bennett <loris.bennett@fu-berlin.de> wrote:
>>>I have a command-line program which creates an email containing German
>>>umlauts.  On receiving the mail, my mail client displays the subject and
>>>body correctly:
>> [...]
>>>So far, so good.  However, when I use the --verbose option to print
>>>the mail to the terminal via
>>>
>>>  if args.verbose:
>>>      print(mail)
>>>
>>>I get:
>>>
>>>  Subject: Übungsbetreff
>>>
>>>  Sehr geehrter Herr Dr. Bennett,
>>>
>>>  Dies ist eine =C3=9Cbung.
>>>
>>>What do I need to do to prevent the body from getting mangled?
>>
>> That looks to me like quoted-printable. This is an encoding that makes
>> text robust against transports which are not 8-bit clean.  So your
>> Unicode text is encoded as UTF-8, and then that is encoded in
>> quoted-printable for transport through the email system.
>
> As I mentioned, I think the problem has to do with the way the salutation
> text provided by the "salutation server" and the mail body from a file
> are encoded.  The two seem to be encoded differently.
>
>> Your terminal probably accepts UTF-8 - I imagine other German text
>> renders correctly?
>
> Yes, it does.
>
>> You need to get the text and undo the quoted-printable encoding.
>>
>> If you're using the Python email module to parse (or construct) the
>> message as a `Message` object I'd expect that to happen automatically.
>
> I am using
>
>   email.message.EmailMessage
>
> as, from the Python documentation
>
>   https://docs.python.org/3/library/email.examples.html
>
> I gathered that that is the standard approach.
>
> And you are right that encoding for the actual mail which is received is
> automatically sorted out.  If I display the raw email in my client I get
> the following:
>
>   Content-Type: text/plain; charset="utf-8"
>   Content-Transfer-Encoding: quoted-printable
>   ...
>   Subject: =?utf-8?q?=C3=9Cbungsbetreff?=
>   ...
>   Dies ist eine =C3=9Cbung.
>
> I would interpret that as meaning that the subject and body are encoded
> in the same way.
>
> The problem just occurs with the unsent string representation printed to
> the terminal.
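
As a cross-check on the raw mail quoted above, both encoded forms can be
undone by hand with the standard library.  This is only a sketch: the two
literals are copied from the raw mail, and the variable names are made up
for illustration.

```python
import quopri
from email.header import decode_header, make_header

# Body line as it appears in the raw mail: quoted-printable over UTF-8 bytes.
raw_body = b"Dies ist eine =C3=9Cbung."
body = quopri.decodestring(raw_body).decode("utf-8")

# Subject as it appears in the raw mail: an RFC 2047 encoded word
# (charset utf-8, "q" encoding).
raw_subject = "=?utf-8?q?=C3=9Cbungsbetreff?="
subject = str(make_header(decode_header(raw_subject)))

print(subject)
print(body)
```

Both values decode to text containing a plain "Ü", which fits the reading
that subject and body carry the same UTF-8 data and differ only in the
transport wrapper around it.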

If I log the body like this

  body = f"{salutation},\n\n{text}\n{signature}"
  logger.debug("body: " + body)
 
and look at the log file in my terminal I see 

  2024-11-01 09:59:12,318 - DEBUG - mailer:create_body - body: Sehr geehrter Herr Dr. Bennett,

  Dies ist eine Übung.
 
  ...

as expected.  The quoted-printable text appears when I do

  from email.message import EmailMessage

  mail = EmailMessage()
  mail.set_content(body, cte="quoted-printable")
  ...

  if args.verbose:
      print(mail)

which is presumably also correct.

The question is: What conversion is necessary in order to print the
EmailMessage object to the terminal, such that the quoted-printable
parts are turned (back) into UTF-8?
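
One possible answer, as a minimal sketch rather than a definitive fix:
print the headers and the decoded body separately instead of printing the
EmailMessage object itself.  get_body() and get_content() are standard
EmailMessage methods; render_for_terminal is a hypothetical helper name,
the sample subject and body are taken from the output shown earlier, and
the sketch assumes a message with a text/plain body and ignores
attachments.

```python
from email.message import EmailMessage


def render_for_terminal(msg):
    """Return headers plus the decoded text body as one str (hypothetical helper)."""
    # With the default EmailMessage policy, header values come back as
    # already-decoded str objects.
    headers = "\n".join(f"{name}: {value}" for name, value in msg.items())
    # get_body() picks the text/plain part (the message itself when it is
    # not multipart); get_content() undoes the transfer encoding and the
    # charset encoding.
    part = msg.get_body(preferencelist=("plain",))
    body = part.get_content() if part is not None else ""
    return f"{headers}\n\n{body}"


# Sample message built the same way as in the post.
mail = EmailMessage()
mail["Subject"] = "Übungsbetreff"
mail.set_content("Dies ist eine Übung.\n", cte="quoted-printable")

text = render_for_terminal(mail)
print(text)
```

The Content-Transfer-Encoding header still says quoted-printable in this
output because the message itself is unchanged; only the displayed body
has been decoded.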

Cheers,

Loris

-- 
This signature is currently under construction.
