


How to manage accented characters in mail header?

From Chris Green <cl@isbd.net>
Newsgroups comp.lang.python
Subject How to manage accented characters in mail header?
Date 2025-01-04 14:31 +0000
Message-ID <satn4l-6sqh.ln1@q957.zbmc.eu>



I have a Python script that filters my incoming E-Mail.  It has been
working OK (with various updates and improvements) for many years.

I now have a minor new problem when handling E-Mail with a From: that
has accented characters in it:-

    From: Sébastien Crignon <sebastien.crignon@amvs.fr>


I use Python's mailbox module to parse the message:-

    import mailbox
    import sys    # needed for sys.stdin below
    ...
    ...
    msg = mailbox.MaildirMessage(sys.stdin.buffer.read())

Then various mailbox methods to get headers etc.
I use the following to get the From: address:-

    str(msg.get('from', "unknown")).lower()

The result has the part with the accented character wrapped as follows:-

    From: =?utf-8?B?U8OpYmFzdGllbiBDcmlnbm9u?= <sebastien.crignon@amvs.fr>
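
For reference, the wrapped part is an RFC 2047 "encoded word" (here UTF-8
text, base64 encoded). A minimal sketch of decoding it with the standard
library's email.header, assuming the header value is available as a plain
string; the variable names are only illustrative:

```python
from email.header import decode_header, make_header

# The header value exactly as shown above, without the "From: " prefix.
raw = "=?utf-8?B?U8OpYmFzdGllbiBDcmlnbm9u?= <sebastien.crignon@amvs.fr>"

# decode_header() splits the value into (bytes, charset) chunks;
# make_header() reassembles them into a Header that str() renders as text.
decoded = str(make_header(decode_header(raw)))

print(decoded)
```

A header with no encoded words passes through this unchanged, so it should
be safe to apply whether or not the wrapping is present.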


I know I have hit this issue before but I can't remember the fix. The
problem I have now is that searching the above doesn't work as
expected. Basically I just need to get rid of the ?utf-8? wrapped bit
altogether as I'm only interested in the 'real' address.  How can I
easily remove the UTF-8 section in a way that will work whether or not
it's there?
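
One possible approach, sketched here under the assumption that only the
address part matters, is email.utils.parseaddr from the standard library.
It splits a From: value into (display name, address), so the encoded
display name can simply be discarded. The helper name real_address is
hypothetical, not something from the script above:

```python
from email.utils import parseaddr


def real_address(from_header):
    """Return the lower-cased address from a From: header value.

    Hypothetical helper: falls back to "unknown" when no address is found.
    """
    # parseaddr() returns (display_name, address); the display name may
    # still be an RFC 2047 encoded word, but it is ignored here.
    _name, addr = parseaddr(str(from_header))
    return addr.lower() or "unknown"


print(real_address("=?utf-8?B?U8OpYmFzdGllbiBDcmlnbm9u?= <sebastien.crignon@amvs.fr>"))
print(real_address("Chris Green <cl@isbd.net>"))
```

Both calls print just the bare address, so later searches do not need to
care whether the display name was wrapped.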


-- 
Chris Green



Thread

How to manage accented characters in mail header? Chris Green <cl@isbd.net> - 2025-01-04 14:31 +0000
  Re: How to manage accented characters in mail header? Peter Pearson <pkpearson@nowhere.invalid> - 2025-01-04 15:00 +0000
  Re: How to manage accented characters in mail header? Chris Green <cl@isbd.net> - 2025-01-04 19:07 +0000
    Re: How to manage accented characters in mail header? "Peter J. Holzer" <hjp-python@hjp.at> - 2025-01-06 20:43 +0100
