| Started by | Thomas Guettler <hv@tbz-pariv.de> |
|---|---|
| First post | 2011-07-04 10:31 +0200 |
| Last post | 2011-07-05 14:02 +0200 |
| Articles | 5 — 2 participants |
HeaderParseError Thomas Guettler <hv@tbz-pariv.de> - 2011-07-04 10:31 +0200
Re: HeaderParseError Peter Otten <__peter__@web.de> - 2011-07-04 11:51 +0200
Re: HeaderParseError Thomas Guettler <hv@tbz-pariv.de> - 2011-07-04 12:38 +0200
Re: HeaderParseError Peter Otten <__peter__@web.de> - 2011-07-04 13:20 +0200
Re: HeaderParseError Thomas Guettler <hv@tbz-pariv.de> - 2011-07-05 14:02 +0200
| From | Thomas Guettler <hv@tbz-pariv.de> |
|---|---|
| Date | 2011-07-04 10:31 +0200 |
| Subject | HeaderParseError |
| Message-ID | <97dc37F7gaU1@mid.individual.net> |
Hi,
I get a HeaderParseError during decode_header(), but Thunderbird can
display the name.
>>> from email.header import decode_header
>>> decode_header('=?iso-8859-1?B?QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?=')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib64/python2.6/email/header.py", line 101, in decode_header
raise HeaderParseError
email.errors.HeaderParseError
How can I parse this in Python?
Thomas
Same question on Stackoverflow:
http://stackoverflow.com/questions/6568596/headerparseerror-in-python
--
Thomas Guettler, http://www.thomas-guettler.de/
E-Mail: guettli (*) thomas-guettler + de
| From | Peter Otten <__peter__@web.de> |
|---|---|
| Date | 2011-07-04 11:51 +0200 |
| Message-ID | <ius2f9$e1$1@solani.org> |
| In reply to | #8759 |
Thomas Guettler wrote:
> I get a HeaderParseError during decode_header(), but Thunderbird can
> display the name.
>
>>>> from email.header import decode_header
>>>> decode_header('=?iso-8859-1?B?QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?=')
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> File "/usr/lib64/python2.6/email/header.py", line 101, in decode_header
> raise HeaderParseError
> email.errors.HeaderParseError
>
>
> How can I parse this in Python?
Trying to decode as much as possible:
>>> s = "QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?="
>>> for n in range(len(s), 0, -1):
... try: t = s[:n].decode("base64")
... except: pass
... else: break
...
>>> n, t
(49, 'Anmeldung Netzanschluss S\x19\x1c\x9a[\x99\xcc\xdc\x0b\x9a\x9c\x19')
>>> print t.decode("iso-8859-1")
Anmeldung Netzanschluss S[ÌÜ
>>> s[n:]
'w==?='
The characters after "...Netzanschluss " look like garbage. What does
Thunderbird display?
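
A rough Python 3 sketch of the same prefix search, for comparison. It assumes strict validation (`validate=True`), which the Python 2 "base64" codec used above does not apply, so it stops earlier: at the last complete group before the first character outside the standard alphabet. The name `longest_strict_prefix` is made up for this illustration.

```python
import base64
import binascii

s = "QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?="


def longest_strict_prefix(data):
    """Return (n, decoded) for the longest prefix that is strictly valid base64."""
    for n in range(len(data), 0, -1):
        try:
            # validate=True rejects characters outside the standard alphabet
            # instead of silently skipping them.
            return n, base64.b64decode(data[:n], validate=True)
        except binascii.Error:
            continue
    return 0, b""


n, t = longest_strict_prefix(s)
print(n, t.decode("iso-8859-1"))
print(repr(s[n:]))
```

With the string above, the search should stop at n = 32, right before "U_xk...", which points at the underscore as the first problem character.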
| From | Thomas Guettler <hv@tbz-pariv.de> |
|---|---|
| Date | 2011-07-04 12:38 +0200 |
| Message-ID | <97djhfF10kU1@mid.individual.net> |
| In reply to | #8765 |
On 04.07.2011 11:51, Peter Otten wrote:
> Thomas Guettler wrote:
>
>> I get a HeaderParseError during decode_header(), but Thunderbird can
>> display the name.
>>
>>>>> from email.header import decode_header
>>>>> decode_header('=?iso-8859-1?B?QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?=')
>> Traceback (most recent call last):
>> File "<stdin>", line 1, in <module>
>> File "/usr/lib64/python2.6/email/header.py", line 101, in decode_header
>> raise HeaderParseError
>> email.errors.HeaderParseError
>>
>>
>> How can I parse this in Python?
>
> Trying to decode as much as possible:
>
>>>> s = "QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?="
>>>> for n in range(len(s), 0, -1):
> ... try: t = s[:n].decode("base64")
> ... except: pass
> ... else: break
> ...
>>>> n, t
> (49, 'Anmeldung Netzanschluss S\x19\x1c\x9a[\x99\xcc\xdc\x0b\x9a\x9c\x19')
>>>> print t.decode("iso-8859-1")
> Anmeldung Netzanschluss S[ÌÜ
>
>>>> s[n:]
> 'w==?='
>
> The characters after "...Netzanschluss " look like garbage. What does
> Thunderbird display?
Hi Peter, Thunderbird shows this:
Anmeldung Netzanschluss Südring3p.jpg
Thomas
--
Thomas Guettler, http://www.thomas-guettler.de/
E-Mail: guettli (*) thomas-guettler + de
| From | Peter Otten <__peter__@web.de> |
|---|---|
| Date | 2011-07-04 13:20 +0200 |
| Message-ID | <mailman.595.1309778458.1164.python-list@python.org> |
| In reply to | #8766 |
Thomas Guettler wrote:
> On 04.07.2011 11:51, Peter Otten wrote:
>> Thomas Guettler wrote:
>>
>>> I get a HeaderParseError during decode_header(), but Thunderbird can
>>> display the name.
>>>
>>>>>> from email.header import decode_header
>>>>>> decode_header('=?iso-8859-1?B?QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?=')
>>> Traceback (most recent call last):
>>> File "<stdin>", line 1, in <module>
>>> File "/usr/lib64/python2.6/email/header.py", line 101, in decode_header
>>> raise HeaderParseError
>>> email.errors.HeaderParseError
>> The characters after "...Netzanschluss " look like garbage. What does
>> Thunderbird display?
>
> Hi Peter, Thunderbird shows this:
>
> Anmeldung Netzanschluss Südring3p.jpg
>>> a = u"Anmeldung Netzanschluss Südring3p.jpg".encode("iso-8859-1").encode("base64")
>>> b = "QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?="
>>> for i, (x, y) in enumerate(zip(a, b)):
... if x != y: print i, x, y
...
33 / _
52
?
>>> b.decode("base64")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.6/encodings/base64_codec.py", line 42, in base64_decode
output = base64.decodestring(input)
File "/usr/lib/python2.6/base64.py", line 321, in decodestring
return binascii.a2b_base64(s)
binascii.Error: Incorrect padding
>>> b.replace("_", "/").decode("base64")
'Anmeldung Netzanschluss S\xfcdring3p.jpg'
Looks like you encountered a variant of base64 that uses "_" instead of "/"
for the last character of the alphabet (value 63). The Wikipedia page
http://en.wikipedia.org/wiki/Base64 calls that base64url; the same variant
also uses "-" instead of "+" for value 62.
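
To make the alphabet difference concrete, here is a minimal Python 3 sketch (the session above is Python 2). It re-encodes the name Thunderbird displays, lists the positions where that differs from the received payload, and then decodes the payload with `base64.urlsafe_b64decode`, which accepts "-" and "_" in place of "+" and "/". The variable names are illustrative only.

```python
import base64

expected_name = "Anmeldung Netzanschluss S\u00fcdring3p.jpg"  # \u00fc is "ü"
received = "QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw=="

# Standard-alphabet encoding of the expected name.
standard = base64.b64encode(expected_name.encode("iso-8859-1")).decode("ascii")

# Positions where the received payload deviates from the standard encoding.
diffs = [(i, x, y) for i, (x, y) in enumerate(zip(standard, received)) if x != y]
print(diffs)

# The URL-safe decoder maps "_" to "/" (and "-" to "+") before decoding.
decoded = base64.urlsafe_b64decode(received).decode("iso-8859-1")
print(decoded == expected_name)
```

The only reported difference should be at index 33 ("/" versus "_"), matching the comparison above.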
You could try and make the email package accept that with a monkey patch
like the following:
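# untested
import binascii
from email import base64mime

def a2b_base64(s):
    # Map the base64url character "_" back to the standard "/" before decoding.
    return binascii.a2b_base64(s.replace("_", "/"))

# base64mime looks up a2b_base64 in its own namespace, so rebind it there.
base64mime.a2b_base64 = a2b_base64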
Alternatively monkey-patch the binascii module before you import the email
package.
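
If patching the library is undesirable, the encoded word can also be taken apart by hand. The following is a Python 3 sketch under that assumption; `decode_b_word_lenient` and its regular expression are hypothetical helpers, not part of the email package, and they handle only a single "B" encoded word with a plain charset name.

```python
import base64
import re

# One RFC 2047 encoded word using the "B" (base64) encoding:
# =?charset?B?payload?=
_B_WORD = re.compile(r"=\?([^?]+)\?[bB]\?([^?]*)\?=")


def decode_b_word_lenient(word):
    """Decode a single B-encoded word, tolerating the base64url alphabet."""
    match = _B_WORD.fullmatch(word)
    if match is None:
        raise ValueError("not a B-encoded word: %r" % (word,))
    charset, payload = match.groups()
    # urlsafe_b64decode translates "-" and "_" to "+" and "/" first, so it
    # decodes payloads written in either alphabet.
    raw = base64.urlsafe_b64decode(payload)
    return raw.decode(charset)


header = "=?iso-8859-1?B?QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?="
print(ascii(decode_b_word_lenient(header)))
```

This keeps the workaround local to the one header that needs it, at the cost of not covering multi-word headers or "Q" encoding.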
| From | Thomas Guettler <hv@tbz-pariv.de> |
|---|---|
| Date | 2011-07-05 14:02 +0200 |
| Message-ID | <97gcqoFfd1U1@mid.individual.net> |
| In reply to | #8767 |
On 04.07.2011 13:20, Peter Otten wrote:
> Thomas Guettler wrote:
>
>> On 04.07.2011 11:51, Peter Otten wrote:
>>> Thomas Guettler wrote:
>>>
>>>> I get a HeaderParseError during decode_header(), but Thunderbird can
>>>> display the name.
>>>>
>>>>>>> from email.header import decode_header
>>>>>>>
>>>
Hi,
I created a ticket: http://bugs.python.org/issue12489
Thomas Güttler
--
Thomas Guettler, http://www.thomas-guettler.de/
E-Mail: guettli (*) thomas-guettler + de