Newsgroup: comp.lang.python, thread #35688

email.message.Message - as_string fails

Started by: Helmut Jarausch <jarausch@igpm.rwth-aachen.de>
First post: 2012-12-28 12:22 +0000
Last post: 2012-12-29 10:26 +0000
Articles: 4, 4 participants


Contents

  email.message.Message - as_string fails Helmut Jarausch <jarausch@igpm.rwth-aachen.de> - 2012-12-28 12:22 +0000
    Re: email.message.Message - as_string fails Chris Rebert <clp2@rebertia.com> - 2012-12-28 09:25 -0800
    Re: email.message.Message - as_string fails Terry Reedy <tjreedy@udel.edu> - 2012-12-28 20:57 -0500
      Re: email.message.Message - as_string fails Helmut Jarausch <jarausch@skynet.be> - 2012-12-29 10:26 +0000

#35688 — email.message.Message - as_string fails

From: Helmut Jarausch <jarausch@igpm.rwth-aachen.de>
Date: 2012-12-28 12:22 +0000
Subject: email.message.Message - as_string fails
Message-ID: <ak5h8rFqgagU1@mid.dfncis.de>

Hi,

I'm trying to filter an mbox file by removing some messages.
For that I use
Parser= FeedParser(policy=policy.SMTP)
and 'feed' each line to it.
When the mbox file contains a blank line followed by a line
starting with 'From ' (i.e. matching '^From '), I do

Msg= Parser.close()

(later on I delete the Parser and create a new one with
Parser= FeedParser(policy=policy.SMTP)
)

I can access parts of the message, e.g. Msg['Message-ID'],
but even for the very first message, trying to print it or convert it to a string
with  MsgStr=Msg.as_string(unixfrom=True)

makes Python (3.3.1_pre20121209) die with

Traceback (most recent call last):
  File "Email_Parse.py", line 35, in <module>
    MsgStr=Msg.as_string(unixfrom=True)
  File "/usr/lib64/python3.3/email/message.py", line 151, in as_string
    g.flatten(self, unixfrom=unixfrom)
  File "/usr/lib64/python3.3/email/generator.py", line 112, in flatten
    self._write(msg)
  File "/usr/lib64/python3.3/email/generator.py", line 171, in _write
    self._write_headers(msg)
  File "/usr/lib64/python3.3/email/generator.py", line 198, in _write_headers
    self.write(self.policy.fold(h, v))
  File "/usr/lib64/python3.3/email/policy.py", line 153, in fold
    return self._fold(name, value, refold_binary=True)
  File "/usr/lib64/python3.3/email/policy.py", line 176, in _fold
    (len(lines[0])+len(name)+2 > maxlen or
IndexError: list index out of range


What am I missing?

Many thanks for a hint,
Helmut.
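
The splitting procedure described in this post can be sketched roughly as follows. This is only an illustration under stated assumptions, not the poster's actual Email_Parse.py: the function name split_mbox_lines and the sample data are hypothetical, and a message boundary is assumed to be a line starting with 'From ' that is either the first line or preceded by a blank line.

```python
from email import policy
from email.feedparser import FeedParser


def split_mbox_lines(lines):
    """Yield one parsed message per mbox entry (hypothetical helper).

    Assumption: an entry starts at a line beginning with 'From ' that is
    the first line of the input or directly follows a blank line.
    """
    parser = None
    previous_blank = True  # treat the start of the input like a blank line
    for line in lines:
        if line.startswith("From ") and previous_blank:
            if parser is not None:
                yield parser.close()  # finish the previous message
            parser = FeedParser(policy=policy.SMTP)
        if parser is not None:
            # The 'From ' separator line is fed too; the parser keeps it
            # as the message's unixfrom line.
            parser.feed(line)
        previous_blank = line.strip() == ""
    if parser is not None:
        yield parser.close()


# Made-up sample data, only for illustration.
sample = (
    "From alice@example.com Fri Dec 28 12:22:00 2012\n"
    "Message-ID: <1@example.com>\n"
    "Subject: first\n"
    "\n"
    "Body one.\n"
    "\n"
    "From bob@example.com Sat Dec 29 10:26:00 2012\n"
    "Message-ID: <2@example.com>\n"
    "Subject: second\n"
    "\n"
    "Body two.\n"
)

messages = list(split_mbox_lines(sample.splitlines(keepends=True)))
text = messages[0].as_string(unixfrom=True)
```

For real mbox files, the standard library's mailbox.mbox class may be the simpler route, since it already handles the 'From ' separators.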


#35709

From: Chris Rebert <clp2@rebertia.com>
Date: 2012-12-28 09:25 -0800
Message-ID: <mailman.1401.1356715525.29569.python-list@python.org>
In reply to: #35688

On Dec 28, 2012 4:26 AM, "Helmut Jarausch" <jarausch@igpm.rwth-aachen.de>
wrote:
>
> Hi,
>
> I'm trying to filter an mbox file by removing some messages.
> For that I use
> Parser= FeedParser(policy=policy.SMTP)
> and 'feed' any lines to it.
> If the mbox file contains a white line followed by '^From ',
> I do
>
> Msg= Parser.close()
>
> (lateron I delete the Parser and create a new one by
> Parser= FeedParser(policy=policy.SMTP)
> )
>
> I can access parts of the message by  Msg['Message-ID'], e.g.
> but even for the very first message, trying to print it or convert it to a string
> by  MsgStr=Msg.as_string(unixfrom=True)
>
> lets Python (3.3.1_pre20121209) die with
>
> Traceback (most recent call last):
>   File "Email_Parse.py", line 35, in <module>
>     MsgStr=Msg.as_string(unixfrom=True)
>   File "/usr/lib64/python3.3/email/message.py", line 151, in as_string
>     g.flatten(self, unixfrom=unixfrom)
>   File "/usr/lib64/python3.3/email/generator.py", line 112, in flatten
>     self._write(msg)
>   File "/usr/lib64/python3.3/email/generator.py", line 171, in _write
>     self._write_headers(msg)
>   File "/usr/lib64/python3.3/email/generator.py", line 198, in _write_headers
>     self.write(self.policy.fold(h, v))
>   File "/usr/lib64/python3.3/email/policy.py", line 153, in fold
>     return self._fold(name, value, refold_binary=True)
>   File "/usr/lib64/python3.3/email/policy.py", line 176, in _fold
>     (len(lines[0])+len(name)+2 > maxlen or
> IndexError: list index out of range
>
>
> What am I missing?

Perhaps the message is malformed. What does Msg.defects give you?

Could you post the line strings you fed to the parser that together
constitute the first message (redacted if necessary)?
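
A minimal sketch of inspecting the defects list, assuming a made-up malformed input (the string raw below is hypothetical and not taken from the poster's mbox file). The parser records problems it notices on the message's defects attribute instead of raising, as long as the policy does not set raise_on_defect.

```python
from email import policy
from email.feedparser import FeedParser

# Hypothetical malformed input: the second line is neither a header
# nor the blank line that should separate headers from the body.
raw = "Subject: test\nthis line is not a header\n\nbody\n"

parser = FeedParser(policy=policy.SMTP)
parser.feed(raw)
msg = parser.close()

# Each defect is an exception-like object; print its class and text.
for defect in msg.defects:
    print(type(defect).__name__, "-", defect)
```

An empty list here would suggest the parser itself saw nothing wrong, which would point toward the serialization step rather than the input.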

P.S. Your naming conventions (with respect to capitalization) disagree with
those of Python.


#35727

From: Terry Reedy <tjreedy@udel.edu>
Date: 2012-12-28 20:57 -0500
Message-ID: <mailman.1415.1356746310.29569.python-list@python.org>
In reply to: #35688

On 12/28/2012 7:22 AM, Helmut Jarausch wrote:
> Hi,
>
> I'm trying to filter an mbox file by removing some messages.
> For that I use
> Parser= FeedParser(policy=policy.SMTP)
> and 'feed' any lines to it.
> If the mbox file contains a white line followed by '^From ',
> I do
>
> Msg= Parser.close()
>
> (lateron I delete the Parser and create a new one by
> Parser= FeedParser(policy=policy.SMTP)
> )
>
> I can access parts of the message by  Msg['Message-ID'], e.g.
> but even for the very first message, trying to print it or convert it to a string
> by  MsgStr=Msg.as_string(unixfrom=True)
>
> lets Python (3.3.1_pre20121209) die with
>
> Traceback (most recent call last):
>    File "Email_Parse.py", line 35, in <module>
>      MsgStr=Msg.as_string(unixfrom=True)
>    File "/usr/lib64/python3.3/email/message.py", line 151, in as_string
>      g.flatten(self, unixfrom=unixfrom)
>    File "/usr/lib64/python3.3/email/generator.py", line 112, in flatten
>      self._write(msg)
>    File "/usr/lib64/python3.3/email/generator.py", line 171, in _write
>      self._write_headers(msg)
>    File "/usr/lib64/python3.3/email/generator.py", line 198, in _write_headers
>      self.write(self.policy.fold(h, v))
>    File "/usr/lib64/python3.3/email/policy.py", line 153, in fold
>      return self._fold(name, value, refold_binary=True)
>    File "/usr/lib64/python3.3/email/policy.py", line 176, in _fold
>      (len(lines[0])+len(name)+2 > maxlen or
> IndexError: list index out of range

The only list index visible is 0 in lines[0]. If this raises, lines is 
empty. You could trace back to see where lines is defined. I suspect it 
is all or part of the Msg you started with.
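
The reasoning about lines[0] can be reproduced in isolation. The sketch below is a hypothetical stand-in, not the code in email/policy.py: it assumes that lines is obtained by calling splitlines() on a header value, which the traceback does not show. Under that assumption an empty value is enough to trigger the error, because "".splitlines() returns an empty list.

```python
def first_line_too_long(name, value, maxlen=78):
    """Hypothetical stand-in for the failing check, NOT the stdlib code."""
    # Assumption: `lines` is derived from the header value like this.
    lines = value.splitlines()
    # Raises IndexError when `lines` is empty.
    return len(lines[0]) + len(name) + 2 > maxlen


def first_line_too_long_guarded(name, value, maxlen=78):
    """Same check, but tolerant of a value with no lines at all."""
    lines = value.splitlines()
    return bool(lines) and len(lines[0]) + len(name) + 2 > maxlen


try:
    first_line_too_long("Subject", "")
    raised = False
except IndexError:
    raised = True

guarded_result = first_line_too_long_guarded("Subject", "")
```

Whether an empty header value is what actually happened in the reported case is for the tracing to confirm.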

I believe that some of email was rewritten for 3.3, so it is possible 
that you found a bug based on an untrue assumption. It is also possible 
that you missed a limitation in the doc, or tripped over an intended but 
not written limitation. So I hope you do the tracing, so if doc or code 
need a fix, a tracker issue can be opened.

-- 
Terry Jan Reedy


#35744

From: Helmut Jarausch <jarausch@skynet.be>
Date: 2012-12-29 10:26 +0000
Message-ID: <50dec569$0$3120$ba620e4c@news.skynet.be>
In reply to: #35727

On Fri, 28 Dec 2012 20:57:46 -0500, Terry Reedy wrote:

> On 12/28/2012 7:22 AM, Helmut Jarausch wrote:
>> Hi,
>>
>> I'm trying to filter an mbox file by removing some messages.
>> For that I use Parser= FeedParser(policy=policy.SMTP)
>> and 'feed' any lines to it.
>> If the mbox file contains a white line followed by '^From ',
>> I do
>>
>> Msg= Parser.close()
>>
>> (lateron I delete the Parser and create a new one by Parser=
>> FeedParser(policy=policy.SMTP)
>> )
>>
>> I can access parts of the message by  Msg['Message-ID'], e.g.
>> but even for the very first message, trying to print it or convert it
>> to a string by  MsgStr=Msg.as_string(unixfrom=True)
>>
>> lets Python (3.3.1_pre20121209) die with
>>
>> Traceback (most recent call last):
>>    File "Email_Parse.py", line 35, in <module>
>>      MsgStr=Msg.as_string(unixfrom=True)
>>    File "/usr/lib64/python3.3/email/message.py", line 151, in as_string
>>      g.flatten(self, unixfrom=unixfrom)
>>    File "/usr/lib64/python3.3/email/generator.py", line 112, in flatten
>>      self._write(msg)
>>    File "/usr/lib64/python3.3/email/generator.py", line 171, in _write
>>      self._write_headers(msg)
>>    File "/usr/lib64/python3.3/email/generator.py", line 198, in
>>    _write_headers
>>      self.write(self.policy.fold(h, v))
>>    File "/usr/lib64/python3.3/email/policy.py", line 153, in fold
>>      return self._fold(name, value, refold_binary=True)
>>    File "/usr/lib64/python3.3/email/policy.py", line 176, in _fold
>>      (len(lines[0])+len(name)+2 > maxlen or
>> IndexError: list index out of range
> 
> The only list index visible is 0 in lines[0]. If this raises, lines is
> empty. You could trace back to see where lines is defined. I suspect it
> is all or part of the Msg you started with.
> 
> I believe that some of email was rewritten for 3.3, so it is possible
> that you found a bug based on an untrue assumption. It is also possible
> that you missed a limitation in the doc, or tripped over an intended but
> not written limitation. So I hope you do the tracing, so if doc or code
> need a fix, a tracker issue can be opened.

Thanks Terry,
I've debugged it and it smells like a bug.
I have filed it as http://bugs.python.org/issue16811

Helmut.
