| Started by | Helmut Jarausch <jarausch@igpm.rwth-aachen.de> |
|---|---|
| First post | 2012-12-28 12:22 +0000 |
| Last post | 2012-12-29 10:26 +0000 |
| Articles | 4 — 4 participants |
email.message.Message - as_string fails Helmut Jarausch <jarausch@igpm.rwth-aachen.de> - 2012-12-28 12:22 +0000
Re: email.message.Message - as_string fails Chris Rebert <clp2@rebertia.com> - 2012-12-28 09:25 -0800
Re: email.message.Message - as_string fails Terry Reedy <tjreedy@udel.edu> - 2012-12-28 20:57 -0500
Re: email.message.Message - as_string fails Helmut Jarausch <jarausch@skynet.be> - 2012-12-29 10:26 +0000
| From | Helmut Jarausch <jarausch@igpm.rwth-aachen.de> |
|---|---|
| Date | 2012-12-28 12:22 +0000 |
| Subject | email.message.Message - as_string fails |
| Message-ID | <ak5h8rFqgagU1@mid.dfncis.de> |
Hi,
I'm trying to filter an mbox file by removing some messages.
For that I use
    Parser= FeedParser(policy=policy.SMTP)
and 'feed' every line to it.
When the mbox file contains a blank line followed by a line matching '^From ',
I call
    Msg= Parser.close()
(later on I delete the Parser and create a new one with
    Parser= FeedParser(policy=policy.SMTP)
)
I can access parts of the message, e.g. Msg['Message-ID'],
but even for the very first message, trying to print it or convert it to a string
with
    MsgStr=Msg.as_string(unixfrom=True)
makes Python (3.3.1_pre20121209) die with
Traceback (most recent call last):
  File "Email_Parse.py", line 35, in <module>
    MsgStr=Msg.as_string(unixfrom=True)
  File "/usr/lib64/python3.3/email/message.py", line 151, in as_string
    g.flatten(self, unixfrom=unixfrom)
  File "/usr/lib64/python3.3/email/generator.py", line 112, in flatten
    self._write(msg)
  File "/usr/lib64/python3.3/email/generator.py", line 171, in _write
    self._write_headers(msg)
  File "/usr/lib64/python3.3/email/generator.py", line 198, in _write_headers
    self.write(self.policy.fold(h, v))
  File "/usr/lib64/python3.3/email/policy.py", line 153, in fold
    return self._fold(name, value, refold_binary=True)
  File "/usr/lib64/python3.3/email/policy.py", line 176, in _fold
    (len(lines[0])+len(name)+2 > maxlen or
IndexError: list index out of range
What am I missing?
Many thanks for a hint,
Helmut.
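For readers following along, the workflow described in this post might look roughly like the sketch below. It is not the poster's actual script (which was not shown); the helper name `iter_mbox_messages` and the two-message sample are made up for illustration, and the boundary rule (a line starting with "From " at the start of the file or after a blank line) is a simplified reading of the description above.

```python
from email import policy
from email.feedparser import FeedParser


def iter_mbox_messages(lines):
    """Yield one parsed message per mbox entry (hypothetical helper).

    `lines` is any iterable of str lines, line endings included.
    """
    parser = None
    prev_blank = True  # the start of the file counts as a boundary
    for line in lines:
        if prev_blank and line.startswith("From "):
            if parser is not None:
                yield parser.close()  # finish the previous message
            parser = FeedParser(policy=policy.SMTP)  # fresh parser per message
        if parser is not None:
            parser.feed(line)
        prev_blank = not line.strip()
    if parser is not None:
        yield parser.close()


# Made-up sample data, only to exercise the helper.
SAMPLE = (
    "From alice@example.org Fri Dec 28 12:22:00 2012\n"
    "Message-ID: <1@example.org>\n"
    "Subject: first\n"
    "\n"
    "body one\n"
    "\n"
    "From bob@example.org Sat Dec 29 10:26:00 2012\n"
    "Message-ID: <2@example.org>\n"
    "Subject: second\n"
    "\n"
    "body two\n"
)

messages = list(iter_mbox_messages(SAMPLE.splitlines(keepends=True)))
subjects = [str(m["Subject"]) for m in messages]
print(len(messages), subjects)
```

The "From " line is fed to the parser as well, so it is recorded as the message's unixfrom. The standard library's `mailbox` module also provides an `mbox` class, which may be a simpler starting point than splitting the file by hand.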
| From | Chris Rebert <clp2@rebertia.com> |
|---|---|
| Date | 2012-12-28 09:25 -0800 |
| Message-ID | <mailman.1401.1356715525.29569.python-list@python.org> |
| In reply to | #35688 |
On Dec 28, 2012 4:26 AM, "Helmut Jarausch" <jarausch@igpm.rwth-aachen.de> wrote:
> [...]
> IndexError: list index out of range
>
> What am I missing?

Perhaps the message is malformed. What does Msg.defects give you?

Could you post the line strings you fed to the parser that together constitute the first message (redacted if necessary)?

P.S. Your naming conventions (with respect to capitalization) disagree with those of Python.
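As a rough illustration of the `defects` check suggested here: with the default non-raising policies, the parser records problems it noticed on the message object instead of raising. The input below is invented and deliberately malformed (a non-header line sits where the blank header/body separator should be); it is not the message from the original post.

```python
from email import policy
from email.feedparser import FeedParser

parser = FeedParser(policy=policy.SMTP)
parser.feed("Subject: hello\n")
parser.feed("this line is not a header\n")  # malformed: no blank separator before it
parser.feed("\n")
parser.feed("body\n")
msg = parser.close()

# Each entry in msg.defects is an instance of a defect class from email.errors.
defect_names = [type(d).__name__ for d in msg.defects]
print(defect_names)
```

An empty list does not prove the input is well formed; it only means the parser did not flag anything.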
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2012-12-28 20:57 -0500 |
| Message-ID | <mailman.1415.1356746310.29569.python-list@python.org> |
| In reply to | #35688 |
On 12/28/2012 7:22 AM, Helmut Jarausch wrote:
> [...]
>   File "/usr/lib64/python3.3/email/policy.py", line 176, in _fold
>     (len(lines[0])+len(name)+2 > maxlen or
> IndexError: list index out of range

The only list index visible is 0 in lines[0]. If this raises, lines is empty. You could trace back to see where lines is defined. I suspect it is all or part of the Msg you started with.

I believe that some of email was rewritten for 3.3, so it is possible that you found a bug based on an untrue assumption. It is also possible that you missed a limitation in the doc, or tripped over an intended but not written limitation. So I hope you do the tracing, so if doc or code need a fix, a tracker issue can be opened.

--
Terry Jan Reedy
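One way `lines` could end up empty, assuming it is produced by calling `str.splitlines()` on a header value (the thread does not show that line, so this is a guess): an empty string splits into an empty list, and indexing it with `[0]` raises exactly this `IndexError`. The function `first_line_too_long` below is a hypothetical stand-in showing a guarded version of the visible length check, not the library's code.

```python
def first_line_too_long(name, value, maxlen=78):
    """Hypothetical guarded variant of the length test seen in the traceback."""
    lines = value.splitlines()
    # bool(lines) short-circuits, so lines[0] is never evaluated for an empty list.
    return bool(lines) and len(lines[0]) + len(name) + 2 > maxlen


empty_lines = "".splitlines()  # an empty string yields no lines at all
try:
    empty_lines[0]
    raised = False
except IndexError:
    raised = True

print(empty_lines, raised)
print(first_line_too_long("Bcc", ""))
print(first_line_too_long("Subject", "x" * 100))
```

Printing each header name and `repr()` of its value before calling `as_string` would be a quick way to check whether any header in the failing message is actually empty.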
| From | Helmut Jarausch <jarausch@skynet.be> |
|---|---|
| Date | 2012-12-29 10:26 +0000 |
| Message-ID | <50dec569$0$3120$ba620e4c@news.skynet.be> |
| In reply to | #35727 |
On Fri, 28 Dec 2012 20:57:46 -0500, Terry Reedy wrote:
> [...]
> So I hope you do the tracing, so if doc or code
> need a fix, a tracker issue can be opened.

Thanks Terry,
I've debugged it and it smells like a bug.
I have created http://bugs.python.org/issue16811

Helmut.