Subject: Re: email.message.Message - as_string fails
From: Chris Rebert
To: Helmut Jarausch
Cc: Python
Date: Fri, 28 Dec 2012 09:25:22 -0800
Newsgroups: comp.lang.python

On Dec 28, 2012 4:26 AM, "Helmut Jarausch" <jarausch@igpm.rwth-aachen.de> wrote:
>
> Hi,
>
> I'm trying to filter an mbox file by removing some messages.
> For that I use
> Parser= FeedParser(policy=policy.SMTP)
> and 'feed' any lines to it.
> If the mbox file contains a white line followed by '^From ',
> I do
>
> Msg= Parser.close()
>
> (later on I delete the Parser and create a new one by
> Parser= FeedParser(policy=policy.SMTP)
> )
>
> I can access parts of the message by Msg['Message-ID'], e.g.
> but even for the very first message, trying to print it or convert it to a string
> by MsgStr=Msg.as_string(unixfrom=True)
>
> lets Python (3.3.1_pre20121209) die with
>
> Traceback (most recent call last):
>   File "Email_Parse.py", line 35, in <module>
>     MsgStr=Msg.as_string(unixfrom=True)
>   File "/usr/lib64/python3.3/email/message.py", line 151, in as_string
>     g.flatten(self, unixfrom=unixfrom)
>   File "/usr/lib64/python3.3/email/generator.py", line 112, in flatten
>     self._write(msg)
>   File "/usr/lib64/python3.3/email/generator.py", line 171, in _write
>     self._write_headers(msg)
>   File "/usr/lib64/python3.3/email/generator.py", line 198, in _write_headers
>     self.write(self.policy.fold(h, v))
>   File "/usr/lib64/python3.3/email/policy.py", line 153, in fold
>     return self._fold(name, value, refold_binary=True)
>   File "/usr/lib64/python3.3/email/policy.py", line 176, in _fold
>     (len(lines[0])+len(name)+2 > maxlen or
> IndexError: list index out of range
>
>
> What am I missing?

Perhaps the message is malformed. What does Msg.defects give you?
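For reference, here is a minimal sketch of how the defects could be inspected. The sample lines and the helper name collect_defects are made up for illustration; the real input would be whatever lines you fed to the parser. It assumes the header objects produced under policy.SMTP carry their own defects attribute in addition to the message-level list.

```python
from email import policy
from email.feedparser import FeedParser

# Hypothetical sample input; the real lines would come from the mbox file.
sample_lines = [
    "From: alice@example.com\n",
    "Subject: test\n",
    "Message-ID: <1@example.com>\n",
    "\n",
    "Body text\n",
]


def collect_defects(lines):
    """Parse the lines and return (message, {header name: header defects})."""
    parser = FeedParser(policy=policy.SMTP)
    for line in lines:
        parser.feed(line)
    msg = parser.close()

    # Header values parsed under the newer policies may record defects too,
    # so look at each header as well as at msg.defects.
    header_defects = {}
    for name, value in msg.items():
        defects = getattr(value, "defects", ())
        if defects:
            header_defects[name] = list(defects)
    return msg, header_defects


msg, header_defects = collect_defects(sample_lines)
print("message defects:", msg.defects)
print("header defects:", header_defects)
```

If either of the two printed collections is non-empty for your first message, that would be a good hint as to which header the folding code is choking on.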

Could you post the line strings you fed to the parser that together constitute the first message (redacted if necessary)?
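One way to capture exactly those lines is to collect them per message before feeding them. This is only a rough sketch of the splitting rule you described (a blank line followed by a line starting with "From "); the function name split_mbox_lines and the sample data are invented here, and the sketch does not handle any other mbox quoting conventions.

```python
def split_mbox_lines(lines):
    """Yield one list of lines per message from mbox-style lines.

    A new message starts at a line beginning with "From " that follows a
    blank line (or that is the very first line of the input).
    """
    current = []
    previous_blank = True  # the first "From " line has nothing before it
    for line in lines:
        if line.startswith("From ") and previous_blank and current:
            yield current
            current = []
        current.append(line)
        previous_blank = line.strip() == ""
    if current:
        yield current


# Hypothetical two-message mbox content, for illustration only.
mbox_lines = [
    "From a@example.com Fri Dec 28 09:25:22 2012\n",
    "Subject: one\n",
    "\n",
    "first body\n",
    "\n",
    "From b@example.com Fri Dec 28 09:26:00 2012\n",
    "Subject: two\n",
    "\n",
    "second body\n",
]

messages = list(split_mbox_lines(mbox_lines))

# Show the raw lines of the first message, suitable for pasting into a reply.
for line in messages[0]:
    print(repr(line))
```

Printing with repr() keeps stray whitespace and line endings visible, which matters when a header line is the suspect.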

P.S. Your naming conventions (with respect to capitalization) disagree with those of Python.
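To illustrate, here is roughly how your snippet might look with lowercase names for variables and functions, as PEP 8 suggests. The function name parse_message and the sample input are my own invention, not something from your script.

```python
from email import policy
from email.feedparser import FeedParser


def parse_message(lines):
    """Feed the lines to a fresh FeedParser and return the parsed message."""
    parser = FeedParser(policy=policy.SMTP)  # was: Parser
    for line in lines:
        parser.feed(line)
    return parser.close()  # result was previously bound to: Msg


# Hypothetical input, for illustration only.
msg = parse_message(["Subject: hello\n", "\n", "body\n"])
msg_str = msg.as_string(unixfrom=True)  # was: MsgStr
print(msg_str)
```

Capitalized names such as FeedParser are conventionally reserved for classes, so lowercase names make it easier to tell instances from the classes they came from.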
