

Groups > comp.lang.python > #5097 > unrolled thread

Py3k,email header handling

Started by: TheSaint <nobody@nowhere.net.no>
First post: 2011-05-11 16:04 +0800
Last post: 2011-05-11 13:56 -0400
Articles: 4 (3 participants)



Contents

  Py3k,email header handling TheSaint <nobody@nowhere.net.no> - 2011-05-11 16:04 +0800
    Re: Py3k,email header handling Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-05-11 13:44 +0000
      Re: Py3k,email header handling TheSaint <nobody@nowhere.net.no> - 2011-05-12 00:27 +0800
        Re: Py3k,email header handling Terry Reedy <tjreedy@udel.edu> - 2011-05-11 13:56 -0400

#5097 — Py3k,email header handling

From: TheSaint <nobody@nowhere.net.no>
Date: 2011-05-11 16:04 +0800
Subject: Py3k,email header handling
Message-ID: <iqdftv$age$1@speranza.aioe.org>

Hello,
some time ago I wrote a program to delete undesired emails from the
server(s) and keep those that match certain filter criteria.

I started it when I was getting to know Python 2.3. Nowadays I'd like to
spend some time improving it, just for my own interest, since it never
gathered anybody else's interest.
Python 3.2 (and some earlier versions) now uses bytes, rather than
text strings, for almost all data handling, in keeping with Unicode. My
problem is that my program handles text strings, so I'd like to rewrite
the code.

My program reads from an IMAP4 or POP3 server. I'd like a function/class
that returns either a list or a dictionary containing the following
fields:

'from', 'to', 'cc', 'bcc', 'date', 'subject', 'reply-to', 'message-id'

The list could be organized as tuples, (field_name, its_content), one for
each field, but I think a dictionary would be more efficient to use.

1) Is there a way to call imaplib or poplib and pass the data directly
to email.parser.HeaderParser to get this result?

2) If so, can patterns from re.compile be used on the unicode results?
I guess yes.

3) any related documentation...

-- 
goto /dev/null



#5115

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2011-05-11 13:44 +0000
Message-ID: <4dca92b1$0$29980$c3e8da3$5496439d@news.astraweb.com>
In reply to: #5097

On Wed, 11 May 2011 16:04:13 +0800, TheSaint wrote:

> Python 3.2 (and some earlier versions) now uses bytes, rather
> than text strings, for almost all data handling, in keeping with
> Unicode. My problem is that my program handles text strings, so I'd
> like to rewrite the code.

Before you re-write it, you should run 2to3 over it and see how much it 
can do automatically:

http://docs.python.org/library/2to3.html


> My program reads from an IMAP4 or POP3 server. I'd like a
> function/class that returns either a list or a dictionary containing
> the following fields:
> 
> 'from', 'to', 'cc', 'bcc', 'date', 'subject', 'reply-to', 'message-id'
> 
> The list could be organized as tuples, (field_name, its_content), one
> for each field, but I think a dictionary would be more efficient to use.
> 
> 1) Is there a way to call imaplib or poplib and pass the data
> directly to email.parser.HeaderParser to get this result?

I'm afraid I don't understand the question.


> 2) If so, can patterns from re.compile be used on the unicode results?
> I guess yes.

Yes. In Python 3, re.compile("some string") automatically gives you a 
unicode pattern, because "some string" is a unicode string (str).
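
To illustrate the point, here is a small sketch (the sample header line is made up). A pattern compiled from a str matches str data; data that is still bytes needs either a bytes pattern or a decode step first.

```python
import re

# A str pattern is a Unicode pattern in Python 3.
subject_re = re.compile(r"^Subject:\s*(.*)$", re.IGNORECASE)

line = "Subject: Réunion demain"   # text (str), made-up sample
subject = subject_re.match(line).group(1)

# Data straight from the network is bytes; a str pattern refuses it.
raw = line.encode("utf-8")
try:
    subject_re.match(raw)
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Either decode the bytes first, or compile a separate bytes pattern.
decoded_subject = subject_re.match(raw.decode("utf-8")).group(1)
bytes_re = re.compile(rb"^Subject:\s*(.*)$", re.IGNORECASE)
raw_subject = bytes_re.match(raw).group(1)
```

Decoding once, right after the data arrives, usually keeps the rest of the program working on str only.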


> 3) any related documentation...

http://docs.python.org/py3k/library/email.html
http://docs.python.org/py3k/library/re.html
http://docs.python.org/py3k/library/imaplib.html
http://docs.python.org/py3k/library/poplib.html


If you have any more concrete questions, please ask.



-- 
Steven



#5131

From: TheSaint <nobody@nowhere.net.no>
Date: 2011-05-12 00:27 +0800
Message-ID: <iqedda$p9f$1@speranza.aioe.org>
In reply to: #5115

Steven D'Aprano wrote:

> Before you re-write it, you should run 2to3 over it and see how much it
> can do automatically:

Already done, for the most part; only the results of some queries have 
changed radically because of unicode. Errors are raised about results 
which are no longer strings.
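
For illustration, this is the kind of error that typically appears (the sample line is made up). In Python 3, poplib and imaplib hand back bytes, and mixing bytes with str raises TypeError until the data is decoded.

```python
raw_line = b"Subject: test message"   # the type the server libraries return

try:
    raw_line.startswith("Subject")    # str argument on a bytes object
    error = None
except TypeError as exc:
    error = exc

# Decoding turns the bytes into text, so str operations work again.
text_line = raw_line.decode("ascii", "replace")
is_subject = text_line.startswith("Subject")
```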
 
> I'm afraid I don't understand the question.

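Here is an example:

import email.header
import poplib

server = poplib.POP3('pop.example.org')  # placeholder host name
server.user('userid')
server.pass_('password')
numMsgs, total = server.stat()
for cnt in range(1, numMsgs + 1):  # POP3 numbers messages from 1
   resp, lines, octets = server.top(cnt, 0)  # headers only, as bytes lines
   header = b'\r\n'.join(lines).decode('ascii', 'replace')
   # Here I'd like to pass the header to some function that will return
   # a dictionary with the keys
   # 'from', 'to', 'cc', 'bcc', 'date', 'subject', 'reply-to', 'message-id',
   # leaving an empty string or None where a header is missing.
   fields = email.header.decode_header(header)  # but this might not be my result
   # Should I subclass this?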
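
A minimal sketch of one way to get such a dictionary, assuming Python 3.2 or later: join the byte lines that `POP3.top()` returns and hand them to `email.parser.BytesParser` with `headersonly=True`. The helper name `headers_to_dict`, the `WANTED` tuple and the sample lines are made up for illustration. No subclassing is needed in this version.

```python
from email.parser import BytesParser

WANTED = ('from', 'to', 'cc', 'bcc', 'date', 'subject',
          'reply-to', 'message-id')

def headers_to_dict(lines):
    """Build a dict of the wanted headers from a list of byte lines.

    `lines` is the second item of the tuple returned by POP3.top().
    Headers that are absent map to None.
    """
    msg = BytesParser().parsebytes(b'\r\n'.join(lines), headersonly=True)
    # Message lookup is case-insensitive and returns None when missing.
    return {name: msg[name] for name in WANTED}

# Made-up sample standing in for: resp, lines, octets = server.top(n, 0)
sample = [
    b'From: Alice <alice@example.org>',
    b'To: bob@example.org',
    b'Subject: Hello',
    b'Message-ID: <123@example.org>',
    b'',
]
fields = headers_to_dict(sample)
```

The values come back as str, so the existing string-based filter code can work on them directly.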

The same would be needed for parsing IMAP4 messages. That would take a 
somewhat different process, wouldn't it?
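
For IMAP4 the parsing step can stay the same; only the retrieval differs. A rough sketch, assuming the server accepts a `BODY.PEEK[HEADER]` fetch and that `imaplib` returns the header block as the second element of a tuple in the response list (the usual shape, but worth checking against a real server). The function names, the host and the fake response are placeholders.

```python
import imaplib
from email.parser import BytesParser

def header_block(fetch_data):
    """Return the raw header bytes from the data part of IMAP4.fetch()."""
    for item in fetch_data:
        # Literal data arrives as (envelope, payload) tuples; other
        # items, such as the closing b')', are plain bytes.
        if isinstance(item, tuple):
            return item[1]
    return b''

def fetch_headers(conn, num):
    """Fetch and parse the headers of message `num` without marking it read."""
    typ, data = conn.fetch(str(num), '(BODY.PEEK[HEADER])')
    if typ != 'OK':
        raise RuntimeError('fetch failed for message %s' % num)
    return BytesParser().parsebytes(header_block(data), headersonly=True)

# Offline check with a made-up response in the shape imaplib returns.
fake = [(b'1 (BODY[HEADER] {36}',
         b'From: a@example.org\r\nSubject: Hi\r\n\r\n'), b')']
msg = BytesParser().parsebytes(header_block(fake), headersonly=True)

# With a real server (placeholder host and credentials):
# conn = imaplib.IMAP4_SSL('imap.example.org')
# conn.login('userid', 'password')
# conn.select('INBOX', readonly=True)
# msg = fetch_headers(conn, 1)
```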

> If you have any more concrete questions, please ask.

If these aren't concrete questions, forgive me; perhaps I came to the wrong 
newsgroup.
On the other hand, I hugely appreciate your clues. I'll study the docs some 
more to get a clearer understanding.

-- 
goto /dev/null



#5137

From: Terry Reedy <tjreedy@udel.edu>
Date: 2011-05-11 13:56 -0400
Message-ID: <mailman.1418.1305136589.9059.python-list@python.org>
In reply to: #5131

On 5/11/2011 12:27 PM, TheSaint wrote:
> Steven D'Aprano wrote:
>
>> Before you re-write it, you should run 2to3 over it and see how much it
>> can do automatically:
>
> Already done, for the most part; only the results of some queries have
> changed radically because of unicode. Errors are raised about results
> which are no longer strings.

Make sure you use 3.2 and not 3.1 (or 3.0) as there were improvements to 
the email module.
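
Among the 3.2 additions are `email.message_from_bytes()` and `email.parser.BytesParser`, which accept the bytes that poplib and imaplib return. A short sketch (the sample message is made up), which also turns an RFC 2047 encoded subject into readable text with `decode_header()` and `make_header()`:

```python
import email
from email.header import decode_header, make_header

raw = (b'From: Alice <alice@example.org>\r\n'
       b'Subject: =?utf-8?q?Caf=C3=A9_menu?=\r\n'
       b'\r\n'
       b'body text\r\n')

msg = email.message_from_bytes(raw)   # available since Python 3.2

# msg['subject'] is still in its encoded wire form here.
encoded = msg['subject']

# decode_header() splits the value into (data, charset) parts and
# make_header() joins them into a Header that str() renders as text.
subject = str(make_header(decode_header(encoded)))
```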

-- 
Terry Jan Reedy


