| Date | 2014-09-01 12:07 +1000 |
|---|---|
| From | Cameron Simpson <cs@zip.com.au> |
| Subject | Re: Distinguishing between maildir, mbox, and MH files/directories? |
| References | <20140831134525.25a4321e@bigbox.christie.dr> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.13675.1409537281.18130.python-list@python.org> |
On 31Aug2014 13:45, Tim Chase <python.list@tim.thechases.com> wrote:
>Tinkering around with a little script, I found myself with the need
>to walk a directory tree and process mail messages found within.
>Sometimes these end up being mbox files (with multiple messages
>within), sometimes it's a Maildir structure with messages in each
>individual file and extra holding directories, and sometimes it's a
>MH directory. To complicate matters, there's also the possibility of
>non-{mbox,maildir,mh} files such as binary MUA caches appearing
>alongside these messages.
>
>Python knows how to handle each just fine as long as I tell it what
>type of file to expect. But is there a straight-forward way to
>distinguish them? (FWIW, the *nix "file" utility is just reporting
>"ASCII text", sometimes "with very long lines", and sometimes
>erroneously flags them as C or C++ files‽).
>
>All I need is "is it maildir, mbox, mh, or something else" (I don't
>have to get more complex for the "something else") inside an os.walk
>loop.
Here is my code for these tests:
    import os

    def ismhdir(path):
        ''' Test if `path` points at an MH directory.
        '''
        return os.path.isfile(os.path.join(path, '.mh_sequences'))

    def ismaildir(path):
        ''' Test if `path` points at a Maildir directory.
        '''
        for subdir in ('new', 'cur', 'tmp'):
            if not os.path.isdir(os.path.join(path, subdir)):
                return False
        return True

    def ismbox(path):
        ''' Open path and check that its first line begins with "From ".
        '''
        fp = None
        try:
            fp = open(path)
            from_ = fp.read(5)
        except IOError:
            # Unreadable, missing, or a directory: not an mbox.
            if fp is not None:
                fp.close()
            return False
        fp.close()
        return from_ == 'From '
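Since binary MUA caches may sit alongside the mail files, it might be worth reading the first bytes in binary mode: under Python 3, a text-mode read of undecodable bytes can raise UnicodeDecodeError, which the IOError handler in ismbox() above would not catch. A minimal sketch of that variant follows; the name ismbox_bytes is hypothetical and not part of the code above.

```python
def ismbox_bytes(path):
    ''' Hypothetical binary-mode variant of ismbox().
        Return True if the file at `path` starts with b"From ".
    '''
    try:
        # 'rb' avoids decoding, so arbitrary binary content cannot raise
        # UnicodeDecodeError; the with-block always closes the file.
        with open(path, 'rb') as fp:
            return fp.read(5) == b'From '
    except OSError:
        # Missing file, permission problem, or `path` is a directory.
        return False
```

This only checks the first five bytes, so like the original it is a heuristic: any file that happens to begin with "From " will pass.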
I would use these in code somewhat like this (imagining your use case):
    if ismaildir(path):
        ...
    elif ismhdir(path):
        ...
    elif ismbox(path):
        ...
    else:
        # reject other known special files here
        # continue traversing downward otherwise
        ...
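For the os.walk loop mentioned in the question, the dispatch could be sketched roughly as below. This is only one way to arrange it, under some assumptions: find_mailboxes is a hypothetical name, the three tests are repeated in compact form so the block runs on its own, and the mbox test reads bytes rather than text.

```python
import os

def ismhdir(path):
    return os.path.isfile(os.path.join(path, '.mh_sequences'))

def ismaildir(path):
    return all(os.path.isdir(os.path.join(path, sub))
               for sub in ('new', 'cur', 'tmp'))

def ismbox(path):
    try:
        with open(path, 'rb') as fp:
            return fp.read(5) == b'From '
    except OSError:
        return False

def find_mailboxes(top):
    ''' Hypothetical helper: yield (kind, path) pairs found under `top`,
        where kind is 'maildir', 'mh' or 'mbox'.
    '''
    for dirpath, dirnames, filenames in os.walk(top):
        if ismaildir(dirpath):
            yield 'maildir', dirpath
            # Do not walk into the message-holding subdirectories.
            dirnames[:] = [d for d in dirnames
                           if d not in ('new', 'cur', 'tmp')]
            continue
        if ismhdir(dirpath):
            yield 'mh', dirpath
            # The files here are individual MH messages, not mboxes;
            # subdirectories are still visited by os.walk.
            continue
        for name in filenames:
            path = os.path.join(dirpath, name)
            if ismbox(path):
                yield 'mbox', path
            # Anything else is "something else" and is skipped.
```

Each result could then be handed to the matching class in the standard mailbox module (mailbox.Maildir, mailbox.MH, mailbox.mbox). Assigning to dirnames[:] in place is what lets os.walk skip the pruned directories.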
Cheers,
Cameron Simpson <cs@zip.com.au>
Gabriel Genellina: See PEP 234 http://www.python.org/dev/peps/pep-0234/
Angus Rodgers:
You've got to love a language whose documentation contains sentences
beginning like this:
"Among its chief virtues are the following four -- no, five -- no,
six -- points: [...]"
from python-list@python.org