| Started by | Victor Hooi <victorhooi@gmail.com> |
|---|---|
| First post | 2013-11-18 23:13 -0800 |
| Last post | 2013-11-20 01:50 +0000 |
| Articles | 8 — 6 participants |
Using try-catch to handle multiple possible file types? Victor Hooi <victorhooi@gmail.com> - 2013-11-18 23:13 -0800
Re: Using try-catch to handle multiple possible file types? Chris Angelico <rosuav@gmail.com> - 2013-11-19 18:22 +1100
Re: Using try-catch to handle multiple possible file types? Amit Saha <amitsaha.in@gmail.com> - 2013-11-19 17:22 +1000
Re: Using try-catch to handle multiple possible file types? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-11-19 09:36 +0000
Re: Using try-catch to handle multiple possible file types? Victor Hooi <victorhooi@gmail.com> - 2013-11-19 16:30 -0800
Re: Using try-catch to handle multiple possible file types? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-20 01:56 +0000
Re: Using try-catch to handle multiple possible file types? Neil Cerutti <mr.cerutti@gmail.com> - 2013-11-20 10:05 -0500
Re: Using try-catch to handle multiple possible file types? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-11-20 01:50 +0000
| From | Victor Hooi <victorhooi@gmail.com> |
|---|---|
| Date | 2013-11-18 23:13 -0800 |
| Subject | Using try-catch to handle multiple possible file types? |
| Message-ID | <8379f7c2-c248-4a67-82ed-2d288a1635d2@googlegroups.com> |
Hi,
I have a script that needs to handle input files of different types (uncompressed, gzipped etc.).
My question is regarding how I should handle the different cases.
My first thought was to use a try-catch block and attempt to open it using the most common filetype, then if that failed, try the next most common type etc. before finally erroring out.
So basically, using exception handling for flow-control.
However, is that considered bad practice, or un-Pythonic?
What other alternative constructs could I also use, and pros and cons?
(I was thinking I could also use python-magic which wraps libmagic, or I can just rely on file extensions).
Other thoughts?
Cheers,
Victor
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-11-19 18:22 +1100 |
| Message-ID | <mailman.2889.1384845774.18130.python-list@python.org> |
| In reply to | #59959 |
On Tue, Nov 19, 2013 at 6:13 PM, Victor Hooi <victorhooi@gmail.com> wrote:
> My first thought was to use a try-catch block and attempt to open it using the most common filetype, then if that failed, try the next most common type etc. before finally erroring out.
>
> So basically, using exception handling for flow-control.
>
> However, is that considered bad practice, or un-Pythonic?
It's fairly common to work that way. But you may want to be careful what order you try them in; some codecs might be technically capable of reading other formats than you wanted, so start with the most specific.
Alternatively, looking at a file's magic number (either with python-magic/libmagic or by manually reading in a few bytes) might be more efficient. Either way can work, take your choice!
ChrisA
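Reading the magic number by hand could look roughly like the sketch below. It is an illustration only, not code from the thread: the names GZIP_MAGIC and open_by_magic are made up for the example, and it only tells gzip (whose streams start with the bytes 0x1f 0x8b) apart from everything else.

```python
import gzip

# First two bytes of a gzip stream, per the gzip file format (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def open_by_magic(path):
    """Return a binary file object, decompressing transparently if gzip.

    Hypothetical helper: peeks at the first two bytes, closes the probe
    handle, then reopens the file with the matching opener.
    """
    with open(path, "rb") as probe:
        is_gzip = probe.read(len(GZIP_MAGIC)) == GZIP_MAGIC
    if is_gzip:
        return gzip.open(path, "rb")
    return open(path, "rb")
```

Because the decision is made before the real open, the result can go straight into a single "with" block, so the processing loop is written only once.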
| From | Amit Saha <amitsaha.in@gmail.com> |
|---|---|
| Date | 2013-11-19 17:22 +1000 |
| Message-ID | <mailman.2890.1384846203.18130.python-list@python.org> |
| In reply to | #59959 |
On Tue, Nov 19, 2013 at 5:13 PM, Victor Hooi <victorhooi@gmail.com> wrote:
> Hi,
>
> I have a script that needs to handle input files of different types (uncompressed, gzipped etc.).
>
> My question is regarding how I should handle the different cases.
>
> My first thought was to use a try-catch block and attempt to open it using the most common filetype, then if that failed, try the next most common type etc. before finally erroring out.
>
> So basically, using exception handling for flow-control.
>
> However, is that considered bad practice, or un-Pythonic?
>
> What other alternative constructs could I also use, and pros and cons?
>
> (I was thinking I could also use python-magic which wraps libmagic, or I can just rely on file extensions).
>
> Other thoughts?
How about starting with a dictionary like this:

file_opener = {'.gz': gz_opener,
               '.txt': text_opener,
               '.zip': zip_opener}
# and so on.

where the *_opener names are, say, functions which do the job of actually
opening the files.
The above dictionary is keyed on file extensions, but perhaps you
would be better off using MIME types instead.
Assuming you go ahead with using MIME types, how about using
python-magic to detect the type and then looking it up in your
dictionary above to see if there is a corresponding opener? If you
get a KeyError, you can raise an exception saying that you cannot
handle this file.
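A minimal sketch of that dispatch idea, keyed on file extension and using only the standard library. The names FILE_OPENERS, text_opener and open_by_extension are invented for illustration, and bz2 stands in for the '.zip' entry above because a zip archive can hold several members and does not map onto a single line-iterable file object.

```python
import bz2
import gzip
import os


def text_opener(path):
    """Open an uncompressed file in binary mode."""
    return open(path, "rb")


# Maps a lower-cased extension to a callable that takes a path and
# returns a readable binary file object.
FILE_OPENERS = {
    ".gz": gzip.open,   # gzip.open defaults to mode 'rb'
    ".bz2": bz2.open,   # bz2.open defaults to mode 'rb'
    ".txt": text_opener,
}


def open_by_extension(path):
    """Look up the opener for path's extension and call it."""
    ext = os.path.splitext(path)[1].lower()
    try:
        opener = FILE_OPENERS[ext]
    except KeyError:
        # Turn the lookup failure into a clearer error for the caller.
        raise ValueError("cannot handle file type %r" % ext) from None
    return opener(path)
```

Keying the same dictionary on MIME types instead would only change how the lookup key is computed.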
How does that sound?
Best,
Amit.
--
http://echorand.me
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2013-11-19 09:36 +0000 |
| Message-ID | <mailman.2897.1384853817.18130.python-list@python.org> |
| In reply to | #59959 |
On 19/11/2013 07:13, Victor Hooi wrote:
> So basically, using exception handling for flow-control.
>
> However, is that considered bad practice, or un-Pythonic?
If it works for you use it, practicality beats purity :)
--
Python is the second best programming language in the world.
But the best has yet to be invented. Christian Tismer
Mark Lawrence
| From | Victor Hooi <victorhooi@gmail.com> |
|---|---|
| Date | 2013-11-19 16:30 -0800 |
| Message-ID | <4c5f2389-177c-4056-8857-19e2950f8aa7@googlegroups.com> |
| In reply to | #59969 |
Hi,
Is either approach (try-excepts, or using libmagic) considered more idiomatic? What would you guys prefer yourselves?
Also, is it possible to use either approach with a context manager ("with"), without duplicating lots of code?
For example:

try:
    with gzip.open('blah.txt', 'rb') as f:
        for line in f:
            print(line)
except IOError as e:
    with open('blah.txt', 'rb') as f:
        for line in f:
            print(line)

I'm not sure of how to do this without needing to duplicate the processing lines (everything inside the with)?
And using:

try:
    f = gzip.open('blah.txt', 'rb')
except IOError as e:
    f = open('blah.txt', 'rb')
finally:
    for line in f:
        print(line)

won't work, since the exception won't get thrown until you actually try to read from the file. Plus, I'm under the impression that I should be using context-managers where I can.
Also, on another note, python-magic will return a string as a result, e.g.:
gzip compressed data, was "blah.txt", from Unix, last modified: Wed Nov 20 10:48:35 2013
I suppose it's enough to just do a?
if "gzip compressed data" in results:
or is there a better way?
Cheers,
Victor
On Tuesday, 19 November 2013 20:36:47 UTC+11, Mark Lawrence wrote:
> On 19/11/2013 07:13, Victor Hooi wrote:
> > So basically, using exception handling for flow-control.
> >
> > However, is that considered bad practice, or un-Pythonic?
>
> If it works for you use it, practicality beats purity :)
>
> --
> Python is the second best programming language in the world.
> But the best has yet to be invented. Christian Tismer
>
> Mark Lawrence
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2013-11-20 01:56 +0000 |
| Message-ID | <528c16b5$0$29992$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #60043 |
On Tue, 19 Nov 2013 16:30:46 -0800, Victor Hooi wrote:
> Hi,
>
> Is either approach (try-excepts, or using libmagic) considered more
> idiomatic? What would you guys prefer yourselves?
Specifically in the case of file types, I consider it better to use
libmagic. But as a general technique, using try...except is a reasonable
approach in many situations.
> Also, is it possible to use either approach with a context manager
> ("with"), without duplicating lots of code?
>
> For example:
>
> try:
>     with gzip.open('blah.txt', 'rb') as f:
>         for line in f:
>             print(line)
> except IOError as e:
>     with open('blah.txt', 'rb') as f:
>         for line in f:
>             print(line)
>
> I'm not sure of how to do this without needing to duplicate the
> processing lines (everything inside the with)?
Write a helper function:

def process(opener):
    with opener('blah.txt', 'rb') as f:
        for line in f:
            print(line)

try:
    process(gzip.open)
except IOError:
    process(open)

If you have many different things to try:

for opener in [gzip.open, open, ...]:
    try:
        process(opener)
    except IOError:
        continue
    else:
        break
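A self-contained variant of the loop above, as a sketch: the function names read_lines and read_with_first_working_opener are hypothetical, the file path is passed in rather than hard-coded, and the lines are returned instead of printed. It also reports a failure when no opener works, which the bare loop silently ignores. Note that gzip.open has to come before open, since the plain opener will happily read anything.

```python
import gzip


def read_lines(path, opener):
    """Read every line of path using the given opener."""
    with opener(path, "rb") as f:
        return list(f)


def read_with_first_working_opener(path, openers=(gzip.open, open)):
    """Try each opener in order and return the lines from the first
    one that manages to read the file."""
    last_error = None
    for opener in openers:
        try:
            return read_lines(path, opener)
        except OSError as exc:  # IOError is an alias of OSError on 3.3+
            last_error = exc
    # Every opener failed (for example, the file does not exist).
    raise OSError("no opener could read %r" % path) from last_error
```

Wrapping the whole read in the try block matters here, because a file that is not gzip data is typically only rejected once reading starts.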
[...]
> Also, on another note, python-magic will return a string as a result,
> e.g.:
>
> gzip compressed data, was "blah.txt", from Unix, last modified: Wed Nov
> 20 10:48:35 2013
>
> I suppose it's enough to just do a?
>
> if "gzip compressed data" in results:
>
> or is there a better way?
*shrug*
Read the docs of python-magic. Do they offer a programmable API? If not,
that kinda sucks.
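For what it's worth, the python-magic package on PyPI does seem to offer something more structured than the description string: magic.from_file() accepts mime=True and returns a MIME type. A hedged sketch follows. The helper names are invented, python-magic is a third-party dependency that must be installed separately, and the exact MIME string reported for gzip depends on the libmagic version, so the two values below are assumptions meant to cover the common cases.

```python
# MIME types assumed to be reported for gzip data; which one you get
# depends on the installed libmagic version.
GZIP_MIME_TYPES = frozenset(["application/gzip", "application/x-gzip"])


def is_gzip_mime(mime_type):
    """Return True if a MIME type string denotes gzip data."""
    return mime_type in GZIP_MIME_TYPES


def detect_mime(path):
    """Ask libmagic for the MIME type of the file at path."""
    # Imported lazily so the rest of the module still works when the
    # third-party python-magic package is not installed.
    import magic
    return magic.from_file(path, mime=True)
```

A check such as is_gzip_mime(detect_mime('blah.txt')) would then replace the substring test on the description.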
--
Steven
| From | Neil Cerutti <mr.cerutti@gmail.com> |
|---|---|
| Date | 2013-11-20 10:05 -0500 |
| Message-ID | <mailman.2967.1384959910.18130.python-list@python.org> |
| In reply to | #60045 |
Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
> Write a helper function:
>
> def process(opener):
>     with opener('blah.txt', 'rb') as f:
>         for line in f:
>             print(line)
As another option, you can enter the context manager after you decide.

try:
    f = gzip.open('blah.txt', 'rb')
except IOError:
    f = open('blah.txt', 'rb')
with f:
    # processing
    for line in f:
        print(line)
contextlib.ExitStack was designed to handle cases where entering
context is optional, and so also works for this use case.

with contextlib.ExitStack() as stack:
    try:
        f = gzip.open('blah.txt', 'rb')
    except IOError:
        f = open('blah.txt', 'rb')
    stack.enter_context(f)
    for line in f:
        print(line)
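One caveat worth checking against your own Python version: in CPython, gzip.open() generally does not validate the gzip header at open time, and the error for a non-gzip file is only raised on the first read. Under that assumption, a try/except around the open call alone would never fall back to the plain opener. A possible variation, sketched below with the hypothetical helper name open_maybe_gzip, forces an early read with peek() before deciding.

```python
import gzip


def open_maybe_gzip(path):
    """Open path for binary reading, decompressing it if it is gzip data."""
    f = gzip.open(path, "rb")
    try:
        # peek() makes the gzip header get read without advancing the
        # file position, so a non-gzip file fails here rather than
        # later inside the processing loop.
        f.peek(1)
    except OSError:  # IOError is an alias of OSError on Python 3.3+
        f.close()
        return open(path, "rb")
    return f
```

Usage would then be "with open_maybe_gzip('blah.txt') as f:" followed by the same loop as before.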
--
Neil Cerutti
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2013-11-20 01:50 +0000 |
| Message-ID | <mailman.2948.1384913119.18130.python-list@python.org> |
| In reply to | #60043 |
On 20/11/2013 00:30, Victor Hooi wrote:
> Hi,
>
> Is either approach (try-excepts, or using libmagic) considered more idiomatic? What would you guys prefer yourselves?
>
> Also, is it possible to use either approach with a context manager ("with"), without duplicating lots of code?
>
> For example:
>
> try:
>     with gzip.open('blah.txt', 'rb') as f:
>         for line in f:
>             print(line)
> except IOError as e:
>     with open('blah.txt', 'rb') as f:
>         for line in f:
>             print(line)
>
> I'm not sure of how to do this without needing to duplicate the processing lines (everything inside the with)?
>
> And using:
>
> try:
>     f = gzip.open('blah.txt', 'rb')
> except IOError as e:
>     f = open('blah.txt', 'rb')
> finally:
>     for line in f:
>         print(line)
>
> won't work, since the exception won't get thrown until you actually try to read from the file. Plus, I'm under the impression that I should be using context-managers where I can.
>
> Also, on another note, python-magic will return a string as a result, e.g.:
>
> gzip compressed data, was "blah.txt", from Unix, last modified: Wed Nov 20 10:48:35 2013
>
> I suppose it's enough to just do a?
>
> if "gzip compressed data" in results:
>
> or is there a better way?
>
> Cheers,
> Victor
Something like

for filetype in filetypes:
    try:
        process(filetype)
        break
    except IOError:
        pass

??? as it's 01:50 GMT and I can't sleep :(
--
Python is the second best programming language in the world.
But the best has yet to be invented. Christian Tismer
Mark Lawrence