| References | (3 earlier) <mailman.869.1366506610.3114.python-list@python.org> <kl0opb$pcr$1@theodyn.ncf.ca> <atl0i2Fto6uU2@mid.individual.net> <kl3stb$5ck$1@theodyn.ncf.ca> <atnh2jFgv8iU1@mid.individual.net> |
|---|---|
| Date | 2013-04-23 09:36 -0500 |
| Subject | Re: There must be a better way |
| From | Skip Montanaro <skip@pobox.com> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.975.1366727824.3114.python-list@python.org> |
> But a csv.DictReader might still be more efficient.
Depends on what efficiency you care about. The DictReader class is
implemented in Python, and builds a dict for every row. It will never
be more efficient CPU-wise than instantiating the csv.reader type
directly and only doing what you need.
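To make "builds a dict for every row" concrete, here is a rough sketch of the extra per-row work, written on top of the raw reader. It is a simplification, not the library's actual code: the name `simple_dict_rows` is made up for illustration, and the handling of short or long rows (`restkey`/`restval`) is left out.

```python
import csv

def simple_dict_rows(lines):
    """Hypothetical, simplified stand-in for csv.DictReader."""
    reader = csv.reader(lines)
    fieldnames = next(reader)  # first row supplies the keys
    for row in reader:
        # One dict is built per row; this is the extra work the raw
        # reader does not do.
        yield dict(zip(fieldnames, row))

lines = ["a,b,c", "1,2,3", "4,5,6"]
rows = list(simple_dict_rows(lines))
print(rows)
```

The raw reader hands back a plain list per row and stops there, which is where the CPU difference comes from.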
OTOH, the DictReader class "just works" and its usage is more obvious
when you come back later to modify your code. It also makes the code
insensitive to column ordering (though yours seems to be as well, if
I'm reading it correctly). On the programmer efficiency axis, I score
the DictReader class higher than the reader type.
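A small illustration of the column-ordering point, using two made-up inputs that hold the same data with the columns shuffled. The column names and the helper functions `values_by_name` and `values_by_position` are assumptions for the example, not anything from the original code.

```python
import csv
import io

# Same data, different column order (hypothetical sample inputs).
data_a = "time,value,flag\n05:38:24,0.6326,1\n"
data_b = "flag,time,value\n1,05:38:24,0.6326\n"

def values_by_name(text):
    # DictReader takes its keys from the header row, so lookup is by name.
    return [float(row["value"]) for row in csv.DictReader(io.StringIO(text))]

def values_by_position(text, index=1):
    # The raw reader only knows positions; index 1 is hard-coded here.
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return [row[index] for row in reader]

print(values_by_name(data_a), values_by_name(data_b))
print(values_by_position(data_a), values_by_position(data_b))
```

The name-based lookup returns the same values for both inputs, while the position-based one silently picks up the time column once the order changes.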
A simple test:
##########################
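import csv
from timeit import Timer
# Setup code: one header line plus a million identical data rows.
setup = '''import csv
lst = ["""a,b,c,d,e,f,g"""]
lst.extend(["""05:38:24,0.6326,1,0,1.0,0.0,0.0"""] * 1000000)
reader = csv.reader(lst)
dreader = csv.DictReader(lst)
'''
# Note: setup runs once per repetition, so each reader is consumed by the
# first of the 10 timed loops; the other 9 iterate an exhausted reader.
t1 = Timer("for row in reader: pass", setup)
t2 = Timer("for row in dreader: pass", setup)
# Best total time (seconds) across the repetitions.
print(min(t1.repeat(number=10)))
print(min(t2.repeat(number=10)))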
###############################
demonstrates that the raw reader is, indeed, much faster than the DictReader
(about 8.5 times here; times in seconds, reader first):
0.972723007202
8.29047989845
but that's for the basic iteration. Whatever you need to add to the
raw reader to insulate yourself from changes to the structure of the
CSV file and improve readability will slow it down, while the
DictReader will never be worse than the above.
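For a sense of what "whatever you need to add" might look like, here is one possible sketch: resolve the wanted column names to positions once from the header, then index each row. The helper name `iter_columns` and the sample data are invented for illustration, and this is only one of several ways to do it.

```python
import csv
import io

def iter_columns(lines, wanted):
    """Yield tuples of the wanted columns, looked up by header name."""
    reader = csv.reader(lines)
    header = next(reader)
    # Resolve names to positions once, not once per row.
    indexes = [header.index(name) for name in wanted]
    for row in reader:
        yield tuple(row[i] for i in indexes)

sample = io.StringIO("a,b,c\n1,2,3\n4,5,6\n")
rows = list(iter_columns(sample, ["c", "a"]))
print(rows)
```

This keeps the per-row cost below building a full dict, but it is extra code to write and maintain, and a missing column name raises ValueError from `header.index`.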
Skip