Re: Separate Rows in reader

Date 2013-03-24 13:28 -0500
From Tim Chase <python.list@tim.thechases.com>
Subject Re: Separate Rows in reader
References (1 earlier) <mailman.3662.1364104023.2939.python-list@python.org> <5a1660c3-ca17-452c-ae0f-ee61f4319a8f@googlegroups.com> <514EF9A3.2000603@davea.name> <mailman.3668.1364132867.2939.python-list@python.org> <ef3267b4-ea6a-48c2-b9fa-acca671ae3d1@u5g2000pbs.googlegroups.com>
Newsgroups comp.lang.python
Message-ID <mailman.3683.1364149622.2939.python-list@python.org>


On 2013-03-24 08:57, rusi wrote:
> On Mar 24, 6:49 pm, Tim Chase <python.l...@tim.thechases.com> wrote:
> After doing:
> 
> >>> import csv
> >>> original = file('friends.csv', 'rU')
> >>> reader = csv.reader(original, delimiter='\t')
> 
> 
> Stripping of the first line is:
> >>> list(reader)[1:]
> >>> [tuple(row) for row in list(reader)[1:]]
> >>> map(tuple,list(reader)[1:])

This works for small sources, but list(reader) slurps all the data
into memory.  Because csv.reader returns an iterator, it can process
huge CSV files that wouldn't otherwise fit in memory.  Calling
reader.next() (or "next(reader)" in Python 2.6 and later) fetches a
single record from the iterator, to be discarded or stored as
appropriate.
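
As a minimal sketch of that pattern (written for Python 3, where the
built-in next() replaces reader.next(); the helper name
rows_without_header and the sample data are made up for illustration):

```python
import csv
import io


def rows_without_header(lines, delimiter='\t'):
    """Yield each data row as a tuple, skipping the first (header) row."""
    reader = csv.reader(lines, delimiter=delimiter)
    # Discard the header. The None default avoids StopIteration
    # if the input turns out to be empty.
    next(reader, None)
    for row in reader:
        # Only one row is held at a time; nothing is accumulated here.
        yield tuple(row)


# Hypothetical in-memory stand-in for a tab-delimited file.
sample = io.StringIO("name\tage\nAlice\t30\nBob\t25\n")
rows = list(rows_without_header(sample))
```

The list() call at the end is only there to show the result; on a
large file you would loop over the generator instead.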


> Then you can of course make your code more performant thus:
> >>> reader.next()
> >>> (tuple(row) for row in reader)
> 
> In the majority of cases this optimization is not worth it

If the CSV file is large, using the iterator version is usually worth
the small performance penalty, as you don't have to keep the whole
file in memory.  As somebody who regularly deals with 0.5-1 GB CSV
files from cellular providers, I speak from experience of having my
machine choke when reading the whole thing in at once.
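
For illustration, here is a sketch of working through a file one row
at a time, so memory use should stay roughly flat however big the file
is.  The function name count_data_rows and the tab delimiter are
assumptions for the example; it is written for Python 3 and opens the
file with newline='' as the csv module documentation recommends:

```python
import csv


def count_data_rows(path, delimiter='\t'):
    """Count the rows after the header without loading the whole file."""
    with open(path, newline='') as f:
        reader = csv.reader(f, delimiter=delimiter)
        next(reader, None)  # skip the header row, if there is one
        # The generator expression consumes rows one by one;
        # no list of rows is ever built.
        return sum(1 for _ in reader)
```

The same loop shape works for any per-row processing: replace the
sum() with a for-loop that does what you want with each row.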

> In any case, strewing prints all over the code is a bad habit
> (except for debugging).

Sorry if my print-statements were misinterpreted--I meant them as a
"do what you want with the data here" stand-in (thus the ellipsis).

-tkc




