Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #41800

Re: Separate Rows in reader

Path csiph.com!usenet.pasdenom.info!news.albasani.net!newsfeed.freenet.ag!news2.euro.net!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail
Return-Path <python.list@tim.thechases.com>
X-Original-To python-list@python.org
Delivered-To python-list@mail.python.org
X-Spam-Status OK 0.004
X-Spam-Evidence '*H*': 0.99; '*S*': 0.00; '(except': 0.05; 'memory.': 0.05; 'appropriate.': 0.09; "wouldn't": 0.11; 'cases': 0.15; '-tkc': 0.16; '24,': 0.16; 'csv': 0.16; 'doing:': 0.16; 'fetches': 0.16; 'from:addr:python.list': 0.16; 'from:addr:tim.thechases.com': 0.16; 'from:name:tim chase': 0.16; 'here"': 0.16; 'iterator': 0.16; 'row': 0.16; 'sources,': 0.16; 'subject:reader': 0.16; 'versions),': 0.16; 'wrote:': 0.17; 'tim': 0.18; '>>>': 0.18; '(or': 0.18; 'import': 0.21; 'meant': 0.21; 'large,': 0.22; 'somebody': 0.23; 'machine': 0.24; 'header:In- Reply-To:1': 0.25; 'fit': 0.26; 'in.': 0.27; 'newer': 0.27; 'record': 0.28; '"do': 0.29; 'chase': 0.29; 'prints': 0.29; 'case,': 0.29; 'usually': 0.30; 'code': 0.31; 'file': 0.32; 'to:addr:python-list': 0.33; 'version': 0.34; 'but': 0.36; 'bad': 0.37; 'optimization': 0.37; 'data': 0.37; 'subject:: ': 0.38; 'files': 0.38; 'speak': 0.38; 'performance': 0.39; 'to:addr:python.org': 0.39; 'your': 0.60; 'first': 0.61; 'deals': 0.62; 'worth': 0.63; 'more': 0.63; 'regularly': 0.65; 'cellular': 0.84; 'penalty,': 0.84; 'received:50.22': 0.84; 'habit': 0.91; 'rusi': 0.91
Date Sun, 24 Mar 2013 13:28:44 -0500
From Tim Chase <python.list@tim.thechases.com>
To python-list@python.org
Subject Re: Separate Rows in reader
In-Reply-To <ef3267b4-ea6a-48c2-b9fa-acca671ae3d1@u5g2000pbs.googlegroups.com>
References <fdcdb6d8-c63a-4909-8b56-13204a0d474c@googlegroups.com> <mailman.3662.1364104023.2939.python-list@python.org> <5a1660c3-ca17-452c-ae0f-ee61f4319a8f@googlegroups.com> <514EF9A3.2000603@davea.name> <mailman.3668.1364132867.2939.python-list@python.org> <ef3267b4-ea6a-48c2-b9fa-acca671ae3d1@u5g2000pbs.googlegroups.com>
X-Mailer Claws Mail 3.7.6 (GTK+ 2.20.1; x86_64-pc-linux-gnu)
Mime-Version 1.0
Content-Type text/plain; charset=UTF-8
Content-Transfer-Encoding quoted-printable
X-AntiAbuse This header was added to track abuse, please include it with any abuse report
X-AntiAbuse Primary Hostname - boston.accountservergroup.com
X-AntiAbuse Original Domain - python.org
X-AntiAbuse Originator/Caller UID/GID - [47 12] / [47 12]
X-AntiAbuse Sender Address Domain - tim.thechases.com
X-BeenThere python-list@python.org
X-Mailman-Version 2.1.15
Precedence list
List-Id General discussion list for the Python programming language <python-list.python.org>
List-Unsubscribe <http://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe>
List-Archive <http://mail.python.org/pipermail/python-list/>
List-Post <mailto:python-list@python.org>
List-Help <mailto:python-list-request@python.org?subject=help>
List-Subscribe <http://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe>
Newsgroups comp.lang.python
Message-ID <mailman.3683.1364149622.2939.python-list@python.org> (permalink)
Lines 43
NNTP-Posting-Host 2001:888:2000:d::a6
X-Trace 1364149622 news.xs4all.nl 6853 [2001:888:2000:d::a6]:34364
X-Complaints-To abuse@xs4all.nl
Xref csiph.com comp.lang.python:41800

Show key headers only | View raw


On 2013-03-24 08:57, rusi wrote:
> On Mar 24, 6:49 pm, Tim Chase <python.l...@tim.thechases.com> wrote:
> After doing:
> 
> >>> import csv
> >>> original = file('friends.csv', 'rU')
> >>> reader = csv.reader(original, delimiter='\t')
> 
> 
> Stripping of the first line is:
> >>> list(reader)[1:]
> >>> [tuple(row) for row in list(reader)[1:]]
> >>> map(tuple,list(reader)[1:])

This works for small sources, but slurps all the data into memory.
Because csv.reader is an iterator/generator, it can process huge CSV
files that wouldn't otherwise fit in memory.  By using either
r.next() (or "next(r)" in newer versions), it fetches one record from
the generator, to be discarded/stored as appropriate.


> Then you can of course make your code more performant thus:
> >>> reader.next()
> >>> (tuple(row) for row in reader)
> 
> In the majority of cases this optimization is not worth it

If the CSV file is large, using the iterator version is usually worth
the small performance penalty, as you don't have to keep the whole
file in memory.  As somebody who regularly deals with 0.5-1GB CSV
files from cellular providers, I speak from experience of having my
machine choke when reading the whole thing in.

> In any case, strewing prints all over the code is a bad habit
> (except for debugging).

Sorry if my print-statements were misinterpreted--I meant them as a
"do what you want with the data here" stand-in (thus the ellipsis).

-tkc


Back to comp.lang.python | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

Separate Rows in reader Jiewei Huang <jiewei24@gmail.com> - 2013-03-23 22:20 -0700
  Re: Separate Rows in reader Dave Angel <davea@davea.name> - 2013-03-24 01:46 -0400
    Re: Separate Rows in reader rusi <rustompmody@gmail.com> - 2013-03-24 00:34 -0700
      Re: Separate Rows in reader Jiewei Huang <jiewei24@gmail.com> - 2013-03-24 01:18 -0700
    Re: Separate Rows in reader Jiewei Huang <jiewei24@gmail.com> - 2013-03-24 01:11 -0700
      Re: Separate Rows in reader Dave Angel <davea@davea.name> - 2013-03-24 09:03 -0400
      Re: Separate Rows in reader Tim Chase <python.list@tim.thechases.com> - 2013-03-24 08:49 -0500
        Re: Separate Rows in reader rusi <rustompmody@gmail.com> - 2013-03-24 08:57 -0700
          Re: Separate Rows in reader Tim Chase <python.list@tim.thechases.com> - 2013-03-24 13:28 -0500
            Re: Separate Rows in reader rusi <rustompmody@gmail.com> - 2013-03-24 19:08 -0700
    Re: Separate Rows in reader Jiewei Huang <jiewei24@gmail.com> - 2013-03-24 01:11 -0700
  Re: Separate Rows in reader ypsun <winter0128@gmail.com> - 2013-03-24 03:58 -0700
  Re: Separate Rows in reader ypsun <winter0128@gmail.com> - 2013-03-24 04:10 -0700
    Re: Separate Rows in reader Jiewei Huang <jiewei24@gmail.com> - 2013-03-24 23:52 -0700
      Re: Separate Rows in reader rusi <rustompmody@gmail.com> - 2013-03-25 06:51 -0700
        Re: Separate Rows in reader Jiewei Huang <jiewei24@gmail.com> - 2013-03-25 18:05 -0700
          Re: Separate Rows in reader Dave Angel <davea@davea.name> - 2013-03-25 21:40 -0400
            Re: Separate Rows in reader Jiewei Huang <jiewei24@gmail.com> - 2013-03-25 20:33 -0700
              Re: Separate Rows in reader MRAB <python@mrabarnett.plus.com> - 2013-03-26 03:48 +0000
                Re: Separate Rows in reader Jiewei Huang <jiewei24@gmail.com> - 2013-03-26 00:24 -0700
                Re: Separate Rows in reader Jiewei Huang <jiewei24@gmail.com> - 2013-03-26 00:24 -0700
                Re: Separate Rows in reader Jiewei Huang <jiewei24@gmail.com> - 2013-03-27 02:35 -0700
                Re: Separate Rows in reader rusi <rustompmody@gmail.com> - 2013-03-27 04:18 -0700
                Re: Separate Rows in reader Jiewei Huang <jiewei24@gmail.com> - 2013-03-27 15:12 -0700
                Re: Separate Rows in reader Jiewei Huang <jiewei24@gmail.com> - 2013-03-27 15:26 -0700
                Re: Separate Rows in reader rusi <rustompmody@gmail.com> - 2013-03-27 18:24 -0700
                Re: Separate Rows in reader Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-03-28 01:32 +0000
                Re: Separate Rows in reader Jiewei Huang <jiewei24@gmail.com> - 2013-03-27 02:35 -0700
                Re: Separate Rows in reader Tim Roberts <timr@probo.com> - 2013-03-28 21:28 -0700
            Re: Separate Rows in reader Jiewei Huang <jiewei24@gmail.com> - 2013-03-25 20:33 -0700
  Re: Separate Rows in reader Jiewei Huang <jiewei24@gmail.com> - 2013-03-24 18:15 -0700

csiph-web