


Re: Fast forward-backward (write-read)

MIME-Version 1.0
In-Reply-To <50871ff6$0$29978$c3e8da3$5496439d@news.astraweb.com>
References <5086AA35.4000509@it.uu.se> <mailman.2694.1351029058.27098.python-list@python.org> <50871ff6$0$29978$c3e8da3$5496439d@news.astraweb.com>
Date Tue, 23 Oct 2012 19:34:15 -0400
Subject Re: Fast forward-backward (write-read)
From David Hutto <dwightdhutto@gmail.com>
To "Steven D'Aprano" <steve+comp.lang.python@pearwood.info>
Content-Type text/plain; charset=ISO-8859-1
Cc python-list@python.org
Newsgroups comp.lang.python
Message-ID <mailman.2701.1351035258.27098.python-list@python.org>



On Tue, Oct 23, 2012 at 6:53 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> On Tue, 23 Oct 2012 17:50:55 -0400, David Hutto wrote:
>
>> On Tue, Oct 23, 2012 at 10:31 AM, Virgil Stokes <vs@it.uu.se> wrote:
>>> I am working with some rather large data files (>100GB)
> [...]
>>> Finally, to my question --- What is a fast way to write these variables
>>> to an external file and then read them in backwards?
>>
>> Don't forget to use timeit for an average OS utilization.
>
> Given that the data files are larger than 100 gigabytes, the time
> required to process each file is likely to be in hours, not microseconds.
> That being the case, timeit is the wrong tool for the job; it is
> optimized for timing tiny code snippets. You could use it, of course,
> but the added inconvenience doesn't gain you any added accuracy.

It depends on the end result. If the iterations each take about the
same time, then timing just a segment of them gives a figure that can
be scaled, and a full run might still be worth it if you have a second
computer running the optimization.
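
A rough sketch of that scaling idea follows. It assumes every item costs about the same to process, which is exactly the condition stated above; the names `estimate_total_seconds`, `process`, and `sample_size` are hypothetical and not from this thread.

```python
import time
from itertools import islice


def estimate_total_seconds(process, items, total_count, sample_size=1000):
    """Time `process` on a leading sample of `items` and extrapolate linearly.

    Assumption: per-item cost is roughly constant, so the sample average
    is representative of the whole run.
    """
    timed = 0
    start = time.perf_counter()
    for item in islice(items, sample_size):  # consume at most sample_size items
        process(item)
        timed += 1
    elapsed = time.perf_counter() - start
    if timed == 0:
        return 0.0  # nothing to time, so nothing to extrapolate
    return (elapsed / timed) * total_count
```

If the per-item cost drifts over the file (caching effects, disk layout), the linear estimate will be off, so it is probably best treated as an order-of-magnitude figure.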

>
> Here's a neat context manager that makes timing long-running code simple:
>
>
> http://code.activestate.com/recipes/577896
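
For readers who cannot open the link: the block below is NOT the ActiveState recipe, only a minimal sketch of what a timing context manager for long-running code might look like, assuming Python 3. The name `stopwatch` is made up for illustration.

```python
import time
from contextlib import contextmanager


@contextmanager
def stopwatch(label="block"):
    """Measure wall-clock time of the enclosed block (hypothetical helper)."""
    result = {"seconds": None}
    start = time.perf_counter()
    try:
        yield result
    finally:
        # Record the elapsed time even if the block raised an exception.
        result["seconds"] = time.perf_counter() - start
        print(f"{label}: {result['seconds']:.3f} s")


# Usage: time one long-running step once, instead of repeating it as timeit does.
with stopwatch("sum of range") as timing:
    total = sum(range(1_000_000))
```

Unlike timeit, this runs the code once and does not disable garbage collection, which seems acceptable when a single run takes minutes or hours.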


I'll test this out for big O notation later. For the OP:

http://en.wikipedia.org/wiki/Big_O_notation

>
>> I'd suggest two list comprehensions for now, until I've reviewed it some
>> more:
>
> I would be very surprised if the poster will be able to fit 100 gigabytes
> of data into even a single list comprehension, let alone two.
Again, these can be scaled depending on the operations of the function
in question and the average time of the aforementioned function(s).
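
One way to keep the comprehension style without holding the whole file in memory is a generator that yields one chunk at a time. This is only a sketch; `read_chunks`, `total_bytes`, and the 1 MiB default chunk size are assumptions for illustration, not something proposed in the thread.

```python
def read_chunks(path, chunk_size=1 << 20):
    """Yield successive byte chunks so only one chunk is in memory at a time."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            yield chunk


def total_bytes(path, chunk_size=1 << 20):
    """Example consumer: a generator expression, not a list comprehension."""
    return sum(len(chunk) for chunk in read_chunks(path, chunk_size))
```

The difference from a list comprehension is that nothing is accumulated: each chunk can be discarded as soon as it has been processed.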

>
> This is a classic example of why the old external processing algorithms
> of the 1960s and 70s will never be obsolete. No matter how much memory
> you have, there will always be times when you want to process more data
> than you can fit into memory.

This is a common misconception. You can engineer a device that
accommodates this if it's a direct experimental necessity.
>
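
For the OP's actual question (write to an external file, then read it back in reverse), one external-processing approach is fixed-size binary records plus seek. The sketch below assumes each record is three float64 values; that layout, and the names `RECORD`, `write_records`, and `read_records_backward`, are hypothetical, since the thread does not say what the variables look like.

```python
import os
import struct

# Assumed record layout: three little-endian float64 values (24 bytes).
RECORD = struct.Struct("<3d")


def write_records(path, rows):
    """Append-free writer: pack each row into a fixed-size binary record."""
    with open(path, "wb") as f:
        for row in rows:
            f.write(RECORD.pack(*row))


def read_records_backward(path):
    """Yield records from last to first without loading the whole file."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        count = f.tell() // RECORD.size  # any trailing partial record is ignored
        for index in range(count - 1, -1, -1):
            f.seek(index * RECORD.size)
            yield RECORD.unpack(f.read(RECORD.size))
```

Seeking once per record is simple but probably slow on a file this large; reading a block of many records per seek and reversing it in memory would likely cut the number of system calls considerably.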

-- 
Best Regards,
David Hutto
CEO: http://www.hitwebdevelopment.com



Thread

Re: Fast forward-backward (write-read) David Hutto <dwightdhutto@gmail.com> - 2012-10-23 17:50 -0400
  Re: Fast forward-backward (write-read) Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-23 22:53 +0000
    Re: Fast forward-backward (write-read) Demian Brecht <demianbrecht@gmail.com> - 2012-10-23 15:57 -0700
    Re: Fast forward-backward (write-read) David Hutto <dwightdhutto@gmail.com> - 2012-10-23 19:34 -0400
    Re: Fast forward-backward (write-read) Virgil Stokes <vs@it.uu.se> - 2012-10-24 09:17 +0200
    Re: Fast forward-backward (write-read) Virgil Stokes <vs@it.uu.se> - 2012-10-24 09:19 +0200
    Re: Fast forward-backward (write-read) David Hutto <dwightdhutto@gmail.com> - 2012-10-24 03:26 -0400
    Re: Fast forward-backward (write-read) Grant Edwards <invalid@invalid.invalid> - 2012-10-24 13:56 +0000
