Re: Fast forward-backward (write-read)

From emile <emile@fenx.com>
Subject Re: Fast forward-backward (write-read)
Date Tue, 23 Oct 2012 16:35:40 -0700
Newsgroups comp.lang.python


On 10/23/2012 04:19 PM, David Hutto wrote:
> Whether this is fast enough, or not, I don't know:

Well, the OP's original post started with
   "I am working with some rather large data files (>100GB)..."

> filename = "data_file.txt"
> f = open(filename, 'r')
> forward =  [line.rstrip('\n') for line in f.readlines()]

f.readlines() builds a list of every line in the file, so it will be 
big(!) and carry per-line overhead... and forward ends up as a second 
list that is just as big.
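
For comparison, here is a minimal sketch (Python 3 syntax) of walking the file forward without ever building the full list. The name iter_lines is made up for illustration and is not from the thread; it only relies on the fact that a file object yields its lines lazily when you iterate over it.

```python
def iter_lines(path):
    """Yield each line of a text file without its trailing newline.

    Only one line is held in memory at a time, because iterating over
    the file object reads lazily instead of loading everything the way
    f.readlines() does.
    """
    with open(path, "r") as f:
        for line in f:
            yield line.rstrip("\n")
```

Something like "for line in iter_lines(filename): ..." then handles one line per step, at the cost of not being able to index into the data.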

> backward =  [line.rstrip('\n') for line in reversed(forward)]

And defining backward looks to me to require space to build yet another 
list that big (reversed(forward) itself is only an iterator, but the new 
list is a full copy).

So, let's see, at that point in time (building backward) you've probably 
got somewhere close to 400-500 GB in memory.

My guess -- probably not so fast.  Thrashing is sure to be a factor on 
all but machines I'll never have a chance to work on.
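
If the goal is just to visit the lines last-to-first, one way to avoid holding everything is to read the file from the end in fixed-size blocks. The following is only a sketch under some assumptions: Python 3, lines terminated by "\n", and a made-up helper name reverse_lines with a block_size parameter I picked arbitrarily. It is not what the OP or David posted.

```python
import os


def reverse_lines(path, block_size=64 * 1024):
    """Yield the lines of a file last-to-first, as bytes without the newline.

    The file is read backward one block at a time, so memory use is
    roughly one block plus the longest line, not the whole file.
    """
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        tail = b""           # start of a line whose beginning is in an earlier block
        first_block = True
        while pos > 0:
            step = min(block_size, pos)
            pos -= step
            f.seek(pos)
            chunk = f.read(step) + tail
            if first_block:
                first_block = False
                # A final newline ends the last line; it does not start an empty one.
                if chunk.endswith(b"\n"):
                    chunk = chunk[:-1]
            lines = chunk.split(b"\n")
            tail = lines[0]  # possibly incomplete, finished by the next (earlier) block
            for line in reversed(lines[1:]):
                yield line
        if not first_block:  # the file was not empty, so emit its first line
            yield tail
```

Each item comes back as bytes, so decode it if you need text. Whether this is fast enough on a >100GB file depends on the disk, but at least it should not thrash.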


> f.close()
> print forward, "\n\n", "********************\n\n", backward, "\n"


It's good to retain context.

Emile
