| Header | Value |
|---|---|
| Date | Tue, 23 Oct 2012 20:01:36 -0400 |
| Subject | Re: Fast forward-backward (write-read) |
| From | David Hutto <dwightdhutto@gmail.com> |
| To | emile <emile@fenx.com> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.2705.1351036898.27098.python-list@python.org> |
On Tue, Oct 23, 2012 at 7:35 PM, emile <emile@fenx.com> wrote:
> On 10/23/2012 04:19 PM, David Hutto wrote:
>>
>> Whether this is fast enough, or not, I don't know:
>
>
> well, the OP's original post started with
> "I am working with some rather large data files (>100GB)..."
Well, is this a dedicated system, and one that they have the budget to upgrade?
Data files usually have some record structure to parse, unless the whole
thing is one huge dict or list, so there has to be an average record size.
So a big O estimate should start to firm up from a sample, without a full file.
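
A minimal sketch of the sampling idea, assuming newline-delimited records and
Python 3. The function name and the 1 MiB sample size are hypothetical, and
the estimate is only as good as the sample is representative of the whole file:

```python
import os

def estimate_line_count(path, sample_bytes=1 << 20):
    """Estimate (average line size, total line count) from the start of a file.

    Hypothetical helper: assumes newline-delimited records and that the
    first sample_bytes bytes are representative of the rest of the file.
    """
    total_size = os.path.getsize(path)
    with open(path, "rb") as f:
        sample = f.read(sample_bytes)
    newlines = sample.count(b"\n")
    if newlines == 0:
        return None, None  # no complete line in the sample; cannot estimate
    # Measure only up to the last complete line in the sample.
    complete_bytes = sample.rfind(b"\n") + 1
    avg_line_size = complete_bytes / newlines
    return avg_line_size, total_size / avg_line_size
```

Only the sample is read into memory, so this costs the same on a 100GB file
as on a small one.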
>
>
>> filename = "data_file.txt"
>> f = open(filename, 'r')
>> forward = [line.rstrip('\n') for line in f.readlines()]
>
>
> f.readlines() will be big(!) and have overhead... and forward results in
> something again as big.
>
Not if an average can be taken first, and then refined as the actual
gigabytes are being iterated through.
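
One way to refine the average while iterating, sketched under the assumption
of Python 3 and newline-delimited data. The generator name is hypothetical.
It iterates over the file object directly instead of calling f.readlines(),
so only one line is held at a time:

```python
def iter_with_running_average(path):
    """Yield (line, running average line size in bytes), one line at a time.

    Hypothetical helper: the file is iterated lazily, so memory use stays
    roughly constant regardless of file size.
    """
    total_bytes = 0
    with open(path, "rb") as f:
        for count, raw in enumerate(f, start=1):
            total_bytes += len(raw)  # includes the trailing newline, if any
            yield raw.rstrip(b"\n"), total_bytes / count
```

This does not keep the lines around for a later backward pass; it only shows
that the forward pass and the size estimate do not need the whole file.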
>
>> backward = [line.rstrip('\n') for line in reversed(forward)]
>
>
> and defining backward looks to me to require space to build backward and
> hold reversed(forward)
>
> So, let's see, at that point in time (building backward) you've got
> probably somewhere close to 400-500GB in memory.
>
> My guess -- probably not so fast. Thrashing is sure to be a factor on all
> but machines I'll never have a chance to work on.
But does the OP have access to such machines? They never stated their
hardware or their upgrade budget.
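
If the hardware turns out to be modest, the backward pass could be done
without building a reversed list at all. A rough sketch, assuming Python 3
and newline-delimited data; the function name and chunk size are
hypothetical, and this is a different technique from the list-based code
quoted in this thread:

```python
import os

def reversed_lines(path, chunk_size=64 * 1024):
    """Yield the lines of a file last-to-first, as bytes without newlines.

    Hypothetical helper: reads fixed-size chunks from the end of the file,
    so memory use is bounded by chunk_size plus the longest line.
    """
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        position = f.tell()
        tail = b""    # partial line carried over from the chunk read before
        first = True  # True until the chunk at the end of file is handled
        while position > 0:
            read_size = min(chunk_size, position)
            position -= read_size
            f.seek(position)
            chunk = f.read(read_size) + tail
            if first:
                # Drop one trailing newline so it does not yield a
                # spurious empty last line.
                if chunk.endswith(b"\n"):
                    chunk = chunk[:-1]
                first = False
            lines = chunk.split(b"\n")
            tail = lines[0]  # may continue into the chunk before this one
            for line in reversed(lines[1:]):
                yield line
        if not first:
            yield tail  # the first line of the file
```

Whether this is fast enough depends on the disk rather than on RAM, since
each chunk is read once and then discarded.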
>
>
>> f.close()
>> print forward, "\n\n", "********************\n\n", backward, "\n"
>
>
>
> It's good to retain context.
Trying to practice good form ;).
--
Best Regards,
David Hutto
CEO: http://www.hitwebdevelopment.com