| Started by | David Hutto <dwightdhutto@gmail.com> |
|---|---|
| First post | 2012-10-23 20:01 -0400 |
| Last post | 2012-10-23 20:01 -0400 |
| Articles | 1 — 1 participant |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: Fast forward-backward (write-read) David Hutto <dwightdhutto@gmail.com> - 2012-10-23 20:01 -0400
| From | David Hutto <dwightdhutto@gmail.com> |
|---|---|
| Date | 2012-10-23 20:01 -0400 |
| Subject | Re: Fast forward-backward (write-read) |
| Message-ID | <mailman.2705.1351036898.27098.python-list@python.org> |
On Tue, Oct 23, 2012 at 7:35 PM, emile <emile@fenx.com> wrote:
> On 10/23/2012 04:19 PM, David Hutto wrote:
>>
>> Whether this is fast enough, or not, I don't know:
>
>
> well, the OP's original post started with
> "I am working with some rather large data files (>100GB)..."
Well, is this a dedicated system, and does the OP have the budget to upgrade it?
Data files get parsed in some unit (lines or records), unless the whole file
is one huge dict or list, so there has to be an average size per parsed unit.
So a big-O style cost estimate can start to be refined from a sample, without
reading the full file.
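
A rough sketch of that idea, assuming newline-delimited data and Python 3 syntax (the quoted code further down is Python 2). The function name and the 1 MiB sample size are my own placeholders, not anything the OP specified, and the estimate is only as good as the sample is representative:

```python
import os

def estimate_line_count(path, sample_bytes=1 << 20):
    """Estimate how many lines `path` holds from its first `sample_bytes` bytes."""
    total_size = os.path.getsize(path)
    with open(path, "rb") as f:
        sample = f.read(sample_bytes)
    newlines = sample.count(b"\n")
    if newlines == 0:
        # No newline in the sample: call it one line if the file has any data.
        return 1 if total_size else 0
    avg_line_bytes = len(sample) / newlines  # average bytes per line in the sample
    return int(total_size / avg_line_bytes)
```

Only the sample is ever held in memory, so this costs the same on a 100GB file as on a small one.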
>
>
>> filename = "data_file.txt"
>> f = open(filename, 'r')
>> forward = [line.rstrip('\n') for line in f.readlines()]
>
>
> f.readlines() will be big(!) and have overhead... and forward results in
> something again as big.
>
Not if an average line size can be taken from a sample, and then refined as
the actual gigs are being iterated through.
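
One way to read that, sketched in Python 3: iterate over the file object directly instead of calling f.readlines(), so only one line is in memory at a time, and keep a running average as you go. The generator name is hypothetical, and this only covers the forward pass:

```python
def iter_lines_with_running_average(path):
    """Yield (line, running_avg_bytes) pairs, reading one line at a time."""
    total_bytes = 0
    with open(path, "rb") as f:
        # Iterating the file object is lazy, unlike f.readlines().
        for count, raw in enumerate(f, start=1):
            total_bytes += len(raw)
            yield raw.rstrip(b"\n"), total_bytes / count
```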
>
>> backward = [line.rstrip('\n') for line in reversed(forward)]
>
>
> and defining backward looks to me to require space to build backward and
> hold reversed(forward)
>
> So, let's see, at that point in time (building backward) you've got
> probably somewhere close to 400-500Gb in memory.
>
> My guess -- probably not so fast. Thrashing is sure to be a factor on all
> but machines I'll never have a chance to work on.
But does the OP have access to that kind of machine? They never stated their
hardware or their upgrade budget.
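
If the hardware isn't there, the backward pass doesn't have to hold the file at all. A minimal sketch in Python 3, assuming "\n" line endings (no "\r\n" handling) and a file opened in binary mode; the function name and the 64 KiB chunk size are arbitrary choices of mine:

```python
import os

def reversed_lines(path, chunk_size=64 * 1024):
    """Yield the lines of `path` last to first, as bytes without the newline."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        position = f.tell()
        tail = b""         # partial line carried over from the later chunk
        saw_data = False   # stays False only for an empty file
        while position > 0:
            read_size = min(chunk_size, position)
            position -= read_size
            f.seek(position)
            chunk = f.read(read_size) + tail
            if not saw_data:
                # Drop the single newline that terminates the last line.
                if chunk.endswith(b"\n"):
                    chunk = chunk[:-1]
                saw_data = True
            lines = chunk.split(b"\n")
            tail = lines[0]  # may continue into the previous chunk
            for line in reversed(lines[1:]):
                yield line
        if saw_data:
            yield tail       # whatever is left is the first line of the file
```

Memory use is about one chunk plus the longest line, instead of the forward and backward lists together. Whether the seeking is fast enough on a >100GB file is a separate question I haven't measured.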
>
>
>> f.close()
>> print forward, "\n\n", "********************\n\n", backward, "\n"
>
>
>
> It's good to retain context.
Trying to practice good form ;).
--
Best Regards,
David Hutto
CEO: http://www.hitwebdevelopment.com