


Re: Fast forward-backward (write-read)

Started by: Virgil Stokes <vs@it.uu.se>
First post: 2012-10-23 20:37 +0200
Last post: 2012-10-28 23:36 +0100
Articles: 6 (4 participants)


This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


Contents

  Re: Fast forward-backward (write-read) Virgil Stokes <vs@it.uu.se> - 2012-10-23 20:37 +0200
    Re: Fast forward-backward (write-read) Paul Rubin <no.email@nospam.invalid> - 2012-10-23 16:46 -0700
      Re: Fast forward-backward (write-read) Dave Angel <d@davea.name> - 2012-10-28 07:18 -0400
      Re: Fast forward-backward (write-read) Virgil Stokes <vs@it.uu.se> - 2012-10-28 15:20 +0100
      Re: Fast forward-backward (write-read) Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2012-10-28 18:21 +0000
      Re: Fast forward-backward (write-read) Virgil Stokes <vs@it.uu.se> - 2012-10-28 23:36 +0100

#31952 — Re: Fast forward-backward (write-read)

From: Virgil Stokes <vs@it.uu.se>
Date: 2012-10-23 20:37 +0200
Subject: Re: Fast forward-backward (write-read)
Message-ID: <mailman.2683.1351018926.27098.python-list@python.org>

On 23-Oct-2012 19:56, Tim Chase wrote:
> On 10/23/12 12:17, Virgil Stokes wrote:
>> On 23-Oct-2012 18:09, Tim Chase wrote:
>>>> Finally, to my question --- What is a fast way to write these
>>>> variables to an external file and then read them in
>>>> backwards?
>>> Am I missing something, or would the fairly-standard "tac"
>>> utility do the reversal you want?  It should[*] be optimized to
>>> handle on-disk files in a smart manner.
>> Not sure about "tac" --- could you provide more details on this
>> and/or a simple example of how it could be used for fast reversed
>> "reading" of a data file?
> Well, if you're reading input.txt (and assuming it's one record per
> line, separated by newlines), you can just use
>
>    tac < input.txt > backwards.txt
>
> which will create a secondary file that is the first file in reverse
> order.  Your program can then process this secondary file in-order
> (which would be backwards from your source).
>
> I might have misunderstood your difficulty, but it _sounded_ like
> you just want to inverse the order of a file.
Yes, I do wish to reverse the order, but the "forward in time" file will be in
binary.

--V
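
For a binary file there are no newlines for "tac" to split on, but if the records have a fixed size the same idea can be done by hand. The following is only a sketch under that assumption: the function name reverse_records, the two-float64 record layout, and the chunk size are all made up for illustration and are not from the thread.

```python
import os
import struct
import tempfile


def reverse_records(src, dst, record_size, records_per_chunk=4096):
    """Copy src to dst with fixed-size records in reverse order."""
    size = os.path.getsize(src)
    if size % record_size:
        raise ValueError("file size is not a whole number of records")
    chunk = record_size * records_per_chunk
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        end = size
        while end > 0:
            # Read one chunk, working from the end of the file to the start.
            start = max(0, end - chunk)
            fin.seek(start)
            block = fin.read(end - start)
            # Emit the records of this chunk last-to-first.
            for pos in range(len(block) - record_size, -1, -record_size):
                fout.write(block[pos:pos + record_size])
            end = start


# Demo with a hypothetical layout: two float64 values per record.
rec = struct.Struct("<2d")
tmpdir = tempfile.mkdtemp()
forward_path = os.path.join(tmpdir, "forward.bin")
backward_path = os.path.join(tmpdir, "backward.bin")
with open(forward_path, "wb") as f:
    for t in range(5):
        f.write(rec.pack(float(t), float(t) * 10))

reverse_records(forward_path, backward_path, rec.size, records_per_chunk=2)
with open(backward_path, "rb") as f:
    reversed_records = list(rec.iter_unpack(f.read()))
```

This costs a second copy of the data on disk, so for very large files reading the original file backwards in place may be preferable.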



#31974

From: Paul Rubin <no.email@nospam.invalid>
Date: 2012-10-23 16:46 -0700
Message-ID: <7xr4ooah0t.fsf@ruckus.brouhaha.com>
In reply to: #31952

Virgil Stokes <vs@it.uu.se> writes:
> Yes, I do wish to inverse the order,  but the "forward in time" file
> will be in binary.

I really think it will be simplest to just write the file in forward
order, then use mmap to read it one record at a time.  It might be
possible to squeeze out a little more performance with reordering tricks
but that's the first thing to try.
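
A minimal sketch of what this could look like with the standard mmap and struct modules, assuming fixed-size records. The record layout (three float64 values) and the helper names write_forward and read_backward are hypothetical, chosen only for illustration.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record layout: three little-endian float64 values.
RECORD = struct.Struct("<3d")


def write_forward(path, records):
    """Append records to the file in the order they are produced."""
    with open(path, "wb") as f:
        for rec in records:
            f.write(RECORD.pack(*rec))


def read_backward(path):
    """Yield records last-to-first from a read-only memory map."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file (the file must not be empty).
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            n_records = len(mm) // RECORD.size
            for i in range(n_records - 1, -1, -1):
                yield RECORD.unpack_from(mm, i * RECORD.size)


path = os.path.join(tempfile.mkdtemp(), "forward.bin")
write_forward(path, [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)])
records_back = list(read_backward(path))
```

Only the pages that are actually touched get read from disk, so walking the map from the end does not require loading the whole file first.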



#32310

From: Dave Angel <d@davea.name>
Date: 2012-10-28 07:18 -0400
Message-ID: <mailman.2967.1351423115.27098.python-list@python.org>
In reply to: #31974

On 10/24/2012 03:14 AM, Virgil Stokes wrote:
> On 24-Oct-2012 01:46, Paul Rubin wrote:
>> Virgil Stokes <vs@it.uu.se> writes:
>>> Yes, I do wish to inverse the order,  but the "forward in time" file
>>> will be in binary.
>> I really think it will be simplest to just write the file in forward
>> order, then use mmap to read it one record at a time.  It might be
>> possible to squeeze out a little more performance with reordering tricks
>> but that's the first thing to try.
> Thanks Paul,
> I am working on this approach now...

If you're using mmap to map the whole file, you'll need 64-bit Windows to
start with.  I'd be interested to know if Windows will allow you to mmap
100 GB at one stroke.  Have you tried it, or are you starting by figuring
out how to access the data from the mmap?

-- 

DaveA
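
If mapping the whole file at once turns out to be a problem, one option is to map it a window at a time, starting from the end. This is a sketch only, not something anyone in the thread tested: the function name iter_records_backward and the 64 MiB default window are invented here. It relies on the documented requirement that the mmap offset be a multiple of mmap.ALLOCATIONGRANULARITY, and it sizes the window so that no record straddles two windows.

```python
import math
import mmap
import os
import struct
import tempfile


def iter_records_backward(path, record_size, approx_window=64 << 20):
    """Yield raw fixed-size records last-to-first, one mapped window at a time."""
    gran = mmap.ALLOCATIONGRANULARITY
    # Smallest size that is a multiple of both the granularity and a record.
    step = record_size * gran // math.gcd(record_size, gran)
    window = max(step, approx_window // step * step)
    size = os.path.getsize(path)
    if size % record_size:
        raise ValueError("file size is not a whole number of records")
    with open(path, "rb") as f:
        end = size
        while end > 0:
            # Window starts are multiples of `window`, hence validly aligned.
            start = (end - 1) // window * window
            with mmap.mmap(f.fileno(), end - start,
                           access=mmap.ACCESS_READ, offset=start) as mm:
                for pos in range(end - start - record_size, -1, -record_size):
                    yield mm[pos:pos + record_size]
            end = start


# Demo: 10000 float64 records, with the window forced to its minimum size
# so that several windows are needed.
values = [float(i) for i in range(10000)]
path = os.path.join(tempfile.mkdtemp(), "big.bin")
with open(path, "wb") as f:
    f.write(struct.pack("<10000d", *values))

read_back = [struct.unpack("<d", raw)[0]
             for raw in iter_records_backward(path, 8, approx_window=1)]
```

With this approach the address space needed at any moment is bounded by the window size rather than by the size of the file.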



#32313

From: Virgil Stokes <vs@it.uu.se>
Date: 2012-10-28 15:20 +0100
Message-ID: <mailman.2971.1351434054.27098.python-list@python.org>
In reply to: #31974

On 28-Oct-2012 12:18, Dave Angel wrote:
> On 10/24/2012 03:14 AM, Virgil Stokes wrote:
>> On 24-Oct-2012 01:46, Paul Rubin wrote:
>>> Virgil Stokes <vs@it.uu.se> writes:
>>>> Yes, I do wish to inverse the order,  but the "forward in time" file
>>>> will be in binary.
>>> I really think it will be simplest to just write the file in forward
>>> order, then use mmap to read it one record at a time.  It might be
>>> possible to squeeze out a little more performance with reordering tricks
>>> but that's the first thing to try.
>> Thanks Paul,
>> I am working on this approach now...
> If you're using mmap to map the whole file, you'll need 64bit Windows to
> start with.  I'd be interested to know if Windows will allow you to mmap
> 100gb at one stroke.  Have you tried it, or are you starting by figuring
> how to access the data from the mmap?
Thanks very much for pursuing my query, Dave.

I have not tried it yet (temporarily side-tracked), but I will post my
findings on this issue.



#32316

From: Oscar Benjamin <oscar.j.benjamin@gmail.com>
Date: 2012-10-28 18:21 +0000
Message-ID: <mailman.2975.1351448514.27098.python-list@python.org>
In reply to: #31974

On 28 October 2012 14:20, Virgil Stokes <vs@it.uu.se> wrote:
> On 28-Oct-2012 12:18, Dave Angel wrote:
>>
>> On 10/24/2012 03:14 AM, Virgil Stokes wrote:
>>>
>>> On 24-Oct-2012 01:46, Paul Rubin wrote:
>>>>
>>>> Virgil Stokes <vs@it.uu.se> writes:
>>>>>
>>>>> Yes, I do wish to inverse the order,  but the "forward in time" file
>>>>> will be in binary.
>>>>
>>>> I really think it will be simplest to just write the file in forward
>>>> order, then use mmap to read it one record at a time.  It might be
>>>> possible to squeeze out a little more performance with reordering tricks
>>>> but that's the first thing to try.
>>>
>>> Thanks Paul,
>>> I am working on this approach now...
>>
>> If you're using mmap to map the whole file, you'll need 64bit Windows to
>> start with.  I'd be interested to know if Windows will allow you to mmap
>> 100gb at one stroke.  Have you tried it, or are you starting by figuring
>> how to access the data from the mmap?
>
> Thanks very much for pursuing my query, Dave.
>
> I have not tried it yet --- temporarily side-tracked; but, I will post my
> findings on this issue.

If you are going to use mmap, then look at numpy.memmap.
It wraps Python's mmap so that you can access the contents of the
mapped binary file as if it were a numpy array. This means that you
don't need to handle the bytes -> float conversions yourself.

>>> import numpy
>>> a = numpy.array([4,5,6], numpy.float64)
>>> a
array([ 4.,  5.,  6.])
>>> with open('tmp.bin', 'wb') as f:  # write forwards
...   a.tofile(f)
...   a.tofile(f)
...
>>> a2 = numpy.memmap('tmp.bin', numpy.float64)
>>> a2
memmap([ 4.,  5.,  6.,  4.,  5.,  6.])
>>> a2[3]
4.0
>>> a2[5:2:-1] # read backwards
memmap([ 6.,  5.,  4.])


Oscar
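
The same idea extends to multi-value records by giving the memmap a 2-D shape, so that each row is one time step. A sketch, assuming every record holds the same number of float64 values; N_STATE and the file name are hypothetical and not from the thread.

```python
import os
import tempfile

import numpy as np

N_STATE = 3  # hypothetical: number of float64 values stored per time step

path = os.path.join(tempfile.mkdtemp(), "states.bin")

# Forward pass: append one record per time step.
with open(path, "wb") as f:
    for t in range(5):
        np.full(N_STATE, float(t)).tofile(f)

# Backward pass: view the file as an (n_steps, N_STATE) array without
# loading it, then walk the rows from last to first.
n_steps = os.path.getsize(path) // (N_STATE * 8)
states = np.memmap(path, dtype=np.float64, mode="r", shape=(n_steps, N_STATE))
backward = [states[t].copy() for t in range(n_steps - 1, -1, -1)]
```

Each states[t] is a view into the mapped file; the .copy() is there only so the rows outlive the memmap in this demo.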



#32320

From: Virgil Stokes <vs@it.uu.se>
Date: 2012-10-28 23:36 +0100
Message-ID: <mailman.2978.1351463769.27098.python-list@python.org>
In reply to: #31974

On 2012-10-28 19:21, Oscar Benjamin wrote:
> On 28 October 2012 14:20, Virgil Stokes <vs@it.uu.se> wrote:
>> On 28-Oct-2012 12:18, Dave Angel wrote:
>>> On 10/24/2012 03:14 AM, Virgil Stokes wrote:
>>>> On 24-Oct-2012 01:46, Paul Rubin wrote:
>>>>> Virgil Stokes <vs@it.uu.se> writes:
>>>>>> Yes, I do wish to inverse the order,  but the "forward in time" file
>>>>>> will be in binary.
>>>>> I really think it will be simplest to just write the file in forward
>>>>> order, then use mmap to read it one record at a time.  It might be
>>>>> possible to squeeze out a little more performance with reordering tricks
>>>>> but that's the first thing to try.
>>>> Thanks Paul,
>>>> I am working on this approach now...
>>> If you're using mmap to map the whole file, you'll need 64bit Windows to
>>> start with.  I'd be interested to know if Windows will allow you to mmap
>>> 100gb at one stroke.  Have you tried it, or are you starting by figuring
>>> how to access the data from the mmap?
>> Thanks very much for pursuing my query, Dave.
>>
>> I have not tried it yet --- temporarily side-tracked; but, I will post my
>> findings on this issue.
> If you are going to use mmap then look at the numpy.memmap function.
> This wraps pythons mmap so that you can access the contents of the
> mapped binary file as if it was a numpy array. This means that you
> don't need to handle the bytes -> float conversions yourself.
>
>>>> import numpy
>>>> a = numpy.array([4,5,6], numpy.float64)
>>>> a
> array([ 4.,  5.,  6.])
>>>> with open('tmp.bin', 'wb') as f:  # write forwards
> ...   a.tofile(f)
> ...   a.tofile(f)
> ...
>>>> a2 = numpy.memmap('tmp.bin', numpy.float64)
>>>> a2
> memmap([ 4.,  5.,  6.,  4.,  5.,  6.])
>>>> a2[3]
> 4.0
>>>> a2[5:2:-1] # read backwards
> memmap([ 6.,  5.,  4.])
>
>
> Oscar
Thanks Oscar!


