Python Fast I/o

Started by: Ayushi Dalmia <ayushidalmia2604@gmail.com>
First post: 2014-01-14 05:50 -0800
Last post: 2014-01-14 08:18 -0600
Articles: 6 (4 participants)


Contents

  Python Fast I/o Ayushi Dalmia <ayushidalmia2604@gmail.com> - 2014-01-14 05:50 -0800
    Re: Python Fast I/o Chris Angelico <rosuav@gmail.com> - 2014-01-15 01:03 +1100
      Re: Python Fast I/o Ayushi Dalmia <ayushidalmia2604@gmail.com> - 2014-01-14 06:24 -0800
        Re: Python Fast I/o Chris Angelico <rosuav@gmail.com> - 2014-01-15 01:32 +1100
        Re: Python Fast I/o Roy Smith <roy@panix.com> - 2014-01-14 09:40 -0500
    Re: Python Fast I/o Tim Chase <python.list@tim.thechases.com> - 2014-01-14 08:18 -0600

#63903 — Python Fast I/o

From: Ayushi Dalmia <ayushidalmia2604@gmail.com>
Date: 2014-01-14 05:50 -0800
Subject: Python Fast I/o
Message-ID: <a9c545f2-fbc3-4ac4-81ee-a91f61c72b84@googlegroups.com>
I need to write to a file for a project that will be evaluated on the basis of time. What is the fastest way to write 200 MB of data, accumulated as a list, to a file?

Presently I am using this:

with open('index.txt','w') as f:
      f.write("".join(data))
      f.close()

where data is a list of strings which I want to dump into the index.txt file.



#63905

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-01-15 01:03 +1100
Message-ID: <mailman.5458.1389708197.18130.python-list@python.org>
In reply to: #63903
On Wed, Jan 15, 2014 at 12:50 AM, Ayushi Dalmia
<ayushidalmia2604@gmail.com> wrote:
> I need to write to a file for a project that will be evaluated on the basis of time. What is the fastest way to write 200 MB of data, accumulated as a list, to a file?
>
> Presently I am using this:
>
> with open('index.txt','w') as f:
>       f.write("".join(data))
>       f.close()

with open('index.txt','w') as f:
    for hunk in data:
        f.write(hunk)

You don't need to f.close() - that's what the 'with' block guarantees.
Iterating over data and writing each block separately means you don't
have to first build up a 200MB string. After that, your performance is
going to be mainly tied to the speed of your disk, not anything that
Python can affect.

ChrisA
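
A minimal sketch of the point about 'with', assuming Python 3. The scratch path in a temporary directory is hypothetical (the thread itself writes to 'index.txt'). It checks the file object's closed attribute inside the block, after the block, and after a block that exits through an exception:

```python
import os
import tempfile

# Hypothetical scratch file; the thread itself writes to 'index.txt'.
path = os.path.join(tempfile.mkdtemp(), "index.txt")

with open(path, "w") as f:
    f.write("hello")
    inside = f.closed        # the file is still open inside the block

after = f.closed             # the file is closed once the block exits

# The file is also closed when the block exits through an exception.
try:
    with open(path, "w") as g:
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass
closed_after_error = g.closed

print(inside, after, closed_after_error)
```

Because the close happens on every exit path, an explicit f.close() inside the block adds nothing.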



#63907

From: Ayushi Dalmia <ayushidalmia2604@gmail.com>
Date: 2014-01-14 06:24 -0800
Message-ID: <53affb01-0c5e-46e3-9ff7-27a529db6425@googlegroups.com>
In reply to: #63905
On Tuesday, January 14, 2014 7:33:08 PM UTC+5:30, Chris Angelico wrote:
> On Wed, Jan 15, 2014 at 12:50 AM, Ayushi Dalmia
> 
> <ayushidalmia2604@gmail.com> wrote:
> 
> > I need to write into a file for a project which will be evaluated on the basis of time. What is the fastest way to write 200 Mb of data, accumulated as a list into a file.
> 
> >
> 
> > Presently I am using this:
> 
> >
> 
> > with open('index.txt','w') as f:
> 
> >       f.write("".join(data))
> 
> >       f.close()
> 
> 
> 
> with open('index.txt','w') as f:
> 
>     for hunk in data:
> 
>         f.write(hunk)
> 
> 
> 
> You don't need to f.close() - that's what the 'with' block guarantees.
> 
> Iterating over data and writing each block separately means you don't
> 
> have to first build up a 200MB string. After that, your performance is
> 
> going to be mainly tied to the speed of your disk, not anything that
> 
> Python can affect.
> 
> 
> 
> ChrisA

Thanks for the tip about closing the file. I did not know that 'with' ensures the file is closed once the block is finished.

Which is faster: creating a 200 MB string and then dumping it into a file, or dividing the 200 MB of data into chunks and writing those chunks? Won't writing the chunks require more I/O operations?



#63909

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-01-15 01:32 +1100
Message-ID: <mailman.5461.1389709967.18130.python-list@python.org>
In reply to: #63907
On Wed, Jan 15, 2014 at 1:24 AM, Ayushi Dalmia
<ayushidalmia2604@gmail.com> wrote:
> On Tuesday, January 14, 2014 7:33:08 PM UTC+5:30, Chris Angelico wrote:
>> On Wed, Jan 15, 2014 at 12:50 AM, Ayushi Dalmia
>>
>> <ayushidalmia2604@gmail.com> wrote:
>>
>> > I need to write into a file for a project which will be evaluated on the basis of time. What is the fastest way to write 200 Mb of data, accumulated as a list into a file.
>>
>> >
>>
>> > Presently I am using this:
>>
>> >
>>
>> > with open('index.txt','w') as f:
>>
>> >       f.write("".join(data))
>>
>> >       f.close()
>>
>>
>>
>> with open('index.txt','w') as f:
>>
>>     for hunk in data:
>>
>>         f.write(hunk)
>>
>>
>>
>> You don't need to f.close() - that's what the 'with' block guarantees.
>>
>> Iterating over data and writing each block separately means you don't
>>
>> have to first build up a 200MB string. After that, your performance is
>>
>> going to be mainly tied to the speed of your disk, not anything that
>>
>> Python can affect.
>>
>>
>>
>> ChrisA

Your quoted text is becoming double spaced, because of bugs in the
Google Groups client. Please either edit this before posting, or
switch to a better newsreader, or use the mailing list:

https://mail.python.org/mailman/listinfo/python-list

Thanks!

> Which is faster: creating a 200 MB string and then dumping it into a file, or dividing the 200 MB of data into chunks and writing those chunks? Won't writing the chunks require more I/O operations?
>

When you're writing two hundred megabytes, the number of I/O
operations is dominated by that. You have to write that many sectors,
and nothing can change that. Joining your list of strings before
writing incurs the cost of building up a single 200MB string, but even
that is likely to be insignificant in the scheme of things (if you
have the memory available, it won't take very long compared to the
time it takes to write to the disk). Python will buffer its writes, so
you don't have to worry about the details. It's going to do the right
thing for you; you can concentrate on making your code look right.

ChrisA
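
A rough illustration of the buffering described above, assuming Python 3 and a hypothetical scratch file. A small write stays in Python's buffer until it is flushed (or the file is closed), so many small f.write() calls do not translate one-to-one into disk operations. The explicit buffering argument to open() is shown only as an option to experiment with, not as a recommendation from the thread:

```python
import os
import tempfile

# Hypothetical scratch file; the thread itself writes to 'index.txt'.
path = os.path.join(tempfile.mkdtemp(), "index.txt")

with open(path, "w") as f:
    f.write("x" * 100)                         # small write, held in the buffer
    size_before_flush = os.path.getsize(path)
    f.flush()                                  # hand the buffered data to the OS
    size_after_flush = os.path.getsize(path)

print(size_before_flush, size_after_flush)

# A larger buffer can be requested explicitly. Whether it helps is
# something to measure rather than assume.
with open(path, "w", buffering=1024 * 1024) as f:
    for hunk in ["abc", "def"]:
        f.write(hunk)

with open(path) as f:
    contents = f.read()
```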



#63910

From: Roy Smith <roy@panix.com>
Date: 2014-01-14 09:40 -0500
Message-ID: <roy-EB17C5.09403414012014@news.panix.com>
In reply to: #63907
In article <53affb01-0c5e-46e3-9ff7-27a529db6425@googlegroups.com>,
 Ayushi Dalmia <ayushidalmia2604@gmail.com> wrote:

> Which is faster: creating a 200 MB string and then dumping it into a file,
> or dividing the 200 MB of data into chunks and writing those chunks? Won't
> writing the chunks require more I/O operations?

This sounds like a simple experiment to try.  Write it both ways, time 
each one, and report your results back to the group.
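
One way such an experiment might be set up, as a sketch assuming Python 3. The sample data, the helper names (write_joined, write_loop, time_once) and the temporary paths are all hypothetical; the list would need to be scaled up toward 200 MB for a realistic comparison, and a single run on a warm disk cache is only a rough indication:

```python
import os
import tempfile
import time

# Hypothetical stand-in for the poster's list of strings.
data = ["some index entry\n"] * 100_000

def write_joined(path, data):
    """Build one big string, then write it in a single call."""
    with open(path, "w") as f:
        f.write("".join(data))

def write_loop(path, data):
    """Write each string separately and let Python buffer the writes."""
    with open(path, "w") as f:
        for hunk in data:
            f.write(hunk)

def time_once(func, data):
    """Return (seconds, bytes_on_disk) for one run of func."""
    path = os.path.join(tempfile.mkdtemp(), "index.txt")
    start = time.perf_counter()
    func(path, data)
    elapsed = time.perf_counter() - start
    return elapsed, os.path.getsize(path)

results = {func.__name__: time_once(func, data)
           for func in (write_joined, write_loop)}

for name, (seconds, size) in results.items():
    print(f"{name}: {seconds:.4f}s, {size} bytes")
```

Repeating each measurement several times and comparing the best runs gives a steadier picture than a single timing.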



#63911

From: Tim Chase <python.list@tim.thechases.com>
Date: 2014-01-14 08:18 -0600
Message-ID: <mailman.5462.1389712154.18130.python-list@python.org>
In reply to: #63903
On 2014-01-14 05:50, Ayushi Dalmia wrote:
> I need to write to a file for a project that will be evaluated
> on the basis of time. What is the fastest way to write 200 MB of
> data, accumulated as a list, to a file?
> 
> Presently I am using this:
> 
> with open('index.txt','w') as f:
>       f.write("".join(data))
>       f.close()
> 
> where data is a list of strings which I want to dump into the
> index.txt file.

Most file-like objects should support a writelines() method which
takes an iterable and should save you the trouble of joining all the
content (and as Chris noted, you don't need the .close() since the
with handles it) so the whole thing would condense to:

  with open('index.txt', 'w') as f:
    f.writelines(data)

-tkc
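
A small check, assuming Python 3 and hypothetical sample data and file names, that the three approaches in this thread produce the same file. It also shows that writelines(), despite its name, does not add line endings; it writes each string exactly as given:

```python
import os
import tempfile

# Hypothetical sample; the last item deliberately has no trailing newline.
data = ["alpha\n", "beta\n", "gamma"]
tmpdir = tempfile.mkdtemp()

def read_back(path):
    """Return the full text content of the file at path."""
    with open(path) as f:
        return f.read()

# 1. Join first, then write once (the original poster's approach).
joined_path = os.path.join(tmpdir, "joined.txt")
with open(joined_path, "w") as f:
    f.write("".join(data))

# 2. Write each string in a loop.
loop_path = os.path.join(tmpdir, "loop.txt")
with open(loop_path, "w") as f:
    for hunk in data:
        f.write(hunk)

# 3. Hand the whole iterable to writelines().
lines_path = os.path.join(tmpdir, "lines.txt")
with open(lines_path, "w") as f:
    f.writelines(data)

joined = read_back(joined_path)
looped = read_back(loop_path)
lined = read_back(lines_path)
print(joined == looped == lined)
```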


