RE: How to write fast into a file in python?

From Carlos Nepomuceno <carlosnepomuceno@outlook.com>
Subject RE: How to write fast into a file in python?
Date 2013-05-17 21:18 +0300
References <51966d15$0$29997$c3e8da3$5496439d@news.astraweb.com>
Newsgroups comp.lang.python
Message-ID <mailman.1788.1368814763.3114.python-list@python.org> (permalink)


You've hit the bullseye! ;)

Thanks a lot!!!

> Oh, I forgot to mention: you have a bug in this function. You're already
> including the newline in the len(line), so there is no need to add one.
> The result is that you only generate 44MB instead of 50MB.

That's because I'm running on Windows.
What's the fastest way to check whether '\n' is written to the file as 2 bytes?
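
One possible way to check, as a minimal sketch rather than a definitive answer: write a single '\n' to a temporary file in text mode and look at the resulting file size. The helper name newline_size_on_disk is made up for illustration, and the sketch assumes the default text-mode newline translation is in effect.

```python
import os
import tempfile

def newline_size_on_disk():
    """Return how many bytes one newline occupies when written in text mode."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # Text mode ('w'), so the platform's newline translation applies.
        with open(path, 'w') as f:
            f.write('\n')
        return os.path.getsize(path)
    finally:
        os.remove(path)

print(newline_size_on_disk())   # bytes actually written for one '\n'
print(repr(os.linesep))         # the platform's line separator
```

Opening the file in binary mode ('wb') instead should skip the translation, so each '\n' stays a single byte on every platform.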

> Here are the results of profiling the above on my computer. Including the
> overhead of the profiler, it takes just over 50 seconds to run your file
> on my computer.
>
> [steve@ando ~]$ python -m cProfile fastwrite5.py
> 17846645 function calls in 53.575 seconds
>

I didn't know about the cProfile module. Thanks a lot!
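
Besides `python -m cProfile script.py`, the module can also be driven from inside a program. The following is only a sketch: build_lines is a hypothetical stand-in for the loop being measured, it uses io.StringIO rather than the cStringIO of the quoted run, and the numbers it prints will differ from machine to machine.

```python
import cProfile
import io
import pstats

def build_lines(n):
    """Hypothetical workload: format n integers into an in-memory buffer."""
    out = io.StringIO()
    for i in range(n):
        out.write('{}\n'.format(i))
    return out.getvalue()

profiler = cProfile.Profile()
profiler.enable()
data = build_lines(100000)
profiler.disable()

# Collect the report in a string and show the five most expensive entries
# ordered by cumulative time.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats('cumulative').print_stats(5)
print(report.getvalue())
```

Sorting by 'cumulative' puts the callers that account for the most total time at the top, which is usually easier to read than the default "standard name" ordering.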

> Ordered by: standard name
>
>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>         1   30.561   30.561   53.575   53.575 fastwrite5.py:1(<module>)
>         1    0.000    0.000    0.000    0.000 {cStringIO.StringIO}
>   5948879    5.582    0.000    5.582    0.000 {len}
>         1    0.004    0.004    0.004    0.004 {method 'close' of 'cStringIO.StringO' objects}
>         1    0.000    0.000    0.000    0.000 {method 'close' of 'file' objects}
>         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
>   5948879    9.979    0.000    9.979    0.000 {method 'format' of 'str' objects}
>         1    0.103    0.103    0.103    0.103 {method 'getvalue' of 'cStringIO.StringO' objects}
>   5948879    7.135    0.000    7.135    0.000 {method 'write' of 'cStringIO.StringO' objects}
>         1    0.211    0.211    0.211    0.211 {method 'write' of 'file' objects}
>         1    0.000    0.000    0.000    0.000 {open}
>
>
> As you can see, the time is dominated by repeatedly calling len(),
> str.format() and StringIO.write() methods. Actually writing the data to
> the file is quite a small percentage of the cumulative time.
>
> So, here's another version, this time using a pre-calculated limit. I
> cheated and just copied the result from the fastwrite5 output :-)
>
> # fasterwrite.py
> filename = 'fasterwrite.dat'
> with open(filename, 'w') as f:
>     for i in xrange(5948879):  # Actually only 44MB, not 50MB.
>         f.write('%d\n' % i)
>

I had the same idea but kept the original method because I didn't want to waste time creating a function for calculating the actual number of iterations needed to deliver 50MB of data. ;)
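
For what it's worth, such a function does not need to loop over every integer. Here is a rough sketch, assuming each line is the decimal integer followed by a newline: it walks through the integers one digit count at a time. The name lines_for_size and the newline_bytes parameter are invented for this example; newline_bytes would be 2 where '\n' is written as two bytes.

```python
def lines_for_size(target_bytes, newline_bytes=1):
    """Count how many lines '0', '1', '2', ... fit within target_bytes."""
    count = 0    # lines accounted for so far
    used = 0     # bytes those lines occupy
    digits = 1   # digit count of the integers in the current block
    start = 0    # first integer in the current block (0 has one digit)
    while True:
        end = 10 ** digits                # one past the last integer with `digits` digits
        per_line = digits + newline_bytes
        block = end - start               # how many integers share this digit count
        if used + block * per_line > target_bytes:
            # Only part of this block fits in the remaining space.
            return count + (target_bytes - used) // per_line
        count += block
        used += block * per_line
        start = end
        digits += 1

print(lines_for_size(50 * 1024 * 1024))                   # one-byte newlines
print(lines_for_size(50 * 1024 * 1024, newline_bytes=2))  # two-byte newlines
```

As a sanity check, for the 46479922 bytes that dd reports in the quoted text, lines_for_size(46479922) gives 5948879, the same count used in fasterwrite.py.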

> And the profile results are about twice as fast as fastwrite5 above, with
> only 8 seconds in total writing to my HDD.
>
> [steve@ando ~]$ python -m cProfile fasterwrite.py
> 5948882 function calls in 28.840 seconds
>
> Ordered by: standard name
>
>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>         1   20.592   20.592   28.840   28.840 fasterwrite.py:1(<module>)
>         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
>   5948879    8.229    0.000    8.229    0.000 {method 'write' of 'file' objects}
>         1    0.019    0.019    0.019    0.019 {open}
>

I thought "'%d\n' % i" would show up in the profile as a call to the format method. It seems the % operator is a lot faster than format.
I only stopped using it because I read it was going to be deprecated. :(
Why replace such a great and fast operator with a slow method? I mean, why is format being preferred over %?
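
A quick way to compare the two on a given machine is the timeit module. This is only a sketch: the wrapper functions with_percent and with_format are made up for the comparison, and the timings depend on the interpreter version and hardware, so no particular result is claimed here.

```python
import timeit

def with_percent(i):
    # Formatting with the % operator.
    return '%d\n' % i

def with_format(i):
    # Formatting with the str.format method.
    return '{}\n'.format(i)

n = 200000
t_percent = timeit.timeit(lambda: with_percent(12345), number=n)
t_format = timeit.timeit(lambda: with_format(12345), number=n)

print('%%-operator: %.3f s for %d calls' % (t_percent, n))
print('str.format: %.3f s for %d calls' % (t_format, n))
```

Both timings include the same lambda and function-call overhead, so the difference between them reflects the formatting itself.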

> Without the overhead of the profiler, it is a little faster:
>
> [steve@ando ~]$ time python fasterwrite.py
>
> real 0m16.187s
> user 0m13.553s
> sys 0m0.508s
>
>
> It is still slower than the heavily optimized dd command,
> but not unreasonably slow for a high-level language:
>
> [steve@ando ~]$ time dd if=fasterwrite.dat of=copy.dat
> 90781+1 records in
> 90781+1 records out
> 46479922 bytes (46 MB) copied, 0.737009 seconds, 63.1 MB/s
>
> real 0m0.786s
> user 0m0.071s
> sys 0m0.595s
>
>
>
>
> --
> Steven
> --
> http://mail.python.org/mailman/listinfo/python-list


