From: Carlos Nepomuceno
To: "python-list@python.org"
Subject: RE: How to write fast into a file in python?
Date: Fri, 17 May 2013 20:25:37 +0300
In-Reply-To: <51965e0e$0$29997$c3e8da3$5496439d@news.astraweb.com>
Newsgroups: comp.lang.python

Thank you Steve! You are totally right!

It takes about 0.2s for the f.write() to return.
Surely that is because it writes to the system file cache (~250MB/s).

Using a slightly different approach I got:

C:\src\Python>python -m timeit -cvn3 -r3 -s"from fastwrite5r import run" "run()"
raw times: 24 25.1 24.4
3 loops, best of 3: 8 sec per loop

This time it took 8s to complete, down from the previous 11.3s.

Are those 3.3s the time for the "open, read, parse, compile" steps you told me about?

If so, the execute step is really taking 8s, right?

Why does it take so long to build the string to be written? Can it get faster?

Thanks in advance!


### fastwrite5r.py ###
def run():
    import cStringIO
    size = 50*1024*1024
    value = 0
    filename = 'fastwrite5.dat'
    x = 0
    # Build the whole output in memory first, then write it in one call.
    b = cStringIO.StringIO()
    while x < size:
        line = '{0}\n'.format(value)
        b.write(line)
        value += 1
        x += len(line)+1
    f = open(filename, 'w')
    f.write(b.getvalue())
    f.close()
    b.close()

if __name__ == '__main__':
    run()


----------------------------------------
> From: steve+comp.lang.python@pearwood.info
> Subject: Re: How to write fast into a file in python?
> Date: Fri, 17 May 2013 16:42:55 +0000
> To: python-list@python.org
>
> On Fri, 17 May 2013 18:20:33 +0300, Carlos Nepomuceno wrote:
>
>> I've got the following results on my desktop PC (Win7/Python2.7.5):
>>
>> C:\src\Python>python -m timeit -cvn3 -r3 "execfile('fastwrite2.py')"
>> raw times: 123 126 125
>> 3 loops, best of 3: 41 sec per loop
>
> Your times here are increased significantly by using execfile.
> Using execfile means that instead of compiling the code once, then executing
> many times, it gets compiled over and over and over and over again. In my
> experience, using exec, execfile or eval makes your code ten or twenty
> times slower:
>
> [steve@ando ~]$ python -m timeit 'x = 100; y = x/3'
> 1000000 loops, best of 3: 0.175 usec per loop
> [steve@ando ~]$ python -m timeit 'exec("x = 100; y = x/3")'
> 10000 loops, best of 3: 37.8 usec per loop
>
>
>> Strangely I just realised that the time it takes to complete such
>> scripts is the same no matter what hard drive I choose to run them. The
>> results are the same for an SSD (main drive) and a HDD.
>
> There's nothing strange here. The time you measure is dominated by three
> things, in reducing order of importance:
>
> * the poor choice of execfile dominates the time taken;
>
> * followed by choice of algorithm;
>
> * followed by the time it actually takes to write to the disk, which is
> probably insignificant compared to the other two, regardless of whether
> you are using a HDD or SSD.
>
> Until you optimize the code, optimizing the media is a waste of time.
>
>
>> I think it's very strange to take 11.3s to write 50MB (4.4MB/s)
>> sequentially on a SSD which is capable of 140MB/s.
>
> It doesn't. It takes 11.3 seconds to open a file, read it into memory,
> parse it, compile it into byte-code, and only then execute it. My
> prediction is that the call to f.write() and f.close() probably take a
> fraction of a second, and nearly all of the rest of the time is taken by
> other calculations.
>
>
>
> --
> Steven
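
One way to check the prediction above (that f.write() is a small fraction of the run and building the string is the rest) is to time the two phases separately. The following is only a rough sketch under stated assumptions, not the code from the thread: build_lines, timed_run and the scratch file name are hypothetical, the buffer is built with a list and ''.join instead of cStringIO, the size counts characters rather than bytes on disk, and the demo size is 1 MB rather than 50 MB. It is written to run on both Python 2.7 and Python 3.

```python
import os
import tempfile
from timeit import default_timer


def build_lines(size):
    """Return consecutive integers, one per line, totalling at least `size` characters."""
    parts = []
    total = 0
    value = 0
    while total < size:
        line = '%d\n' % value
        parts.append(line)
        total += len(line)  # characters in memory, not bytes on disk
        value += 1
    return ''.join(parts)


def timed_run(size, filename):
    """Return (build_seconds, write_seconds, characters_written)."""
    t0 = default_timer()
    data = build_lines(size)
    t1 = default_timer()
    with open(filename, 'w') as f:
        f.write(data)
    t2 = default_timer()
    return t1 - t0, t2 - t1, len(data)


if __name__ == '__main__':
    # Hypothetical scratch file in the temp directory; removed after the run.
    path = os.path.join(tempfile.gettempdir(), 'fastwrite_sketch.dat')
    build, write, n = timed_run(1024 * 1024, path)
    print('build: %.3fs  write: %.3fs  chars: %d' % (build, write, n))
    os.remove(path)
```

If the build time dominates, as the 0.2s f.write() figure suggests it would, the remaining cost is in the per-line Python loop, so that loop is the part worth profiling. The actual numbers depend on the machine and the Python version.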