| From | BartC <bc@freeuk.com> |
|---|---|
| Newsgroups | comp.lang.python |
| Subject | Re: How to read from a file to an arbitrary delimiter efficiently? |
| Date | 2016-02-27 20:03 +0000 |
| Organization | A noiseless patient Spider |
| Message-ID | <nasv9k$ij8$1@dont-email.me> |
| References | <56cea44e$0$11128$c3e8da3@news.astraweb.com> <nasj2p$hec$1@dont-email.me> |
On 27/02/2016 16:35, BartC wrote:
> On 25/02/2016 06:50, Steven D'Aprano wrote:
>> I have a need to read to an arbitrary delimiter, which might be any of a
>> (small) set of characters. For the sake of the exercise, lets say it is
>> either ! or ? (for example).

> However those aren't the main reasons for the poor speed. The limiting
> factor here is reading one byte at a time. Just a loop like this:
>
>     while f.read(1):
>         pass
>
> without doing anything else, seems to take most of the time. (3.6
> seconds, compared with 5.6 seconds of your readchunks() on a 6MB version
> of your test file, on Python 2.7. readlines() took about 0.2 seconds.)
>
> Any faster solutions would need to read more than one byte at a time.

I've done some more tests using Python 3.4, with the same 200,000-line 6MB test file:

    0.25 seconds   Scan the file with 'for line in f'
    2.25 seconds   Scan the file with your readlines() routine
    4.0  seconds   Scan the file with your readchunks() routine
    0.65 seconds   Scan the file using a buffer

This last test uses a 64-byte buffer, reading no more than 63 bytes beyond the delimiter, but resetting the file position to just past the end of each identified chunk so that any subsequent read works as expected.

This test (the code is too untidy to post) only checks for two specific delimiters (not an arbitrary string full of them). (It also counts EOF as a valid delimiter, so it counts one more chunk.)

Increasing the buffer size doesn't help, and beyond 256 bytes it slows things down (for this input), as it spends too long rereading data.

--
Bartc
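Since the buffered code itself was not posted, here is a minimal sketch of the approach described above, not the original implementation. It assumes a seekable file opened in binary mode; the names `read_chunk` and `iter_chunks` and their parameters are hypothetical, chosen only for illustration. Unlike the test described, it accepts any small set of single-byte delimiters and does not count EOF as an extra chunk.

```python
import io


def read_chunk(f, delimiters=b"!?", bufsize=64):
    """Read up to and including the first delimiter byte (hypothetical helper).

    f must be a seekable binary file. On return, the file position is just
    past the delimiter, so a later f.read() continues from the right place.
    Returns b"" at EOF; a final chunk with no delimiter is returned as-is.
    """
    parts = []
    while True:
        buf = f.read(bufsize)
        if not buf:
            break  # EOF: return whatever has been collected so far
        # Earliest position of any delimiter in this buffer, if present.
        hits = [i for i in (buf.find(d) for d in delimiters) if i >= 0]
        if hits:
            end = min(hits) + 1
            parts.append(buf[:end])
            # Rewind over the bytes that were read past the delimiter.
            f.seek(end - len(buf), io.SEEK_CUR)
            break
        parts.append(buf)
    return b"".join(parts)


def iter_chunks(f, delimiters=b"!?", bufsize=64):
    """Yield successive delimiter-terminated chunks until EOF."""
    while True:
        chunk = read_chunk(f, delimiters, bufsize)
        if not chunk:
            return
        yield chunk
```

For example, `list(iter_chunks(open("test.txt", "rb")))` would scan a whole file (the filename is a placeholder). The seek back after each chunk is what makes larger buffers counterproductive here: every byte read past a delimiter is read again on the next call.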
How to read from a file to an arbitrary delimiter efficiently? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2016-02-25 17:50 +1100
Re: How to read from a file to an arbitrary delimiter efficiently? Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> - 2016-02-25 08:37 +0100
Re: How to read from a file to an arbitrary delimiter efficiently? Steven D'Aprano <steve@pearwood.info> - 2016-02-27 21:40 +1100
Re: How to read from a file to an arbitrary delimiter efficiently? Dan Sommers <dan@tombstonezero.net> - 2016-02-27 14:40 +0000
Re: How to read from a file to an arbitrary delimiter efficiently? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2016-02-27 12:03 -0500
Re: How to read from a file to an arbitrary delimiter efficiently? Marko Rauhamaa <marko@pacujo.net> - 2016-02-27 19:47 +0200
Re: How to read from a file to an arbitrary delimiter efficiently? Chris Angelico <rosuav@gmail.com> - 2016-02-25 18:30 +1100
Re: How to read from a file to an arbitrary delimiter efficiently? Steven D'Aprano <steve@pearwood.info> - 2016-02-27 20:49 +1100
Re: How to read from a file to an arbitrary delimiter efficiently? Chris Angelico <rosuav@gmail.com> - 2016-02-27 23:17 +1100
Re: How to read from a file to an arbitrary delimiter efficiently? Chris Angelico <rosuav@gmail.com> - 2016-02-27 23:18 +1100
Re: How to read from a file to an arbitrary delimiter efficiently? Serhiy Storchaka <storchaka@gmail.com> - 2016-02-27 17:23 +0200
Re: How to read from a file to an arbitrary delimiter efficiently? Paul Rubin <no.email@nospam.invalid> - 2016-02-24 23:48 -0800
Re: How to read from a file to an arbitrary delimiter efficiently? wxjmfauth@gmail.com - 2016-02-25 06:37 -0800
Re: How to read from a file to an arbitrary delimiter efficiently? wxjmfauth@gmail.com - 2016-02-25 06:38 -0800
Re: How to read from a file to an arbitrary delimiter efficiently? BartC <bc@freeuk.com> - 2016-02-27 16:35 +0000
Re: How to read from a file to an arbitrary delimiter efficiently? BartC <bc@freeuk.com> - 2016-02-27 20:03 +0000
Re: How to read from a file to an arbitrary delimiter efficiently? BartC <bc@freeuk.com> - 2016-02-27 20:28 +0000
Re: How to read from a file to an arbitrary delimiter efficiently? Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2016-02-28 20:28 +0000
Re: How to read from a file to an arbitrary delimiter efficiently? Tim Delaney <timothy.c.delaney@gmail.com> - 2016-02-29 08:00 +1100