


Re: How to read from a file to an arbitrary delimiter efficiently?

From BartC <bc@freeuk.com>
Newsgroups comp.lang.python
Subject Re: How to read from a file to an arbitrary delimiter efficiently?
Date 2016-02-27 16:35 +0000
Organization A noiseless patient Spider
Message-ID <nasj2p$hec$1@dont-email.me> (permalink)
References <56cea44e$0$11128$c3e8da3@news.astraweb.com>



On 25/02/2016 06:50, Steven D'Aprano wrote:
> I have a need to read to an arbitrary delimiter, which might be any of a
> (small) set of characters. For the sake of the exercise, lets say it is
> either ! or ? (for example).
>
> # Read a chunk of bytes/characters from an open file.
> def chunkiter(f, delim):
>      buffer = []
>      b = f.read(1)
>      while b:
>          buffer.append(b)
>          if b in delim:
>              yield ''.join(buffer)
>              buffer = []
>          b = f.read(1)
>      if buffer:
>          yield ''.join(buffer)

At first sight, it's not surprising that it's slow when you throw 
generators and whatnot in there.

However those aren't the main reasons for the poor speed. The limiting 
factor here is reading one byte at a time. Just a loop like this:

    while f.read(1):
        pass

without doing anything else, seems to take most of the time: 3.6 
seconds, compared with 5.6 seconds for your readchunks(), on a 6MB 
version of your test file under Python 2.7. (readlines() took about 
0.2 seconds.)
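
If you want to check this sort of thing yourself, here is a rough sketch. It assumes Python 3 and an in-memory buffer rather than the real test file, and the names count_bytewise, count_blockwise and timed are made up for illustration. Actual timings will depend on the machine, the Python version and the file object used.

```python
import io
import time

def count_bytewise(f):
    """Consume f one byte per read() call and return the byte count."""
    n = 0
    while f.read(1):
        n += 1
    return n

def count_blockwise(f, blocksize=65536):
    """Consume f in large blocks and return the byte count."""
    n = 0
    while True:
        block = f.read(blocksize)
        if not block:
            return n
        n += len(block)

def timed(func, f):
    """Rewind f, run func(f), and return (result, elapsed seconds)."""
    f.seek(0)
    t0 = time.perf_counter()
    result = func(f)
    return result, time.perf_counter() - t0

if __name__ == "__main__":
    # Assumption: 1 MB of dummy data stands in for the real test file.
    data = io.BytesIO(b"x" * 1000000)
    for func in (count_bytewise, count_blockwise):
        n, secs = timed(func, data)
        print("%s: %d bytes in %.3fs" % (func.__name__, n, secs))
```

Both functions do nothing except read, so any difference between them is the cost of the read() calls themselves.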

Any faster solutions would need to read more than one byte at a time.
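
One possible way to do that, as a minimal sketch rather than a tuned solution: read a large block per call and split it on the delimiters with a regular expression. This assumes a file opened in text mode, a non-empty string of single-character delimiters, and Python 3. The name chunkiter_blocks and the blocksize parameter are my own inventions, not anything from the original code.

```python
import io
import re

def chunkiter_blocks(f, delims, blocksize=64 * 1024):
    """Yield pieces of a text file, each ending with one of the delimiters.

    Same output as the one-character-at-a-time chunkiter(), but the file
    is read blocksize characters per call.
    """
    # Character class matching any single delimiter, e.g. [!?]
    splitter = re.compile('[' + re.escape(delims) + ']')
    pending = []                      # parts of a chunk not yet terminated
    while True:
        block = f.read(blocksize)     # one call fetches many characters
        if not block:
            break
        start = 0
        for m in splitter.finditer(block):
            pending.append(block[start:m.end()])
            yield ''.join(pending)    # chunk includes its delimiter
            pending = []
            start = m.end()
        pending.append(block[start:]) # unterminated tail of this block
    tail = ''.join(pending)
    if tail:
        yield tail                    # trailing text with no delimiter

if __name__ == "__main__":
    f = io.StringIO("Hello! How are you? Fine")
    for chunk in chunkiter_blocks(f, "!?"):
        print(repr(chunk))
```

Each block is scanned only once, and a chunk that straddles a block boundary is carried over in pending until its delimiter turns up.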

(This bottleneck occurs in C too if you try to read a file using 
only fgetc(), compared with any buffered solution.)

-- 
bartc



