Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #103483

Re: How to read from a file to an arbitrary delimiter efficiently?

From Chris Angelico <rosuav@gmail.com>
Newsgroups comp.lang.python
Subject Re: How to read from a file to an arbitrary delimiter efficiently?
Date 2016-02-25 18:30 +1100
Message-ID <mailman.116.1456385901.20994.python-list@python.org> (permalink)
References <56cea44e$0$11128$c3e8da3@news.astraweb.com>

Show all headers | View raw


On Thu, Feb 25, 2016 at 5:50 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
>
> # Read a chunk of bytes/characters from an open file.
> def chunkiter(f, delim):
>     buffer = []
>     b = f.read(1)
>     while b:
>         buffer.append(b)
>         if b in delim:
>             yield ''.join(buffer)
>             buffer = []
>         b = f.read(1)
>     if buffer:
>         yield ''.join(buffer)

How bad is it if you over-read? If it's absolutely critical that you
not read anything from the buffer that you shouldn't, then yeah, it's
going to be slow. But if you're never going to read the file using
anything other than this iterator, the best thing to do is to read
more at a time. Simple and naive method:

def chunkiter(f, delim):
    """Don't use [ or ] as the delimiter, kthx"""
    buffer = ""
    b = f.read(256)
    while b:
        buffer += b
        *parts, buffer = re.split("["+delim+"]", buffer)
        yield from parts
    if buffer: yield buffer

How well does that perform?

ChrisA

Back to comp.lang.python | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

How to read from a file to an arbitrary delimiter efficiently? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2016-02-25 17:50 +1100
  Re: How to read from a file to an arbitrary delimiter efficiently? Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> - 2016-02-25 08:37 +0100
    Re: How to read from a file to an arbitrary delimiter efficiently? Steven D'Aprano <steve@pearwood.info> - 2016-02-27 21:40 +1100
      Re: How to read from a file to an arbitrary delimiter efficiently? Dan Sommers <dan@tombstonezero.net> - 2016-02-27 14:40 +0000
      Re: How to read from a file to an arbitrary delimiter efficiently? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2016-02-27 12:03 -0500
        Re: How to read from a file to an arbitrary delimiter efficiently? Marko Rauhamaa <marko@pacujo.net> - 2016-02-27 19:47 +0200
  Re: How to read from a file to an arbitrary delimiter efficiently? Chris Angelico <rosuav@gmail.com> - 2016-02-25 18:30 +1100
    Re: How to read from a file to an arbitrary delimiter efficiently? Steven D'Aprano <steve@pearwood.info> - 2016-02-27 20:49 +1100
      Re: How to read from a file to an arbitrary delimiter efficiently? Chris Angelico <rosuav@gmail.com> - 2016-02-27 23:17 +1100
      Re: How to read from a file to an arbitrary delimiter efficiently? Chris Angelico <rosuav@gmail.com> - 2016-02-27 23:18 +1100
      Re: How to read from a file to an arbitrary delimiter efficiently? Serhiy Storchaka <storchaka@gmail.com> - 2016-02-27 17:23 +0200
  Re: How to read from a file to an arbitrary delimiter efficiently? Paul Rubin <no.email@nospam.invalid> - 2016-02-24 23:48 -0800
    Re: How to read from a file to an arbitrary delimiter efficiently? wxjmfauth@gmail.com - 2016-02-25 06:37 -0800
    Re: How to read from a file to an arbitrary delimiter efficiently? wxjmfauth@gmail.com - 2016-02-25 06:38 -0800
  Re: How to read from a file to an arbitrary delimiter efficiently? BartC <bc@freeuk.com> - 2016-02-27 16:35 +0000
    Re: How to read from a file to an arbitrary delimiter efficiently? BartC <bc@freeuk.com> - 2016-02-27 20:03 +0000
      Re: How to read from a file to an arbitrary delimiter efficiently? BartC <bc@freeuk.com> - 2016-02-27 20:28 +0000
  Re: How to read from a file to an arbitrary delimiter efficiently? Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2016-02-28 20:28 +0000
  Re: How to read from a file to an arbitrary delimiter efficiently? Tim Delaney <timothy.c.delaney@gmail.com> - 2016-02-29 08:00 +1100

csiph-web