| Path | csiph.com!fu-berlin.de!uni-berlin.de!not-for-mail |
|---|---|
| From | Chris Angelico <rosuav@gmail.com> |
| Newsgroups | comp.lang.python |
| Subject | Re: How to read from a file to an arbitrary delimiter efficiently? |
| Date | Thu, 25 Feb 2016 18:30:25 +1100 |
| Lines | 35 |
| Message-ID | <mailman.116.1456385901.20994.python-list@python.org> (permalink) |
| References | <56cea44e$0$11128$c3e8da3@news.astraweb.com> |
| Mime-Version | 1.0 |
| Content-Type | text/plain; charset=UTF-8 |
On Thu, Feb 25, 2016 at 5:50 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
>
> # Read a chunk of bytes/characters from an open file.
> def chunkiter(f, delim):
>     buffer = []
>     b = f.read(1)
>     while b:
>         buffer.append(b)
>         if b in delim:
>             yield ''.join(buffer)
>             buffer = []
>         b = f.read(1)
>     if buffer:
>         yield ''.join(buffer)

How bad is it if you over-read? If it's absolutely critical that you
not read anything from the buffer that you shouldn't, then yeah, it's
going to be slow. But if you're never going to read the file using
anything other than this iterator, the best thing to do is to read
more at a time. Simple and naive method:
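
    import re

    def chunkiter(f, delim):
        """Don't use [ or ] as the delimiter, kthx"""
        buffer = ""
        b = f.read(256)
        while b:
            buffer += b
            # Yield the complete chunks; keep the trailing partial one in buffer.
            *parts, buffer = re.split("["+delim+"]", buffer)
            yield from parts
            b = f.read(256)  # read the next block, otherwise the loop never ends
        if buffer:
            yield buffer
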
How well does that perform?
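
If the delimiter restriction is a nuisance, here is a minimal sketch of the same read-a-block-and-split idea with the delimiter characters passed through `re.escape`, so that `[`, `]`, `^` and `-` are safe to use. The name `chunkiter_escaped` and the `chunk_size` parameter are made up for illustration, and the sketch assumes `delims` is a non-empty string and `f` is a file opened in text mode. Like the version above, it drops the delimiters and over-reads from the file.

```python
import io
import re


def chunkiter_escaped(f, delims, chunk_size=256):
    """Yield text from f split on any character in delims (delimiters dropped)."""
    # re.escape makes characters such as ] ^ - \ safe inside a character class.
    pattern = re.compile("[" + re.escape(delims) + "]")
    buffer = ""
    while True:
        block = f.read(chunk_size)
        if not block:
            break
        buffer += block
        # Everything before the last delimiter is complete; the tail may continue.
        *parts, buffer = pattern.split(buffer)
        yield from parts
    if buffer:
        yield buffer


# A tiny chunk_size forces delimiters to straddle block boundaries.
sample = io.StringIO("alpha]beta[gamma;delta")
print(list(chunkiter_escaped(sample, "[];", chunk_size=4)))
# ['alpha', 'beta', 'gamma', 'delta']
```

Compiling the pattern once keeps the per-block work down to a single split call; a larger `chunk_size` trades memory for fewer reads.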
ChrisA