| Path | csiph.com!x330-a1.tempe.blueboxinc.net!newsfeed.hal-mli.net!feeder3.hal-mli.net!news.glorb.com!solaris.cc.vt.edu!news.vt.edu!newsfeed-00.mathworks.com!panix!roy |
|---|---|
| From | Roy Smith <roy@panix.com> |
| Newsgroups | comp.lang.python |
| Subject | Re: Record seperator |
| Date | Sat, 27 Aug 2011 13:45:31 -0400 |
| Organization | PANIX Public Access Internet and UNIX, NYC |
| Lines | 30 |
| Message-ID | <roy-F7BDDC.13453127082011@news.panix.com> |
| References | <slrnj5fo7u.4ra.greymausg@hmaus.org> <mailman.451.1314385354.27778.python-list@python.org> <slrnj5i1g9.581.greymausg@hmaus.org> <4e592852$0$29965$c3e8da3$5496439d@news.astraweb.com> |
| NNTP-Posting-Host | localhost |
| X-Trace | reader1.panix.com 1314467133 23280 127.0.0.1 (27 Aug 2011 17:45:33 GMT) |
| X-Complaints-To | abuse@panix.com |
| NNTP-Posting-Date | Sat, 27 Aug 2011 17:45:33 +0000 (UTC) |
| User-Agent | MT-NewsWatcher/3.5.3b3 (Intel Mac OS X) |
| Xref | x330-a1.tempe.blueboxinc.net comp.lang.python:12283 |
In article <4e592852$0$29965$c3e8da3$5496439d@news.astraweb.com>,
Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
> open("file.txt") # opens the file
> .read() # reads the contents of the file
> .split("\n\n") # splits the text on double-newlines.
The biggest problem with this code is that read() slurps the entire file
into a string. That's fine for moderately sized files, but will fail
(or at least be grossly inefficient) for very large files.
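One way around the slurp, as a rough sketch only: iterate over the file line by line and group the lines into records wherever a blank line appears. The name `iter_records` is made up for illustration, and the behaviour is assumed rather than identical to `split("\n\n")`: runs of several blank lines are treated as a single separator, and whitespace-only lines count as blank.

```python
import io
from itertools import groupby


def iter_records(lines):
    """Yield blank-line-separated records from an iterable of lines.

    Hypothetical helper: only one record is held in memory at a time.
    """
    # groupby clusters consecutive lines by whether they are blank.
    for is_blank, group in groupby(lines, key=lambda line: line.strip() == ""):
        if not is_blank:
            # Join the record's lines and drop the trailing newline.
            yield "".join(group).rstrip("\n")


# Usage with an in-memory file; a real file object works the same way.
sample = io.StringIO("a\nb\n\nc\n\n\nd\n")
for record in iter_records(sample):
    print(repr(record))
```

Because a file object is itself an iterator over lines, `iter_records(open("file.txt"))` never needs the whole file in memory.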
It's always annoyed me a little that while it's easy to iterate over the
lines of a file, it's more complicated to iterate over a file character
by character. You could write your own generator to do that:

    for c in getchar(open("file.txt")):
        whatever

    def getchar(f):
        for line in f:
            for c in line:
                yield c

but that's annoyingly verbose (and probably not hugely efficient).
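A somewhat shorter variant, sketched under the assumption of Python 3.3 or later (for `yield from`) and a file opened in text mode: read fixed-size chunks and yield their characters. The `chunk_size` parameter is an addition for illustration and is not part of the generator above.

```python
import io
from functools import partial


def getchar(f, chunk_size=8192):
    """Yield the characters of a text-mode file, reading chunk_size at a time."""
    # iter(callable, sentinel) keeps calling f.read(chunk_size) until it
    # returns "" (end of file). This assumes text mode: a binary file
    # returns b"" at EOF, which never equals the "" sentinel.
    for chunk in iter(partial(f.read, chunk_size), ""):
        yield from chunk


# Usage with an in-memory file standing in for open("file.txt").
for c in getchar(io.StringIO("ab\ncd"), chunk_size=2):
    print(repr(c))
```

Reading in chunks also avoids holding a very long line in memory, which the line-based version cannot guarantee. For the line-based behaviour, `itertools.chain.from_iterable(f)` gives the same result as the nested loops in one expression.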
Of course, the next obstacle for the specific task at hand is that
even with an iterator over the characters of a file, split() only works
on strings. It would be nice to have a version of split which took an
iterable and returned an iterator over the split components. Maybe
there is such a thing and I'm just missing it?
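For what it's worth, such a function is short to write by hand. The following is a minimal sketch, not a standard-library function; the name `isplit` is made up. It accepts any iterable of string pieces (single characters, lines, or chunks) and is intended to yield the same components as `str.split(sep)` would on the concatenated text.

```python
def isplit(iterable, sep):
    """Lazily split an iterable of string pieces on sep.

    Hypothetical helper: yields components one at a time instead of
    building the full list as str.split(sep) does.
    """
    if not sep:
        raise ValueError("empty separator")
    buf = ""
    for piece in iterable:
        buf += piece
        # Emit every complete component currently in the buffer.
        while True:
            head, found, tail = buf.partition(sep)
            if not found:
                break
            yield head
            buf = tail
    # Whatever follows the last separator is the final component.
    yield buf


# Usage: split a character stream on double newlines.
for record in isplit(iter("one\ntwo\n\nthree\n\nfour"), "\n\n"):
    print(repr(record))
```

The buffer is rescanned on every piece, so feeding it single characters is slow for long records; passing lines or chunks instead keeps the cost reasonable. Only the current, incomplete component is held in memory.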