| From | Roy Smith <roy@panix.com> |
|---|---|
| Newsgroups | comp.lang.python |
| Subject | Re: Record seperator |
| Date | 2011-08-27 13:45 -0400 |
| Organization | PANIX Public Access Internet and UNIX, NYC |
| Message-ID | <roy-F7BDDC.13453127082011@news.panix.com> |
| References | <slrnj5fo7u.4ra.greymausg@hmaus.org> <mailman.451.1314385354.27778.python-list@python.org> <slrnj5i1g9.581.greymausg@hmaus.org> <4e592852$0$29965$c3e8da3$5496439d@news.astraweb.com> |
In article <4e592852$0$29965$c3e8da3$5496439d@news.astraweb.com>,
Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
> open("file.txt") # opens the file
> .read() # reads the contents of the file
> .split("\n\n") # splits the text on double-newlines.
The biggest problem with this code is that read() slurps the entire file
into a string. That's fine for moderately sized files, but will fail
(or at least be grossly inefficient) for very large files.
It's always annoyed me a little that while it's easy to iterate over the
lines of a file, it's more complicated to iterate over a file character
by character. You could write your own generator to do that:
    for c in getchar(open("file.txt")):
        whatever

    def getchar(f):
        for line in f:
            for c in line:
                yield c
but that's annoyingly verbose (and probably not hugely efficient).
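One possibly cheaper variant, sketched here assuming Python 3 and a file opened in text mode, reads fixed-size chunks rather than lines, so a file with very long lines (or none at all) never has to be held in memory at once. The name iter_chars and the chunk_size default are made up for illustration; I have not measured whether this is actually faster.

```python
from functools import partial


def iter_chars(f, chunk_size=8192):
    """Yield the characters of a text-mode file object one at a time.

    Hypothetical helper: reads at most chunk_size characters per call,
    so memory use stays bounded regardless of line length.
    """
    # iter(callable, sentinel) keeps calling f.read(chunk_size) until it
    # returns the empty string, which is what read() gives at end of file.
    for chunk in iter(partial(f.read, chunk_size), ""):
        yield from chunk
```

Usage is the same as with getchar: `for c in iter_chars(open("file.txt")): ...`.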
Of course, the next obstacle for the specific problem at hand is that
even with an iterator over the characters of a file, split() only works
on strings. It would be nice to have a version of split which took an
iterable and returned an iterator over the split components. Maybe
there is such a thing and I'm just missing it?
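As far as I know the standard library has nothing that does this directly, but a generator along these lines is short to write. This is only a sketch, assuming Python 3; isplit is a made-up name, not an existing function, and it is meant to mirror str.split(sep) with an explicit, non-empty separator.

```python
def isplit(chars, sep):
    """Lazily split an iterable of characters on the string sep.

    Hypothetical helper: yields the same pieces as
    "".join(chars).split(sep), but only buffers one piece at a time.
    """
    if not sep:
        raise ValueError("empty separator")
    sep_chars = list(sep)
    n = len(sep_chars)
    buf = []
    for c in chars:
        buf.append(c)
        # When the buffer ends with the separator, emit what precedes it.
        if buf[-n:] == sep_chars:
            del buf[-n:]
            yield "".join(buf)
            buf = []
    # Whatever is left after the last separator is the final piece.
    yield "".join(buf)
```

Combined with the getchar generator above, records could then be read one at a time with `for record in isplit(getchar(open("file.txt")), "\n\n"): ...`. Only the current record is held in memory, though a single enormous record would still be buffered whole.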