Re: How to read from a file to an arbitrary delimiter efficiently?

From Dennis Lee Bieber <wlfraed@ix.netcom.com>
Newsgroups comp.lang.python
Subject Re: How to read from a file to an arbitrary delimiter efficiently?
Date 2016-02-27 12:03 -0500
Organization IISS Elusive Unicorn
Message-ID <mailman.183.1456592632.20994.python-list@python.org>
References <56cea44e$0$11128$c3e8da3@news.astraweb.com> <mailman.115.1456385844.20994.python-list@python.org> <56d17d13$0$1596$c3e8da3$5496439d@news.astraweb.com>


On Sat, 27 Feb 2016 21:40:17 +1100, Steven D'Aprano <steve@pearwood.info>
declaimed the following:

>Wow. Ten years and still no solution :-(
>
>Thanks for finding the issue, but the solutions given don't suit my use
>case. I don't want an iterator that operates on pre-read blocks, I want
>something that will read a record from a file, and leave the file pointer
>one entry past the end of the record.
>
>Oh, and records are likely fairly short, but there may be a lot of them.

	Considering that most of the world has settled on the view that files
are just linear streams (curse you, UNIX), anything working with "records"
has to build the concept on top of the stream: either by making records
"fixed width" (allowing for fast random access: recNum*recLen => seek
position), though likely giving up simple stream access, or by wrapping the
stream with something that does parsing/buffering.
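
	Both approaches can be sketched in a few lines of Python. This is only
an illustration under stated assumptions, not a standard-library facility:
the names read_fixed_record and read_until, the blocksize default, and the
requirement of a seekable file opened in binary mode are all assumptions
made for the sketch.

```python
import io


def read_fixed_record(f, rec_num, rec_len):
    """Fixed-width records: jump straight to record rec_num."""
    f.seek(rec_num * rec_len)          # recNum * recLen => seek position
    data = f.read(rec_len)
    if len(data) != rec_len:
        raise EOFError("record %d is missing or truncated" % rec_num)
    return data


def read_until(f, delim, blocksize=64):
    """Delimited records: read one record from a seekable binary file.

    Reads in blocks, then seeks back so the file position is left just
    past the delimiter. Returns the record without the delimiter, or
    whatever remains (possibly b"") if EOF is reached first.
    """
    start = f.tell()
    buf = b""
    while True:
        block = f.read(blocksize)
        if not block:
            # EOF: the remainder is an unterminated final record.
            return buf
        # Re-scan a short tail in case the delimiter straddles two blocks.
        search_from = max(0, len(buf) - len(delim) + 1)
        buf += block
        pos = buf.find(delim, search_from)
        if pos >= 0:
            # Undo the read-ahead: position just past the delimiter.
            f.seek(start + pos + len(delim))
            return buf[:pos]


if __name__ == "__main__":
    fixed = io.BytesIO(b"AAAABBBBCCCC")
    print(read_fixed_record(fixed, 2, 4))

    delimited = io.BytesIO(b"one;;two;;three")
    print(read_until(delimited, b";;", blocksize=4), delimited.tell())
```

	The seek back is what leaves the file pointer just past the record, at
the cost of re-reading part of each block on the next call. Note that
read_until returns b"" both for an empty record and at end of file; a real
wrapper would need to tell those apart.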

	In the old days, in my world, the first was more common -- after all,
the "common" input method was 80-column Hollerith cards; a record consisted
of one card (or a set of cards), and one handled whatever was on that
multiple of 80 characters. My college computer system, by default, used an ISAM
structure for editor text files -- but that was a system where the ISAM
overhead was handled transparently by the OS, not a user-level linked
library (how many libraries are there for ISAM access in C?), so even
simple "type"/"print" commands properly displayed the contents.

	Another format is the Pascal-style counted string saved as file
contents, in which each "record" is prefixed with a length code. While not
as fast as fixed-length records, it does allow for rather fast scanning of
a file by reading the length field, then seeking that many bytes further
before reading the next length. But again, the I/O library has to retain
knowledge of what the record length was, and how far into a record one has
advanced (if not doing full record I/O) so that one recognizes the next
length field.
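
	A minimal sketch of the length-prefixed layout, assuming a 4-byte
little-endian length field (the original format is not specified, so that
width and byte order, along with the names write_record, read_record and
skip_record, are hypothetical choices for illustration):

```python
import io
import struct

LEN_FMT = "<I"                         # assumed: 4-byte little-endian length
LEN_SIZE = struct.calcsize(LEN_FMT)


def write_record(f, payload):
    """Write the length field, then the payload bytes."""
    f.write(struct.pack(LEN_FMT, len(payload)))
    f.write(payload)


def _read_length(f):
    """Return the next length field, or None at a clean end of file."""
    header = f.read(LEN_SIZE)
    if not header:
        return None
    if len(header) != LEN_SIZE:
        raise EOFError("truncated length field")
    (length,) = struct.unpack(LEN_FMT, header)
    return length


def read_record(f):
    """Return the next record's payload, or None at end of file."""
    length = _read_length(f)
    if length is None:
        return None
    payload = f.read(length)
    if len(payload) != length:
        raise EOFError("truncated record")
    return payload


def skip_record(f):
    """Skip the next record using only its length field; False at EOF."""
    length = _read_length(f)
    if length is None:
        return False
    f.seek(length, io.SEEK_CUR)        # hop over the payload without reading
    return True


if __name__ == "__main__":
    f = io.BytesIO()
    for p in (b"alpha", b"", b"gamma!"):
        write_record(f, p)
    f.seek(0)
    skip_record(f)                     # skip b"alpha"
    print(read_record(f), read_record(f))
```

	Because the functions always consume whole records, they never lose
track of where the next length field is. Partial-record reads would need
extra state for how many bytes of the current record remain, which is the
bookkeeping the paragraph above describes.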

	I will admit that I miss the idea of OS support for higher-level file
structures (even the TRS-80 had OS support for fixed-length random-access
files -- and not by wasting the rest of a disk sector; the OS did the
packing/unpacking of shorter records into the sectors).
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
    wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/
