Re: Reading in cooked mode (was Re: Python MSI not installing, log file showing name of a Viatnemese communist revolutionary)

Started by: Cameron Simpson <cs@zip.com.au>
First post: 2014-03-23 13:16 +1100
Last post: 2014-03-23 12:22 +0200
Articles: 2 (2 participants)

This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


Contents

  Re: Reading in cooked mode (was Re: Python MSI not installing, log file showing name of a Viatnemese communist revolutionary) Cameron Simpson <cs@zip.com.au> - 2014-03-23 13:16 +1100
    Re: Reading in cooked mode Marko Rauhamaa <marko@pacujo.net> - 2014-03-23 12:22 +0200

#68801 — Re: Reading in cooked mode (was Re: Python MSI not installing, log file showing name of a Viatnemese communist revolutionary)

From: Cameron Simpson <cs@zip.com.au>
Date: 2014-03-23 13:16 +1100
Subject: Re: Reading in cooked mode (was Re: Python MSI not installing, log file showing name of a Viatnemese communist revolutionary)
Message-ID: <mailman.8411.1395541003.18130.python-list@python.org>

On 23Mar2014 12:37, Chris Angelico <rosuav@gmail.com> wrote:
> On Sun, Mar 23, 2014 at 12:07 PM, Steven D'Aprano
> <steve+comp.lang.python@pearwood.info> wrote:
> > On Sun, 23 Mar 2014 02:09:20 +1100, Chris Angelico wrote:
> >> On Sun, Mar 23, 2014 at 1:50 AM, Steven D'Aprano
> >> <steve+comp.lang.python@pearwood.info> wrote:
> >>> Line endings are terminators: they end the line. Whether you consider
> >>> the terminator part of the line or not is a matter of opinion (is the
> >>> cover of a book part of the book?) but consider this:
> >>>
> >>>     If you say that the end of lines are *not* part of the line, then
> >>>     that implies that some parts of the file are not inside any line at
> >>>     all. And that would be just weird.
> >>
> >> Not so weird IMO. A file is not a concatenation of lines; it is a stream
> >> of bytes.
> >
> > But a *text file* is a concatenation of lines. The "text file" model is
> > important enough that nearly all programming languages offer a line-based
> > interface to files, and some (Python at least, possibly others) make it
> > the default interface so that iterating over the file gives you lines
> > rather than bytes -- even in "binary" mode.
> 
> And lines are delimited entities. A text file is a sequence of lines,
> separated by certain characters.
[...snip...]

As far as I'm concerned, a text file is a sequence of lines, each of
which is _terminated_ by a newline (or the OS end-of-line flavour).

So I say "terminated by", not "separated by".

Plenty of people use editors that consider end-of-line to be a
separator and not a terminator, leading to supposed text files
lacking trailing newlines (or end-of-line of OS).

I consider this sloppy and error prone.

I like to be able to read a file and if it lacks a final newline
then I have a good clue that the file was incompletely written.
Editors (and other tools) that won't enforce a trailing newline are
omitting an easy way to give a fairly robust indication of completion
at no benefit to the user. (Not to mention the visual annoyance of
"cat file" when there's no trailing newline.)

So I'm happy to write code that errors if a line lacks a trailing
newline, and thus I consider the newline to be an integral part
of the line.
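
A minimal sketch of such a check, assuming Python 3. The helper name
checked_lines is hypothetical and not from the original post. It yields
each line and raises if one lacks its trailing newline, which in
practice can only happen on the final line of a file.

```python
import io


def checked_lines(fp):
    """Yield lines from a text file object, raising ValueError if a line
    lacks its trailing newline (hypothetical helper, for illustration)."""
    for lineno, line in enumerate(fp, 1):
        if not line.endswith("\n"):
            raise ValueError("line %d: missing trailing newline" % lineno)
        yield line


# In-memory files stand in for real ones here.
complete = list(checked_lines(io.StringIO("alpha\nbeta\n")))

try:
    list(checked_lines(io.StringIO("alpha\nbet")))
    truncated_detected = False
except ValueError:
    # The second "file" was cut off before its final newline.
    truncated_detected = True
```

Raising rather than silently accepting the short line is the design
choice described above: a missing final newline is treated as evidence
of an incomplete write.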

Having passed that sanity check, for most machine readable text
formats I'm usually happy to use:

  line = line.rstrip()

to get the salient part of the line.
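
One detail worth keeping in mind, shown as a small sketch with a made-up
sample string: rstrip() with no argument removes all trailing
whitespace, not only the newline, whereas rstrip("\n") removes only
trailing newline characters.

```python
raw = "key = value  \t\n"  # sample data, invented for illustration

# No argument: strips every trailing whitespace character,
# including spaces, tabs and the newline.
stripped_all = raw.rstrip()

# Explicit argument: strips only trailing newline characters,
# leaving the spaces and the tab in place.
stripped_newline = raw.rstrip("\n")
```

For formats where trailing blanks are significant, the second form is
probably the safer choice.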

(Of course, lines extended with slosh-extension or the like need
pickier handling.)
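
As a rough sketch of that pickier handling, assuming the simplest
possible rule (a backslash immediately before the newline joins the next
physical line onto the current one). The helper name join_continuations
is hypothetical, and real formats often differ in how they treat
whitespace around the backslash.

```python
def join_continuations(lines):
    """Yield logical lines, merging physical lines that end in a backslash
    (hypothetical helper; the continuation rule is an assumption)."""
    pending = ""
    for line in lines:
        # Strip only the newline so other trailing whitespace survives.
        text = line.rstrip("\n")
        if text.endswith("\\"):
            # Drop the backslash and keep accumulating.
            pending += text[:-1]
        else:
            yield pending + text
            pending = ""
    if pending:
        # Input ended in the middle of a continued line.
        yield pending


logical = list(join_continuations(["a = 1 + \\\n", "    2\n", "b = 3\n"]))
```

Here the first two physical lines become one logical line and the third
passes through unchanged.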

Cheers,
-- 
Cameron Simpson <cs@zip.com.au>

If at first you don't succeed, your sky-diving days are over.
        - Paul Blumstein, paulb@harley.tti.com, DoD #36


#68805 — Re: Reading in cooked mode

From: Marko Rauhamaa <marko@pacujo.net>
Date: 2014-03-23 12:22 +0200
Subject: Re: Reading in cooked mode
Message-ID: <87eh1tjju4.fsf@elektro.pacujo.net>
In reply to: #68801

Cameron Simpson <cs@zip.com.au>:

> Plenty of people use editors that consider end-of-line to be a
> separator and not a terminator, leading to supposed text files lacking
> trailing newlines (or end-of-line of OS).

I use an editor (emacs) that considers the end-of-line to be a byte
among others.

> I consider this sloppy and error prone.

If any editor is smart, emacs is, but it generally doesn't insert
characters on its own. I like it that way.

> So I'm happy to write code that errors if a line lacks a trailing
> newline, and thus I consider the newline to be an integral part of
> the line.

For sure, any file reader must think the situation through. Note, for
example, that CPython doesn't require the source code file to end in a
newline.
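
A quick way to see this, as a sketch assuming Python 3 (the source
string and the "<example>" filename are made up for illustration):
compile() accepts source text whose last line has no newline.

```python
source = "answer = 6 * 7"  # note: no trailing newline

# compile() in "exec" mode accepts the unterminated final line.
code = compile(source, "<example>", "exec")

namespace = {}
exec(code, namespace)
```

After this runs, namespace["answer"] holds the computed value, so the
missing newline made no difference to the compiler.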


Marko
