Re: when does newlines get set in universal newlines mode?

From Peter Otten <__peter__@web.de>
Subject Re: when does newlines get set in universal newlines mode?
Date 2015-05-04 17:17 +0200
Organization None
References <3c45772b-77e0-4c17-8b3d-aa246c4b511c@googlegroups.com> <mi7n22$mbc$1@ger.gmane.org> <CAPTjJmp_5uZE1Zm2DVOG1CHDGYr9i1jLAOVaRaOVf8eXkE=btw@mail.gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.91.1430752649.12865.python-list@python.org>


Chris Angelico wrote:

> On Mon, May 4, 2015 at 10:01 PM, Peter Otten <__peter__@web.de> wrote:
>> I tried:
>>
>>>>> with open("tmp.txt", "wb") as f: f.write("alpha\r\nbeta\rgamma\n")
>> ...
>>>>> f = open("tmp.txt", "rU")
>>>>> f.newlines
>>>>> f.readline()
>> 'alpha\n'
>>>>> f.newlines
>> # expected: '\r\n'
>>>>> f.readline()
>> 'beta\n'
>>>>> f.newlines
>> '\r\n' # expected: ('\r', '\r\n')
>>>>> f.readline()
>> 'gamma\n'
>>>>> f.newlines
>> ('\r', '\n', '\r\n')
>>
>> I believe this is a bug.
> 
> I'm not sure it is, actually; imagine the text is coming in one
> character at a time (eg from a pipe), and it's seen "alpha\r". It
> knows that this is a line, so it emits it; but until the next
> character is read, it can't know whether it's going to be \r or \r\n.
> What should it do? Read another character, which might block? Put "\r"
> into .newlines, which might be wrong? Once it sees the \n, it knows
> that it was \r\n (or rather, it assumes that files do not have lines
> of text terminated by \r followed by blank lines terminated by \n -
> because that would be stupid).
> 
> It may be worth documenting this limitation, but it's not something
> that can easily be fixed without removing support for \r newlines -
> although that might be an option, given that non-OSX Macs are
> basically history now.
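
The lookahead problem described in the quoted text can be reproduced in isolation. The following is a minimal sketch, not something from the thread: it assumes Python 3 and drives io.IncrementalNewlineDecoder (the helper the io module uses for universal newlines) by hand. The chunk boundaries are hypothetical and stand in for data trickling in from a pipe.

```python
import io

# decoder=None means the input is already str; translate=True maps every
# newline flavour to "\n", as universal newlines mode does.
dec = io.IncrementalNewlineDecoder(None, True)

# Hypothetical chunking: the first chunk stops right after the "\r",
# so the decoder cannot yet tell "\r" from "\r\n".
chunks = ["alpha\r", "\n", "beta\r", "gamma\n"]

seen = []
for chunk in chunks:
    out = dec.decode(chunk)
    seen.append((chunk, out, dec.newlines))
    print(repr(chunk), "->", repr(out), "| newlines =", repr(dec.newlines))
```

In this sketch the decoder holds back a trailing "\r" until the next chunk arrives, so newlines stays None after "alpha\r" and only becomes '\r\n' once the "\n" has been seen. The Python 2 file object in the session above emits the line earlier, but it faces the same ambiguity.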

OK, you convinced me. Then I tried:

>>> with open("tmp.txt", "wb") as f: f.write("0\r\n3\r5\n7")
... 
>>> assert len(open("tmp.txt", "rb").read()) == 8
>>> f = open("tmp.txt", "rU")
>>> f.readline()
'0\n'
>>> f.newlines
>>> f.tell()
3
>>> f.newlines
'\r\n'

Hm, so tell() moves the file pointer? Is that sane?


