Re: detecting newline character

Message-Id <2838616.PzL39ZcT7Z@PointedEars.de>
From Thomas 'PointedEars' Lahn <PointedEars@web.de>
Organization PointedEars Software (PES)
Date 2011-04-24 11:19 +0200
Subject Re: detecting newline character
Newsgroups comp.lang.python
References (1 earlier) <mailman.781.1303585944.9059.python-list@python.org> <1450471.fLXkozyCbt@PointedEars.de> <mailman.784.1303590332.9059.python-list@python.org> <4d21a2c2-ff43-488f-8d12-b17f0bb94da1@q21g2000vbs.googlegroups.com> <mailman.791.1303633309.9059.python-list@python.org>
Followup-To comp.lang.python




Daniel Geržo wrote:

> On 24.4.2011 9:05, jmfauth wrote:
>> Use the io module.
> 
> For the record, when I use io.open(file=self.path, mode="rt",
> encoding=enc)) as fobj:
> 
> my tests are passing and everything seems to work fine.
> 
> That indicates there is a bug with codecs module and universal newline
> support.

No, it proves that you either have not bothered to read the underlying 
source code and documentation (although it has been quoted to you), or have 
not understood it.

It is clear now that codecs.open() does not support universal newlines from 
at least Python 2.6 onward, as it is *documented* that it opens files in 
*binary mode* only.  The source code that I have posted shows that it 
therefore actively removes 'U' from the mode string when the `encoding' 
argument is passed, and always appends 'b' to the mode if not present.  As 
a result, __builtin__.open() is called without 'U' in the `mode' argument, 
which is *documented* to set file.newlines to None (regardless of whether 
Python was compiled with universal newline support).

<http://docs.python.org/library/stdtypes.html?highlight=newlines#file.newlines>
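
The mode handling described above can be sketched roughly as follows.  This 
is a paraphrase of the described behaviour, not the verbatim library source; 
the function name normalize_codecs_mode() is hypothetical and exists only 
for illustration.

```python
def normalize_codecs_mode(mode, encoding):
    """Hypothetical helper: return the mode string that would reach the
    built-in open() after the adjustments described above."""
    if encoding is not None:
        if 'U' in mode:
            # Universal newlines are dropped: no newline translation
            # is done when an encoding is requested.
            mode = mode.strip().replace('U', '')
            if mode[:1] not in set('rwa'):
                mode = 'r' + mode
        if 'b' not in mode:
            # Binary mode is forced.
            mode = mode + 'b'
    return mode


if __name__ == "__main__":
    for requested in ("rU", "U", "rt", "rb"):
        print(requested, "->", normalize_codecs_mode(requested, "utf-8"))
```

Under these assumptions a requested mode of "rU" ends up as "rb", so the 
underlying file object is never opened in universal newline mode.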

`io' is a more general module than `codecs'; therefore io.open() does not 
have those restrictions (but it has others – RTSL!¹).  Did you notice that 
your `mode' argument does not contain `b'?  Append it and you will see why 
this cannot work.
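
A minimal sketch of that experiment, assuming a current Python where 
io.open() validates the mode string: the helper check_mode() is 
hypothetical and uses a throwaway temporary file, so it does not depend on 
any path from the original code.

```python
import io
import os
import tempfile


def check_mode(mode, encoding="utf-8"):
    """Hypothetical helper: try io.open() with the given mode.

    Returns None if the open succeeds, otherwise the ValueError message.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        try:
            with io.open(path, mode=mode, encoding=encoding):
                return None
        except ValueError as exc:
            # io.open() rejects contradictory mode strings up front.
            return str(exc)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("rt  ->", check_mode("rt"))
    print("rtb ->", check_mode("rtb"))
```

With "rt" the call succeeds; with "rtb" io.open() raises ValueError because 
text and binary mode are requested at the same time.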

The bug, if any, is that codecs.open() does not check for your wrong `mode' 
argument, while io.open() does.
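
For comparison, here is a small sketch of how the io.open() call quoted at 
the top can report the line endings it has seen.  The helper 
detect_newlines() is hypothetical; it writes the given bytes to a temporary 
file, reads them back in text mode with the default newline handling, and 
returns the `newlines' attribute of the file object.

```python
import io
import os
import tempfile


def detect_newlines(data, encoding="utf-8"):
    """Hypothetical helper: return the newline style(s) io.open() saw."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as raw:
            raw.write(data)
        # Text mode, newline=None (the default): universal newlines.
        with io.open(path, mode="rt", encoding=encoding) as fobj:
            fobj.read()
            return fobj.newlines
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(repr(detect_newlines(b"one\r\ntwo\r\n")))
    print(repr(detect_newlines(b"one\ntwo\n")))
```

If the file contains more than one kind of line ending, `newlines' is a 
tuple of all kinds seen; before any line ending has been read it is None.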

_____
¹  RTSL: Read the Source, Luke!

-- 
PointedEars
