Path: csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!goblin2!goblin.stu.neva.ru!newsfeed.xs4all.nl!newsfeed3.news.xs4all.nl!xs4all!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail
MIME-Version: 1.0
In-Reply-To: <51EE6C15.5070701@swing.be>
References: <gksnu85fe69utl50s4e1tik0bhinndls3m@4ax.com> <mailman.4956.1374418117.3114.python-list@python.org> <pb2ou8577r2ahshc0h3oeqoe25dei9gv1o@4ax.com> <mailman.4966.1374428821.3114.python-list@python.org> <hbhou89g1ie8igtrpl24ajibc2251ea1po@4ax.com> <mailman.4969.1374452915.3114.python-list@python.org> <368qu85msgfhuk2j2s13qj0bqn4rkcint9@4ax.com> <mailman.4973.1374496190.3114.python-list@python.org> <at9qu8tq4gbp5rfgd7mq2eo8ui6uh3vnvg@4ax.com> <CAPTjJmrVChLrcAZvDj4frqEFiy=gJzKP3mGnCRxkM6t6euxcVQ@mail.gmail.com> <51ED3CEB.1070706@gmail.com> <mailman.4980.1374502532.3114.python-list@python.org> <XnsA2065B2C39831duncanbooth@127.0.0.1> <mailman.4997.1374571174.3114.python-list@python.org> <XnsA2067049ADEF7duncanbooth@127.0.0.1> <51EE6C15.5070701@swing.be>
Date: Tue, 23 Jul 2013 08:39:59 -0400
Subject: Re: Strange behaviour with os.linesep
From: Jason Swails <jason.swails@gmail.com>
To: Vincent Vande Vyvre <vincent.vandevyvre@swing.be>
Content-Type: multipart/alternative; boundary=485b397dd6b750657d04e22d1703
Cc: python list <python-list@python.org>
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.5002.1374583207.3114.python-list@python.org>
Lines: 159
NNTP-Posting-Host: 2001:888:2000:d::a6
Xref: csiph.com comp.lang.python:51083

--485b397dd6b750657d04e22d1703
Content-Type: text/plain; charset=ISO-8859-1

On Tue, Jul 23, 2013 at 7:42 AM, Vincent Vande Vyvre <
vincent.vandevyvre@swing.be> wrote:

> On Windows, when a script is written with the system line separator as its
> line ending, the file opens double-spaced in Eric4, Notepad++ or Gedit, but
> it is displayed correctly in MS Notepad (Bloc-Notes).
>
> Example with this code:
> ----------------------------------------------
> # -*- coding: utf-8 -*-
>
> import os
> L_SEP = os.linesep
>
> def write():
>     strings = ['# -*- coding: utf-8 -*-\n',
>                 'import os\n',
>                 'import sys\n']
>     with open('writetest.py', 'w') as outf:
>         for s in strings:
>             outf.write(s.replace('\n', L_SEP))
>

I must ask why you build the strings with a '\n' line ending only to
replace it later with os.linesep.  This seems convoluted compared to
doing something like

def write():
    strings = ['#-*- coding: utf-8 -*-', 'import os', 'import sys']
    with open('writetest.py', 'w') as outf:
        for s in strings:
            outf.write(s)
            outf.write(L_SEP)

Or something equivalent.

If, however, the source strings come from a file you've created somewhere
(and are loaded by reading that file line by line), then I can see a
problem.  DOS line endings are a carriage return followed by a newline
('\r\n'), whereas standard UNIX files use just a newline ('\n').  Therefore,
if you are using the code:

s.replace('\n', L_SEP)

on Windows, on a Windows-generated file, then you are likely converting
the sequence '\r\n' into '\r\r\n', which is not what you want.  I can
imagine some text editors interpreting that as two line endings (a lone
'\r' followed by '\r\n').  Indeed, when I execute the code:

>>> l = open('test.txt', 'w')
>>> l.write('This is the first line\r\r\n')
>>> l.write('This is the second\r\r\n')
>>> l.close()

on UNIX and open the resulting file in gedit, it is double-spaced, but if I
just dump it to the screen using 'cat', it is single-spaced.
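
To make the difference concrete, here is a minimal sketch of why the same
bytes can look double-spaced in one tool and single-spaced in another.  It
assumes the editor treats a lone '\r' as a line break of its own (which
str.splitlines() also does), while 'cat' only breaks on '\n'.  The sample
strings are made up for illustration.

```python
# A line read from a Windows-style file, with its '\r\n' ending still attached.
line = 'import os\r\n'

# Replacing '\n' with the Windows separator doubles the carriage return.
converted = line.replace('\n', '\r\n')
print(repr(converted))  # 'import os\r\r\n'

text = 'first\r\r\nsecond\r\r\n'

# str.splitlines() treats '\r', '\n' and '\r\n' each as a line boundary,
# so every '\r\r\n' yields an extra empty line (the "double-spaced" view).
editor_view = text.splitlines()
print(editor_view)  # ['first', '', 'second', '']

# Splitting on '\n' only gives one entry per line (the 'cat' view).
cat_view = text.split('\n')[:-1]
print(cat_view)  # ['first\r\r', 'second\r\r']
```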

If you want to make your code a bit more cross-platform, you should strip
any trailing line-ending characters from the strings before you write
them.  So something like this:

with open('writetest.py', 'w') as outf:
    for s in strings:
        outf.write(s.rstrip('\r\n'))
        outf.write(L_SEP)
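
One caveat, sketched below under an assumption about how the file is
opened: in text mode Python itself translates '\n' to the platform
separator on write, so on Windows writing os.linesep through a text-mode
file may produce '\r\r\n' all by itself.  The sketch uses Python 3's
open(..., newline='\r\n') to mimic Windows text-mode behaviour on any
platform; the helper name written_bytes and the temporary file are
hypothetical, only there for the demonstration.

```python
import os
import tempfile

def written_bytes(chunks, newline):
    """Write chunks to a temporary text-mode file and return the raw bytes."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, 'w', newline=newline) as outf:
            for chunk in chunks:
                outf.write(chunk)
        with open(path, 'rb') as inf:
            return inf.read()
    finally:
        os.remove(path)

# newline='\r\n' mimics Windows text mode: each '\n' written becomes '\r\n'.
doubled = written_bytes(['import os', '\r\n'], newline='\r\n')
clean = written_bytes(['import os', '\n'], newline='\r\n')
print(repr(doubled))  # b'import os\r\r\n'
print(repr(clean))    # b'import os\r\n'
```

If that matches what you see on Windows, writing a plain '\n' in text mode
and letting Python do the translation should avoid the extra '\r'.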

Hope this helps,
Jason


--485b397dd6b750657d04e22d1703--