From: Jason Swails
To: Vincent Vande Vyvre
Cc: python list
Newsgroups: comp.lang.python
Subject: Re: Strange behaviour with os.linesep
Date: Tue, 23 Jul 2013 08:39:59 -0400


On Tue, Jul 23, 2013 at 7:42 AM, Vincent Vande Vyvre <vincent.vandevyvre@swing.be> wrote:
> On Windows a script where de endline are the system line sep, the files
> are open with a double line in Eric4, Notepad++ or Gedit but they are
> correctly displayed in the MS Bloc-Notes.
>
> Example with this code:
> ----------------------------------------------
> # -*- coding: utf-8 -*-
>
> import os
> L_SEP = os.linesep
>
> def write():
>     strings = ['# -*- coding: utf-8 -*-\n',
>                'import os\n',
>                'import sys\n']
>     with open('writetest.py', 'w') as outf:
>         for s in strings:
>             outf.write(s.replace('\n', L_SEP))

I must ask why you are creating strings with a newline line ending only to replace it later with os.linesep. This seems convoluted compared to doing something like

def write():
    strings = ['# -*- coding: utf-8 -*-', 'import os', 'import sys']
    # newline='' (Python 3) turns off text-mode newline translation, so
    # L_SEP is written as-is instead of becoming '\r\r\n' on Windows
    with open('writetest.py', 'w', newline='') as outf:
        for s in strings:
            outf.write(s)
            outf.write(L_SEP)

Or something equivalent.

If, however, the source strings come from a file you've created somewhere (and are loaded by reading in that file line by line), then I can see a problem. DOS line endings are a carriage return followed by a line feed ('\r\n'), whereas standard UNIX files use just a line feed ('\n'). Therefore, if you are using the code:

s.replace('\n', L_SEP)

in Windows, using a Windows-generated file, then what you are likely doing is converting the string sequence '\r\n' into '\r\r\n', which is not what you want to do. A file opened in text mode on Windows has the same effect by itself: it translates every '\n' it writes into '\r\n', so writing os.linesep explicitly also ends up as '\r\r\n'. I can imagine some text editors interpreting that as two line endings (since there are 2 \r's). Indeed, when I execute the code:

>>> l = open('test.txt', 'w')
>>> l.write('This is the first line\r\r\n')
>>> l.write('This is the second\r\r\n')
>>> l.close()

on UNIX and open the resulting file in gedit, it is double-spaced, but if I just dump it to the screen using 'cat', it is single-spaced.
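
The same effect can be reproduced without an editor. A minimal sketch, assuming Python 3; the sample text is made up for illustration and is not from the original script. The replace turns each '\r\n' into '\r\r\n', and str.splitlines(), which counts a lone '\r' as a line break much as some editors seem to, then reports an empty line after every real one:

```python
# Illustrative text with DOS line endings (not from the original script).
dos_text = 'first line\r\nsecond line\r\n'

# The replacement from the quoted script, using the Windows value of
# os.linesep ('\r\n') so the result is the same on any platform.
doubled = dos_text.replace('\n', '\r\n')
print(repr(doubled))

# splitlines() treats a lone '\r' as a line break, so every real line
# is followed by an empty one.
print(doubled.splitlines())
```

A tool that only breaks lines on '\n', as 'cat' output on a terminal effectively does, would show the same data single-spaced.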

If you want to make your code a bit more cross-platform, you should strip out all types of end-of-line characters from the strings before you write them. So something like this:

# newline='' (Python 3) keeps text mode from expanding L_SEP any further
with open('writetest.py', 'w', newline='') as outf:
    for s in strings:
        outf.write(s.rstrip('\r\n'))
        outf.write(L_SEP)
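
A variation on the same idea, as a minimal sketch assuming Python 3 (the newline argument is not accepted by Python 2's built-in open). Since text mode by default translates every '\n' it writes into the platform line ending, it may be enough to strip the old endings and write a plain '\n'. The helper write_lines and its parameters are names made up for this example:

```python
import os
import tempfile

def write_lines(path, lines, newline=None):
    # Hypothetical helper: strip whatever line ending each string has,
    # then write a single '\n'. With newline=None, text mode translates
    # that '\n' to os.linesep; pass '\n' or '\r\n' to force one ending.
    with open(path, 'w', newline=newline) as outf:
        for s in lines:
            outf.write(s.rstrip('\r\n'))
            outf.write('\n')

# Usage: mixed input endings, DOS endings forced on output.
strings = ['# -*- coding: utf-8 -*-\r\n', 'import os\n', 'import sys']
path = os.path.join(tempfile.mkdtemp(), 'writetest.py')
write_lines(path, strings, newline='\r\n')

# Read the file back in binary mode to see the exact bytes written.
with open(path, 'rb') as inf:
    raw = inf.read()
print(repr(raw))
```

Reading the result back in binary mode is a quick way to check that each line ends with exactly one '\r\n' rather than '\r\r\n'.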

Hope this helps,
Jason