From: Tim Chase
Newsgroups: comp.lang.python
Subject: Re: Irregular last line in a text file, was Re: Regular expressions
Date: Wed, 4 Nov 2015 09:33:02 -0600
In-Reply-To: <56397dd9$0$1601$c3e8da3$5496439d@news.astraweb.com>

On 2015-11-04 14:39, Steven D'Aprano wrote:
> On Wednesday 04 November 2015 03:56, Tim Chase wrote:
>> Or even more valuable to me:
>>
>>   with open(..., newline="strip") as f:
>>     assert all(not line.endswith(("\n", "\r")) for line in f)
>
> # Works only on Windows text files.
> def chomp(lines):
>     for line in lines:
>         yield line.rstrip('\r\n')

The argument to .rstrip() is treated as a set of characters rather
than a literal suffix, so .rstrip('\r\n') removes any run of \r or \n
at the end of the string (so it works with both Windows & *nix
line-endings), whereas calling .rstrip() without a parameter strips
all trailing whitespace and can throw away data you might want:

  >>> "hello \r\n\r\r\n\n\n".rstrip("\r\n")
  'hello '
  >>> "hello \r\n\r\r\n\n\n".rstrip()
  'hello'

-tkc
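
For anyone who wants to try the quoted chomp() on mixed input, here is a minimal sketch. The generator is the one quoted above; the sample list is made up purely for illustration and is not from the thread.

```python
def chomp(lines):
    """Yield each line with trailing CR/LF characters removed."""
    for line in lines:
        # "\r\n" is used as a set of characters to strip, not as a suffix,
        # so "\r\n", "\n" and "\r" endings are all handled. Other trailing
        # whitespace (spaces, tabs) is left alone.
        yield line.rstrip("\r\n")


# Hypothetical sample data with Windows, *nix and bare-CR endings,
# plus a final line that has no terminator at all.
sample = ["windows  \r\n", "unix\t\n", "bare cr\r", "no terminator"]
cleaned = list(chomp(sample))
print(cleaned)
# ['windows  ', 'unix\t', 'bare cr', 'no terminator']
```

Note that the trailing spaces and the tab survive, which is the point of passing an explicit argument instead of calling .rstrip() bare.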