From: Tim Chase
Newsgroups: comp.lang.python
Subject: Re: Irregular last line in a text file, was Re: Regular expressions
Date: Tue, 3 Nov 2015 13:45:47 -0600

On 2015-11-03 11:39, Ian Kelly wrote:
> >> because I have countless loops that look something like
> >>
> >>   with open(...) as f:
> >>     for line in f:
> >>       line = line.rstrip('\r\n')
> >>       process(line)
> >
> > What would happen if you read a file opened like this without
> > iterating over lines?
>
> I think I'd go with this:
>
> >>> def strip_newlines(iterable):
> ...     for line in iterable:
> ...         yield line.rstrip('\r\n')
> ...

Behind the scenes, this is what I usually end up doing, but the
effective logic is the same. I just like the notion of being able to
tell open() that I want iteration to happen over the *content* of the
lines, ignoring the newline delimiters.

I can't think of more than 1-2 times in my last 10+ years of
Pythoning that I've actually had potential use for the newlines,
usually on account of simply feeding the entire line back into some
filelike.write() method where I wanted the newlines in the resulting
file. But even in those cases, I seem to recall stripping off the
arbitrary newlines (LF vs. CR/LF) and then adding my own known line
delimiter.

-tkc
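
For illustration, here is a minimal sketch of the strip-then-rejoin pattern described above. The strip_newlines generator is the one from the quoted message; rewrite_with_delimiter is a hypothetical helper name, and the io.StringIO objects are only stand-ins for real files, so treat this as one possible shape rather than a definitive implementation.

```python
import io


def strip_newlines(iterable):
    """Yield each line with any trailing CR/LF characters removed."""
    for line in iterable:
        yield line.rstrip('\r\n')


def rewrite_with_delimiter(src, dst, delimiter='\n'):
    """Hypothetical helper: copy lines from src to dst, replacing whatever
    line ending each line had (or lacked) with one known delimiter."""
    for line in strip_newlines(src):
        dst.write(line + delimiter)


# In-memory stand-ins for files (an assumption for the sake of the example):
# mixed CR/LF and LF endings, and a last line with no newline at all.
src = io.StringIO('alpha\r\nbeta\ngamma')
dst = io.StringIO()
rewrite_with_delimiter(src, dst)
result = dst.getvalue()
```

Because the delimiter is appended to every line, an irregular last line (one missing its newline) comes out terminated like all the others.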