Groups > comp.lang.python > #98121 > unrolled thread
| Started by | Seymore4Head <Seymore4Head@Hotmail.invalid> |
|---|---|
| First post | 2015-11-02 20:09 -0500 |
| Last post | 2015-11-03 22:15 +0000 |
| Articles | 6 on this page of 106 — 30 participants |
Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-02 20:09 -0500
Re: Regular expressions MRAB <python@mrabarnett.plus.com> - 2015-11-03 01:19 +0000
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-02 22:17 -0500
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-02 20:42 -0600
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-02 22:17 -0500
Re: Regular expressions Joel Goldstick <joel.goldstick@gmail.com> - 2015-11-02 22:58 -0500
Re: Regular expressions rurpy@yahoo.com - 2015-11-02 20:23 -0800
Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-02 21:38 -0700
Re: Regular expressions rurpy@yahoo.com - 2015-11-03 16:33 -0800
Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-03 19:04 -0700
Re: Regular expressions Dan Sommers <dan@tombstonezero.net> - 2015-11-04 02:55 +0000
Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 14:23 +1100
Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-03 20:47 -0700
Re: Regular expressions Grant Edwards <invalid@invalid.invalid> - 2015-11-04 13:27 +0000
Re: Regular expressions Nobody <nobody@nowhere.invalid> - 2015-11-04 05:05 +0000
Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-04 09:57 +0100
Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 13:28 +1100
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-04 20:48 -0600
Re: Regular expressions Ben Finney <ben+python@benfinney.id.au> - 2015-11-05 14:03 +1100
Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-05 09:33 +0100
Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 23:05 +1100
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-05 08:00 -0600
Re: Regular expressions Albert van der Horst <albert@spenarnc.xs4all.nl> - 2015-11-05 13:39 +0000
Re: Regular expressions Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-11-04 08:00 -0500
Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-04 08:13 -0700
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-04 18:00 -0500
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 16:24 -0800
Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 13:24 +1100
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 21:59 -0800
Re: Regular expressions Christian Gollwitzer <auriocus@gmx.de> - 2015-11-05 09:18 +0100
Re: Regular expressions rurpy@yahoo.com - 2015-11-06 11:52 -0800
Re: Regular expressions Christian Gollwitzer <auriocus@gmx.de> - 2015-11-06 21:36 +0100
Re: Regular expressions Larry Martell <larry.martell@gmail.com> - 2015-11-06 15:42 -0500
Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-05 11:34 +1100
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 22:27 -0800
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-04 09:42 -0600
Re: Regular expressions Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2015-11-05 20:55 +1300
Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-05 19:06 +1100
What does “grep” stand for? (was: Regular expressions) Ben Finney <ben+python@benfinney.id.au> - 2015-11-05 05:24 +1100
Re: What does “grep” stand for? Christian Gollwitzer <auriocus@gmx.de> - 2015-11-04 20:38 +0100
Re: What does “grep” stand for? Chris Angelico <rosuav@gmail.com> - 2015-11-05 11:42 +1100
Re: What does “grep” stand for? Christian Gollwitzer <auriocus@gmx.de> - 2015-11-05 08:32 +0100
Re: What does “grep” stand for? Chris Angelico <rosuav@gmail.com> - 2015-11-05 19:00 +1100
Re: What does “grep” stand for? Random832 <random832@fastmail.com> - 2015-11-05 10:19 -0500
Re: What does “grep” stand for? Grant Edwards <invalid@invalid.invalid> - 2015-11-05 18:29 +0000
Re: What does “grep” stand for? Random832 <random832@fastmail.com> - 2015-11-05 14:56 -0500
Re: What does “grep” stand for? Grant Edwards <invalid@invalid.invalid> - 2015-11-05 20:19 +0000
Re: What does “grep” stand for? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-11-05 20:18 -0500
Re: What does “grep” stand for? Larry Hudson <orgnut@yahoo.com> - 2015-11-05 19:36 -0800
Re: What does “grep” stand for? Dan Sommers <dan@tombstonezero.net> - 2015-11-06 05:31 +0000
Re: What does “grep” stand for? William Ray Wing <wrw@mac.com> - 2015-11-06 08:25 -0500
Re: What does “grep” stand for? Larry Hudson <orgnut@yahoo.com> - 2015-11-06 19:21 -0800
Re: What does “grep” stand for? Grant Edwards <invalid@invalid.invalid> - 2015-11-06 14:15 +0000
Re: What does “grep” stand for? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-11-06 20:03 -0500
Re: What does “grep” stand for? (was: Regular expressions) Tim Chase <python.list@tim.thechases.com> - 2015-11-04 13:05 -0600
Re: Regular expressions Terry Reedy <tjreedy@udel.edu> - 2015-11-04 18:08 -0500
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-04 18:29 -0500
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 21:12 -0600
Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-04 14:26 +1100
Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 14:48 +1100
Re: Regular expressions Christian Gollwitzer <auriocus@gmx.de> - 2015-11-04 08:21 +0100
Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 19:47 +1100
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 06:43 -0800
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 06:38 -0800
Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-05 01:52 +1100
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 16:13 -0800
Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-05 11:33 +1100
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 21:42 -0800
Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 13:26 +1100
Re: Regular expressions Ben Finney <ben+python@benfinney.id.au> - 2015-11-05 14:07 +1100
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 21:54 -0800
Re: Regular expressions Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2015-11-05 10:14 +0100
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-04 18:02 -0500
Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 11:54 +1100
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-05 10:07 -0500
Re: Regular expressions rurpy@yahoo.com - 2015-11-06 12:46 -0800
Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-03 18:15 +1100
Re: Regular expressions Nick Sarbicki <nick.a.sarbicki@gmail.com> - 2015-11-03 08:43 +0000
Re: Regular expressions rurpy@yahoo.com - 2015-11-03 16:22 -0800
Re: Regular expressions Denis McMahon <denismfmcmahon@gmail.com> - 2015-11-03 12:38 +0000
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 05:53 -0600
Re: Regular expressions Joel Goldstick <joel.goldstick@gmail.com> - 2015-11-03 10:34 -0500
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-03 11:10 -0500
Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-04 03:20 +1100
Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 14:35 +1100
Re: Regular expressions Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2015-11-04 12:41 +0100
Re: Regular expressions Grant Edwards <invalid@invalid.invalid> - 2015-11-03 14:56 +0000
Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-02 20:51 -0700
Re: Regular expressions rurpy@yahoo.com - 2015-11-02 20:23 -0800
Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-02 21:33 -0700
Re: Regular expressions Robin Koch <robin.koch@t-online.de> - 2015-11-03 23:58 +0100
Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-03 10:25 +0100
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 05:50 -0600
Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-03 15:00 +0100
Re: Regular expressions Jussi Piitulainen <harvesting@makes.email.invalid> - 2015-11-03 17:12 +0200
Irregular last line in a text file, was Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-03 16:35 +0100
Re: Irregular last line in a text file, was Re: Regular expressions Jussi Piitulainen <harvesting@makes.email.invalid> - 2015-11-03 18:42 +0200
Re: Irregular last line in a text file, was Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 10:56 -0600
Re: Irregular last line in a text file, was Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 14:39 +1100
Re: Irregular last line in a text file, was Re: Regular expressions Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2015-11-04 10:07 +0000
Re: Irregular last line in a text file, was Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-04 09:33 -0600
Re: Irregular last line in a text file, was Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-03 18:44 +0100
Re: Irregular last line in a text file, was Re: Regular expressions Ian Kelly <ian.g.kelly@gmail.com> - 2015-11-03 11:33 -0700
Re: Irregular last line in a text file, was Re: Regular expressions Ian Kelly <ian.g.kelly@gmail.com> - 2015-11-03 11:39 -0700
Re: Irregular last line in a text file, was Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 13:45 -0600
Re: Irregular last line in a text file, was Re: Regular expressions Grant Edwards <invalid@invalid.invalid> - 2015-11-03 22:15 +0000
| From | Tim Chase <python.list@tim.thechases.com> |
|---|---|
| Date | 2015-11-04 09:33 -0600 |
| Subject | Re: Irregular last line in a text file, was Re: Regular expressions |
| Message-ID | <mailman.20.1446651776.16136.python-list@python.org> |
| In reply to | #98205 |
On 2015-11-04 14:39, Steven D'Aprano wrote:
> On Wednesday 04 November 2015 03:56, Tim Chase wrote:
>> Or even more valuable to me:
>>
>> with open(..., newline="strip") as f:
>>     assert all(not line.endswith(("\n", "\r")) for line in f)
>
> # Works only on Windows text files.
> def chomp(lines):
>     for line in lines:
>         yield line.rstrip('\r\n')
.rstrip() takes a string that is a set of characters, so it will
remove any \r or \n at the end of the string (so it works with
both Windows & *nix line-endings) whereas just using .rstrip()
without a parameter can throw away data you might want:
>>> "hello \r\n\r\r\n\n\n".rstrip("\r\n")
'hello '
>>> "hello \r\n\r\r\n\n\n".rstrip()
'hello'
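To make the difference concrete, here is a minimal sketch. The `strip_eol` helper is hypothetical (it is not in the standard library) and is included only to contrast "remove one line terminator" with the character-set behaviour of `.rstrip("\r\n")`:

```python
def strip_eol(line):
    # Hypothetical helper: remove at most ONE trailing line terminator.
    # "\r\n" is tested first so a Windows ending is removed as a unit.
    for eol in ("\r\n", "\n", "\r"):
        if line.endswith(eol):
            return line[:-len(eol)]
    return line

sample = "hello \r\n\r\r\n\n\n"

all_stripped = sample.rstrip("\r\n")     # drops every trailing \r and \n
one_stripped = strip_eol(sample)         # drops only the final "\n"
whitespace_stripped = sample.rstrip()    # also drops the trailing space
```

For lines coming out of a file iterator there is at most one terminator at the end, so in that setting `.rstrip("\r\n")` and a helper like `strip_eol` should give the same result.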
-tkc
| From | Peter Otten <__peter__@web.de> |
|---|---|
| Date | 2015-11-03 18:44 +0100 |
| Subject | Re: Irregular last line in a text file, was Re: Regular expressions |
| Message-ID | <mailman.40.1446572684.8789.python-list@python.org> |
| In reply to | #98163 |
Tim Chase wrote:
> On 2015-11-03 16:35, Peter Otten wrote:
>> I wish there were a way to prohibit such files. Maybe a special
>> value
>>
>> with open(..., newline="normalize") as f:
>>     assert all(line.endswith("\n") for line in f)
>>
>> to ensure that all lines end with "\n"?
>
> Or even more valuable to me:
>
> with open(..., newline="strip") as f:
>     assert all(not line.endswith(("\n", "\r")) for line in f)
>
> because I have countless loops that look something like
>
> with open(...) as f:
>     for line in f:
>         line = line.rstrip('\r\n')
>         process(line)
Indeed. It's obvious now you're saying it...
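For reference, `open()` currently accepts only `None`, `''`, `'\n'`, `'\r'` and `'\r\n'` for `newline`, so `"normalize"` and `"strip"` above are wished-for values, not existing ones. The check behind the "normalize" idea can be approximated today; the sketch below uses `io.StringIO` as a stand-in for a real file, and `all_lines_terminated` is a made-up name:

```python
import io

def all_lines_terminated(f):
    # Hypothetical helper: True if every line yielded by f ends with "\n".
    return all(line.endswith("\n") for line in f)

regular = io.StringIO("one\ntwo\n")
irregular = io.StringIO("one\ntwo")    # last line has no terminator

regular_ok = all_lines_terminated(regular)
irregular_ok = all_lines_terminated(irregular)
```

Only the last line of a file can lack a terminator, so a `False` result here points at an irregular final line.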
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2015-11-03 11:33 -0700 |
| Subject | Re: Irregular last line in a text file, was Re: Regular expressions |
| Message-ID | <mailman.44.1446575677.8789.python-list@python.org> |
| In reply to | #98163 |
On Tue, Nov 3, 2015 at 9:56 AM, Tim Chase <python.list@tim.thechases.com> wrote:
> On 2015-11-03 16:35, Peter Otten wrote:
>> I wish there were a way to prohibit such files. Maybe a special
>> value
>>
>> with open(..., newline="normalize") as f:
>>     assert all(line.endswith("\n") for line in f)
>>
>> to ensure that all lines end with "\n"?
>
> Or even more valuable to me:
>
> with open(..., newline="strip") as f:
>     assert all(not line.endswith(("\n", "\r")) for line in f)
>
> because I have countless loops that look something like
>
> with open(...) as f:
>     for line in f:
>         line = line.rstrip('\r\n')
>         process(line)
What would happen if you read a file opened like this without
iterating over lines?
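The question has teeth because the non-iterating read methods hand back the text in one piece. A small sketch of the three views of the same data, again with `io.StringIO` standing in for an open file:

```python
import io

text = "one\ntwo\nthree\n"

whole = io.StringIO(text).read()     # a single string, newlines embedded
by_line = list(io.StringIO(text))    # iteration: terminators still attached
split = whole.splitlines()           # str.splitlines() drops the terminators
```

A hypothetical "strip" mode would have to say which of these `read()` should resemble: returning `whole` unchanged would make the mode apply only to iteration, while removing the newlines would glue the lines together.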
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2015-11-03 11:39 -0700 |
| Subject | Re: Irregular last line in a text file, was Re: Regular expressions |
| Message-ID | <mailman.45.1446576019.8789.python-list@python.org> |
| In reply to | #98163 |
On Tue, Nov 3, 2015 at 11:33 AM, Ian Kelly <ian.g.kelly@gmail.com> wrote:
> On Tue, Nov 3, 2015 at 9:56 AM, Tim Chase <python.list@tim.thechases.com> wrote:
>> Or even more valuable to me:
>>
>> with open(..., newline="strip") as f:
>>     assert all(not line.endswith(("\n", "\r")) for line in f)
>>
>> because I have countless loops that look something like
>>
>> with open(...) as f:
>>     for line in f:
>>         line = line.rstrip('\r\n')
>>         process(line)
>
> What would happen if you read a file opened like this without
> iterating over lines?
I think I'd go with this:
>>> def strip_newlines(iterable):
...     for line in iterable:
...         yield line.rstrip('\r\n')
...
>>> list(strip_newlines(['one\n', 'two\r', 'three']))
['one', 'two', 'three']
Or if I care about optimizing the for loop (but we're talking about
file I/O, so probably not), this might be faster:
>>> import operator
>>> def strip_newlines(iterable):
...     return map(operator.methodcaller('rstrip', '\r\n'), iterable)
...
>>> list(strip_newlines(['one\n', 'two\r', 'three']))
['one', 'two', 'three']
Then the iteration is just:
for line in strip_newlines(f):
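Spelled out as a complete script, that loop might look like the sketch below. The `io.StringIO` object stands in for a real file, and collecting into `seen` is only a placeholder for whatever per-line processing is wanted:

```python
import io
import operator

def strip_newlines(iterable):
    # Lazily strip trailing \r and \n from each line (map is lazy in Python 3).
    return map(operator.methodcaller("rstrip", "\r\n"), iterable)

# Mixed endings; the default StringIO splits on "\n" and leaves "\r" in place.
f = io.StringIO("one\r\ntwo\nthree")

seen = []
for line in strip_newlines(f):
    seen.append(line)    # placeholder for process(line)
```

Because the wrapper works on any iterable of strings, the same function serves for files, lists of lines, or generator output.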
| From | Tim Chase <python.list@tim.thechases.com> |
|---|---|
| Date | 2015-11-03 13:45 -0600 |
| Subject | Re: Irregular last line in a text file, was Re: Regular expressions |
| Message-ID | <mailman.48.1446581076.8789.python-list@python.org> |
| In reply to | #98163 |
On 2015-11-03 11:39, Ian Kelly wrote:
> >> because I have countless loops that look something like
> >>
> >> with open(...) as f:
> >>     for line in f:
> >>         line = line.rstrip('\r\n')
> >>         process(line)
> >
> > What would happen if you read a file opened like this without
> > iterating over lines?
>
> I think I'd go with this:
>
> >>> def strip_newlines(iterable):
> ...     for line in iterable:
> ...         yield line.rstrip('\r\n')
> ...
Behind the scenes, this is what I usually end up doing, but the
effective logic is the same. I just like the notion of being able to
tell open() that I want iteration to happen over the *content* of
the lines, ignoring the new-line delimiters.
I can't think of more than 1-2 times in my last 10+ years of
Pythoning that I've actually had potential use for the newlines,
usually on account of simply feeding the entire line back into some
filelike.write() method where I wanted the newlines in the resulting
file. But even in those cases, I seem to recall stripping off the
arbitrary newlines (LF vs. CR/LF) and then adding my own known line
delimiter.
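A rough sketch of that strip-then-re-add pattern follows; `with_known_eol` is a made-up name, and `io.StringIO` stands in for the output file:

```python
import io

def with_known_eol(lines, eol="\n"):
    # Drop whatever terminator each line arrived with, then append a known one.
    for line in lines:
        yield line.rstrip("\r\n") + eol

source = ["alpha\r\n", "beta\n", "gamma"]    # mixed endings, last line bare

out = io.StringIO()
out.writelines(with_known_eol(source))
result = out.getvalue()
```

One side effect worth noting: a final line that arrived without a terminator gains one, which may or may not be what is wanted.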
-tkc
| From | Grant Edwards <invalid@invalid.invalid> |
|---|---|
| Date | 2015-11-03 22:15 +0000 |
| Subject | Re: Irregular last line in a text file, was Re: Regular expressions |
| Message-ID | <n1bbmu$qse$1@reader1.panix.com> |
| In reply to | #98188 |
On 2015-11-03, Tim Chase <python.list@tim.thechases.com> wrote:
[re. iterating over lines in a file]
> I can't think of more than 1-2 times in my last 10+ years of
> Pythoning that I've actually had potential use for the newlines,
If you can think of 1-2 times when you've been iterating over the
lines in a file and wanted to see the EOL markers, then that's 1-2
times more than I've ever wanted to see them since I started using
Python 16 years ago...
--
Grant Edwards               grant.b.edwards        Yow! ! Up ahead! It's a
                                  at               DONUT HUT!!
                              gmail.com