Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #98364
| X-Received | by 10.68.217.8 with SMTP id ou8mr12747981pbc.4.1446839553518; Fri, 06 Nov 2015 11:52:33 -0800 (PST) |
|---|---|
| X-Received | by 10.50.25.131 with SMTP id c3mr295843igg.6.1446839553483; Fri, 06 Nov 2015 11:52:33 -0800 (PST) |
| Path | csiph.com!au2pb.net!usenet.blueworldhosting.com!feeder01.blueworldhosting.com!peer03.iad.highwinds-media.com!news.highwinds-media.com!feed-me.highwinds-media.com!i2no453129igv.0!news-out.google.com!z4ni1035ign.0!nntp.google.com!i2no453128igv.0!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail |
| Newsgroups | comp.lang.python |
| Date | Fri, 6 Nov 2015 11:52:33 -0800 (PST) |
| In-Reply-To | <n1f39d$o44$1@dont-email.me> |
| Complaints-To | groups-abuse@google.com |
| Injection-Info | glegroupsg2000goo.googlegroups.com; posting-host=65.183.81.59; posting-account=h7ZmAgoAAAChnNHfIDF1M6YGHuonryl5 |
| NNTP-Posting-Host | 65.183.81.59 |
| References | <mailman.3.1446523111.8789.python-list@python.org> <d39290cf-cb26-470f-a987-2f71e3860f97@googlegroups.com> <mailman.5.1446525488.8789.python-list@python.org> <bb15756d-7181-421d-835e-b2fbfc1c1774@googlegroups.com> <mailman.0.1446602668.16136.python-list@python.org> <n1bs2g$376$1@dont-email.me> <56397a18$0$11094$c3e8da3@news.astraweb.com> <56397FC6.9040700@gmail.com> <n1ch8v$irh$1@ger.gmane.org> <mailman.19.1446650043.16136.python-list@python.org> <3f3l3bpm478nbnsec6c9tr6rre0aontkq1@4ax.com> <5e15df62-00b1-4746-83f8-c0821514d20b@googlegroups.com> <563abdda$0$1614$c3e8da3$5496439d@news.astraweb.com> <2fd7d161-b1cb-4274-b8dc-0157916413f1@googlegroups.com> <n1f39d$o44$1@dont-email.me> |
| User-Agent | G2/1.0 |
| MIME-Version | 1.0 |
| Message-ID | <a379f9ca-dd27-412c-a005-bfef9b9e6abc@googlegroups.com> (permalink) |
| Subject | Re: Regular expressions |
| From | rurpy@yahoo.com |
| Injection-Date | Fri, 06 Nov 2015 19:52:33 +0000 |
| Content-Type | text/plain; charset=ISO-8859-1 |
| X-Received-Bytes | 5405 |
| X-Received-Body-CRC | 766479522 |
| Xref | csiph.com comp.lang.python:98364 |
Show key headers only | View raw
On 11/05/2015 01:18 AM, Christian Gollwitzer wrote:
> Am 05.11.15 um 06:59 schrieb rurpy:
>>> Can you call yourself a well-rounded programmer without at least
>>> a basic understanding of some regex library? Well, probably not.
>>> But that's part of the problem with regexes. They have, to some
>>> degree, driven out potentially better -- or at least differently
>>> bad -- pattern matching solutions, such as (E)BNF grammars,
>>> SNOBOL pattern matching, or lowly globbing patterns. Or even
>>> alternative idioms, like Hypercard's "chunking" idioms.
>>
>> Hmm, very good point. I wonder why all those "potentially better"
>> solutions have not been more widely adopted? A conspiracy by a
>> secret regex cabal?
>
> I'm mostly on the pro-side of the regex discussion, but this IS a
> valid point. regexes are not always a good way to express a pattern,
> even if the pattern is regular. The point is, that you can't build
> them up easily piece-by-piece. Say, you want a regex like "first an
> international phone number, then a name, then a second phone number"
> - you will have to *repeat* the pattern for phone number twice. In
> more complex cases this can become a nightmare, like the monster that
> was mentioned before to validate an email.
>
> A better alternative, then, is PEG for example. You can easily write
> [...]
That is the solution adopted by Perl 6. I have always thought lexing
and parsing solutions for Python were a weak spot in the Python eco-
system and I was about to write that I would love to see a PEG parser
for python when I saw this:
http://fdik.org/pyPEG/
Unfortunately it suffers from the same problem that Pyparsing, Ply
and the rest suffer from: they use Python syntax to express the
parsing rules rather than using a dedicated problem-specific syntax
such as you used to illustrate peg parsing:
> pattern <- phone_number name phone_number phone_number <- '+' [0-9]+
> ( '-' [0-9]+ )* name <- [[:alpha:]]+
Some here have complained about excessive brevity of regexs but I
much prefer using problem-specific syntax like "(a*)" to having to
express a pattern using python with something like
star = RegexMatchAny()
a_group = RegexGroup('a' + star)
...
and I don't want to have to do something similar with PEG (or Ply
or Pyparsing) to formulate their rules.
>[...]
> As a 12 year old, not knowing anything about pattern recognition, but
> thinking I was the king, as is usual for boys in that age, I sat down
> and manually constructed a recursive descent parser in a BASIC like
> language. It had 1000 lines and took me a few weeks to get it
> correct. Finally the solution was accepted as working, but my
> participation was rejected because the solutions lacked
> documentation. 16 years later I used the problem for a course on
> string processing (that's what the PDF is for), and asked the
> students to solve it using regexes. My own solution consists of 67
> characters, and it took me5 minutes to write it down.
>
> Admittedly, this problem is constructed, but solving similar tasks by
> regexes is still something that I need to do on a daily basis, when I
> get data from other scientists in odd formats and I need to
> preprocess them. I know people who use a spreadsheet and copy/paste
> millions of datapoints manually becasue they lack the knowledge of
> using such tools.
I think in many cases those most hostile to regexes are the also
those who use them (or need to use them) the least. While my use
of regexes are limited to fairly simple ones they are complicated
enough that I'm sure it would take orders of magnitude longer
to get the same effect in python.
Back to comp.lang.python | Previous | Next — Previous in thread | Next in thread | Find similar | Unroll thread
Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-02 20:09 -0500
Re: Regular expressions MRAB <python@mrabarnett.plus.com> - 2015-11-03 01:19 +0000
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-02 22:17 -0500
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-02 20:42 -0600
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-02 22:17 -0500
Re: Regular expressions Joel Goldstick <joel.goldstick@gmail.com> - 2015-11-02 22:58 -0500
Re: Regular expressions rurpy@yahoo.com - 2015-11-02 20:23 -0800
Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-02 21:38 -0700
Re: Regular expressions rurpy@yahoo.com - 2015-11-03 16:33 -0800
Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-03 19:04 -0700
Re: Regular expressions Dan Sommers <dan@tombstonezero.net> - 2015-11-04 02:55 +0000
Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 14:23 +1100
Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-03 20:47 -0700
Re: Regular expressions Grant Edwards <invalid@invalid.invalid> - 2015-11-04 13:27 +0000
Re: Regular expressions Nobody <nobody@nowhere.invalid> - 2015-11-04 05:05 +0000
Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-04 09:57 +0100
Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 13:28 +1100
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-04 20:48 -0600
Re: Regular expressions Ben Finney <ben+python@benfinney.id.au> - 2015-11-05 14:03 +1100
Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-05 09:33 +0100
Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 23:05 +1100
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-05 08:00 -0600
Re: Regular expressions Albert van der Horst <albert@spenarnc.xs4all.nl> - 2015-11-05 13:39 +0000
Re: Regular expressions Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-11-04 08:00 -0500
Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-04 08:13 -0700
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-04 18:00 -0500
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 16:24 -0800
Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 13:24 +1100
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 21:59 -0800
Re: Regular expressions Christian Gollwitzer <auriocus@gmx.de> - 2015-11-05 09:18 +0100
Re: Regular expressions rurpy@yahoo.com - 2015-11-06 11:52 -0800
Re: Regular expressions Christian Gollwitzer <auriocus@gmx.de> - 2015-11-06 21:36 +0100
Re: Regular expressions Larry Martell <larry.martell@gmail.com> - 2015-11-06 15:42 -0500
Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-05 11:34 +1100
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 22:27 -0800
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-04 09:42 -0600
Re: Regular expressions Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2015-11-05 20:55 +1300
Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-05 19:06 +1100
What does “grep” stand for? (was: Regular expressions) Ben Finney <ben+python@benfinney.id.au> - 2015-11-05 05:24 +1100
Re: What does “grep” stand for? Christian Gollwitzer <auriocus@gmx.de> - 2015-11-04 20:38 +0100
Re: What does “grep” stand for? Chris Angelico <rosuav@gmail.com> - 2015-11-05 11:42 +1100
Re: What does “grep” stand for? Christian Gollwitzer <auriocus@gmx.de> - 2015-11-05 08:32 +0100
Re: What does “grep” stand for? Chris Angelico <rosuav@gmail.com> - 2015-11-05 19:00 +1100
Re: What does “grep” stand for? Random832 <random832@fastmail.com> - 2015-11-05 10:19 -0500
Re: What does “grep” stand for? Grant Edwards <invalid@invalid.invalid> - 2015-11-05 18:29 +0000
Re: What does “grep” stand for? Random832 <random832@fastmail.com> - 2015-11-05 14:56 -0500
Re: What does “grep” stand for? Grant Edwards <invalid@invalid.invalid> - 2015-11-05 20:19 +0000
Re: What does “grep” stand for? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-11-05 20:18 -0500
Re: What does “grep” stand for? Larry Hudson <orgnut@yahoo.com> - 2015-11-05 19:36 -0800
Re: What does “grep” stand for? Dan Sommers <dan@tombstonezero.net> - 2015-11-06 05:31 +0000
Re: What does “grep” stand for? William Ray Wing <wrw@mac.com> - 2015-11-06 08:25 -0500
Re: What does “grep” stand for? Larry Hudson <orgnut@yahoo.com> - 2015-11-06 19:21 -0800
Re: What does ???grep??? stand for? Grant Edwards <invalid@invalid.invalid> - 2015-11-06 14:15 +0000
Re: What does ???grep??? stand for? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-11-06 20:03 -0500
Re: What does “grep” stand for? (was: Regular expressions) Tim Chase <python.list@tim.thechases.com> - 2015-11-04 13:05 -0600
Re: Regular expressions Terry Reedy <tjreedy@udel.edu> - 2015-11-04 18:08 -0500
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-04 18:29 -0500
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 21:12 -0600
Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-04 14:26 +1100
Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 14:48 +1100
Re: Regular expressions Christian Gollwitzer <auriocus@gmx.de> - 2015-11-04 08:21 +0100
Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 19:47 +1100
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 06:43 -0800
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 06:38 -0800
Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-05 01:52 +1100
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 16:13 -0800
Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-05 11:33 +1100
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 21:42 -0800
Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 13:26 +1100
Re: Regular expressions Ben Finney <ben+python@benfinney.id.au> - 2015-11-05 14:07 +1100
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 21:54 -0800
Re: Regular expressions Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2015-11-05 10:14 +0100
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-04 18:02 -0500
Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 11:54 +1100
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-05 10:07 -0500
Re: Regular expressions rurpy@yahoo.com - 2015-11-06 12:46 -0800
Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-03 18:15 +1100
Re: Regular expressions Nick Sarbicki <nick.a.sarbicki@gmail.com> - 2015-11-03 08:43 +0000
Re: Regular expressions rurpy@yahoo.com - 2015-11-03 16:22 -0800
Re: Regular expressions Denis McMahon <denismfmcmahon@gmail.com> - 2015-11-03 12:38 +0000
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 05:53 -0600
Re: Regular expressions Joel Goldstick <joel.goldstick@gmail.com> - 2015-11-03 10:34 -0500
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-03 11:10 -0500
Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-04 03:20 +1100
Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 14:35 +1100
Re: Regular expressions Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2015-11-04 12:41 +0100
Re: Regular expressions Grant Edwards <invalid@invalid.invalid> - 2015-11-03 14:56 +0000
Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-02 20:51 -0700
Re: Regular expressions rurpy@yahoo.com - 2015-11-02 20:23 -0800
Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-02 21:33 -0700
Re: Regular expressions Robin Koch <robin.koch@t-online.de> - 2015-11-03 23:58 +0100
Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-03 10:25 +0100
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 05:50 -0600
Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-03 15:00 +0100
Re: Regular expressions Jussi Piitulainen <harvesting@makes.email.invalid> - 2015-11-03 17:12 +0200
Irregular last line in a text file, was Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-03 16:35 +0100
Re: Irregular last line in a text file, was Re: Regular expressions Jussi Piitulainen <harvesting@makes.email.invalid> - 2015-11-03 18:42 +0200
Re: Irregular last line in a text file, was Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 10:56 -0600
Re: Irregular last line in a text file, was Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 14:39 +1100
Re: Irregular last line in a text file, was Re: Regular expressions Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2015-11-04 10:07 +0000
Re: Irregular last line in a text file, was Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-04 09:33 -0600
Re: Irregular last line in a text file, was Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-03 18:44 +0100
Re: Irregular last line in a text file, was Re: Regular expressions Ian Kelly <ian.g.kelly@gmail.com> - 2015-11-03 11:33 -0700
Re: Irregular last line in a text file, was Re: Regular expressions Ian Kelly <ian.g.kelly@gmail.com> - 2015-11-03 11:39 -0700
Re: Irregular last line in a text file, was Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 13:45 -0600
Re: Irregular last line in a text file, was Re: Regular expressions Grant Edwards <invalid@invalid.invalid> - 2015-11-03 22:15 +0000
csiph-web