Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #98300
| Path | csiph.com!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail |
|---|---|
| From | Tim Chase <python.list@tim.thechases.com> |
| Newsgroups | comp.lang.python |
| Subject | Re: Regular expressions |
| Date | Thu, 5 Nov 2015 08:00:00 -0600 |
| Lines | 54 |
| Message-ID | <mailman.52.1446732098.16136.python-list@python.org> (permalink) |
| References | <662g3blobme52hfoududj27err185v2npm@4ax.com> <mailman.0.1446519578.8789.python-list@python.org> <hp9g3b9hsn06edb0po8bduegjqkmpo4p8n@4ax.com> <mailman.3.1446523111.8789.python-list@python.org> <d39290cf-cb26-470f-a987-2f71e3860f97@googlegroups.com> <mailman.5.1446525488.8789.python-list@python.org> <bb15756d-7181-421d-835e-b2fbfc1c1774@googlegroups.com> <mailman.0.1446602668.16136.python-list@python.org> <n1bs2g$376$1@dont-email.me> <56397a18$0$11094$c3e8da3@news.astraweb.com> <56397FC6.9040700@gmail.com> <mailman.6.1446627436.16136.python-list@python.org> <563abee1$0$1614$c3e8da3$5496439d@news.astraweb.com> <mailman.48.1446712433.16136.python-list@python.org> <563b45fa$0$1593$c3e8da3$5496439d@news.astraweb.com> |
| Mime-Version | 1.0 |
| Content-Type | text/plain; charset=US-ASCII |
| Content-Transfer-Encoding | 7bit |
| X-Trace | news.uni-berlin.de KKbNiogcHvLU+taoGPW/rAEx9Yscn6BSiKqNUfKtrCQA== |
| Return-Path | <python.list@tim.thechases.com> |
| X-Original-To | python-list@python.org |
| Delivered-To | python-list@mail.python.org |
| X-Spam-Status | OK 0.015 |
| X-Spam-Evidence | '*H*': 0.97; '*S*': 0.00; 'string.': 0.04; 'builtin': 0.07; 'literal': 0.09; 'oh,': 0.09; 'underscore': 0.09; '-tkc': 0.16; 'fnmatch': 0.16; 'from:addr:python.list': 0.16; 'from:addr:tim.thechases.com': 0.16; 'from:name:tim chase': 0.16; 'received:io': 0.16; 'received:psf.io': 0.16; 'simplicity,': 0.16; 'subject:Regular': 0.16; 'subject:expressions': 0.16; 'verbose': 0.16; 'wrote:': 0.16; 'string': 0.17; 'test.': 0.18; 'cloud': 0.20; 'extension': 0.20; 'code.': 0.23; "haven't": 0.24; 'header :In-Reply-To:1': 0.24; 'module': 0.25; 'followed': 0.27; 'finally,': 0.27; '1-3': 0.29; 'that.': 0.30; 'too.': 0.30; 'code': 0.30; 'another': 0.32; "d'aprano": 0.33; 'steven': 0.33; 'file': 0.34; 'skip:. 20': 0.35; 'express': 0.35; 'sometimes': 0.35; 'but': 0.36; 'should': 0.36; 'modules': 0.36; 'to:addr :python-list': 0.36; 'subject:: ': 0.37; 'received:10': 0.37; 'charset:us-ascii': 0.37; 'presence': 0.38; 'end': 0.39; 'test': 0.39; 'rather': 0.39; 'to:addr:python.org': 0.40; 'back': 0.62; 'course': 0.62; 'more': 0.63; 'capture': 0.66; 'received:50': 0.66; 'afraid': 0.67; 'letters': 0.67; 'add-on': 0.84; 'by*': 0.84; 'clarity.': 0.84; 'clearer': 0.84; 'commenting': 0.84; 'touched': 0.84 |
| X-Sender-Id | wwwh|x-authuser|tim@thechases.com |
| X-Sender-Id | wwwh|x-authuser|tim@thechases.com |
| X-MC-Relay | Neutral |
| X-MailChannels-SenderId | wwwh|x-authuser|tim@thechases.com |
| X-MailChannels-Auth-Id | wwwh |
| X-MC-Loop-Signature | 1446732082911:1436015162 |
| X-MC-Ingress-Time | 1446732082910 |
| In-Reply-To | <563b45fa$0$1593$c3e8da3$5496439d@news.astraweb.com> |
| X-Mailer | Claws Mail 3.11.1 (GTK+ 2.24.25; x86_64-pc-linux-gnu) |
| X-AuthUser | tim@thechases.com |
| X-BeenThere | python-list@python.org |
| X-Mailman-Version | 2.1.20+ |
| Precedence | list |
| List-Id | General discussion list for the Python programming language <python-list.python.org> |
| List-Unsubscribe | <https://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe> |
| List-Archive | <http://mail.python.org/pipermail/python-list/> |
| List-Post | <mailto:python-list@python.org> |
| List-Help | <mailto:python-list-request@python.org?subject=help> |
| List-Subscribe | <https://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe> |
| Xref | csiph.com comp.lang.python:98300 |
Show key headers only | View raw
On 2015-11-05 23:05, Steven D'Aprano wrote:
> Oh the shame, I knew that. Somehow I tangled myself in a knot,
> thinking that it had to be 1 *followed by* zero or more characters.
> But of course it's not a glob, it's a regex.
But that's a good reminder of fnmatch/glob modules too. Sometimes
all you need is to express a simple glob, in which case using a
regexp can cloud the clarity.
The overarching principle is to go for clarity & simplicity, rather
than favoring built-ins/glob/regex/parser modules all the time.
Want to test for presence in a string? Just use the builtin "a in b"
test. At the beginning/end? Use .startswith()/.endswith() for
clarity. Need to check if a string is purely
digits/alpha/alphanumerics/etc? Use the
string .is{alnum,alpha,decimal,digit,identifier,lower,numeric,printable,space,title,upper}
methods on the string.
For simple wild-carding, use the fnmatch module to do simple
globbing.
For more complex pattern matching, you've got regexps.
Finally, for occasions when you're searching for repeated/nested
structures, using an add-on module like pyparsing will give you
clearer code.
Oh, and with regexps, people should be less afraid of verbose
multi-line strings with commenting
r = re.compile(r"""
^ # start of the string
(?P<year>\d{4}) # capture 4 digits
- # a literal dash
(?P<month>\d{1,2}) # capture 1-2 digits
- # another literal dash
(?P<day>\d{1,2}) # capture 1-2 digits
_ # a literal underscore
(?P<accountnum> # capture the account-number
[A-Z]{1,3} # 1-3 letters
\d+ # followed by 1+ digits
)
\.txt # the extension of the file (ignored)
$ # the end of the string
""", re.VERBOSE)
They are a LOT easier to come back to if you haven't touched the code
for a year.
-tkc
Back to comp.lang.python | Previous | Next — Previous in thread | Next in thread | Find similar | Unroll thread
Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-02 20:09 -0500
Re: Regular expressions MRAB <python@mrabarnett.plus.com> - 2015-11-03 01:19 +0000
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-02 22:17 -0500
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-02 20:42 -0600
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-02 22:17 -0500
Re: Regular expressions Joel Goldstick <joel.goldstick@gmail.com> - 2015-11-02 22:58 -0500
Re: Regular expressions rurpy@yahoo.com - 2015-11-02 20:23 -0800
Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-02 21:38 -0700
Re: Regular expressions rurpy@yahoo.com - 2015-11-03 16:33 -0800
Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-03 19:04 -0700
Re: Regular expressions Dan Sommers <dan@tombstonezero.net> - 2015-11-04 02:55 +0000
Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 14:23 +1100
Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-03 20:47 -0700
Re: Regular expressions Grant Edwards <invalid@invalid.invalid> - 2015-11-04 13:27 +0000
Re: Regular expressions Nobody <nobody@nowhere.invalid> - 2015-11-04 05:05 +0000
Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-04 09:57 +0100
Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 13:28 +1100
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-04 20:48 -0600
Re: Regular expressions Ben Finney <ben+python@benfinney.id.au> - 2015-11-05 14:03 +1100
Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-05 09:33 +0100
Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 23:05 +1100
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-05 08:00 -0600
Re: Regular expressions Albert van der Horst <albert@spenarnc.xs4all.nl> - 2015-11-05 13:39 +0000
Re: Regular expressions Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-11-04 08:00 -0500
Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-04 08:13 -0700
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-04 18:00 -0500
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 16:24 -0800
Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 13:24 +1100
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 21:59 -0800
Re: Regular expressions Christian Gollwitzer <auriocus@gmx.de> - 2015-11-05 09:18 +0100
Re: Regular expressions rurpy@yahoo.com - 2015-11-06 11:52 -0800
Re: Regular expressions Christian Gollwitzer <auriocus@gmx.de> - 2015-11-06 21:36 +0100
Re: Regular expressions Larry Martell <larry.martell@gmail.com> - 2015-11-06 15:42 -0500
Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-05 11:34 +1100
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 22:27 -0800
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-04 09:42 -0600
Re: Regular expressions Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2015-11-05 20:55 +1300
Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-05 19:06 +1100
What does “grep” stand for? (was: Regular expressions) Ben Finney <ben+python@benfinney.id.au> - 2015-11-05 05:24 +1100
Re: What does “grep” stand for? Christian Gollwitzer <auriocus@gmx.de> - 2015-11-04 20:38 +0100
Re: What does “grep” stand for? Chris Angelico <rosuav@gmail.com> - 2015-11-05 11:42 +1100
Re: What does “grep” stand for? Christian Gollwitzer <auriocus@gmx.de> - 2015-11-05 08:32 +0100
Re: What does “grep” stand for? Chris Angelico <rosuav@gmail.com> - 2015-11-05 19:00 +1100
Re: What does “grep” stand for? Random832 <random832@fastmail.com> - 2015-11-05 10:19 -0500
Re: What does “grep” stand for? Grant Edwards <invalid@invalid.invalid> - 2015-11-05 18:29 +0000
Re: What does “grep” stand for? Random832 <random832@fastmail.com> - 2015-11-05 14:56 -0500
Re: What does “grep” stand for? Grant Edwards <invalid@invalid.invalid> - 2015-11-05 20:19 +0000
Re: What does “grep” stand for? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-11-05 20:18 -0500
Re: What does “grep” stand for? Larry Hudson <orgnut@yahoo.com> - 2015-11-05 19:36 -0800
Re: What does “grep” stand for? Dan Sommers <dan@tombstonezero.net> - 2015-11-06 05:31 +0000
Re: What does “grep” stand for? William Ray Wing <wrw@mac.com> - 2015-11-06 08:25 -0500
Re: What does “grep” stand for? Larry Hudson <orgnut@yahoo.com> - 2015-11-06 19:21 -0800
Re: What does ???grep??? stand for? Grant Edwards <invalid@invalid.invalid> - 2015-11-06 14:15 +0000
Re: What does ???grep??? stand for? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-11-06 20:03 -0500
Re: What does “grep” stand for? (was: Regular expressions) Tim Chase <python.list@tim.thechases.com> - 2015-11-04 13:05 -0600
Re: Regular expressions Terry Reedy <tjreedy@udel.edu> - 2015-11-04 18:08 -0500
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-04 18:29 -0500
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 21:12 -0600
Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-04 14:26 +1100
Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 14:48 +1100
Re: Regular expressions Christian Gollwitzer <auriocus@gmx.de> - 2015-11-04 08:21 +0100
Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 19:47 +1100
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 06:43 -0800
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 06:38 -0800
Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-05 01:52 +1100
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 16:13 -0800
Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-05 11:33 +1100
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 21:42 -0800
Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 13:26 +1100
Re: Regular expressions Ben Finney <ben+python@benfinney.id.au> - 2015-11-05 14:07 +1100
Re: Regular expressions rurpy@yahoo.com - 2015-11-04 21:54 -0800
Re: Regular expressions Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2015-11-05 10:14 +0100
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-04 18:02 -0500
Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 11:54 +1100
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-05 10:07 -0500
Re: Regular expressions rurpy@yahoo.com - 2015-11-06 12:46 -0800
Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-03 18:15 +1100
Re: Regular expressions Nick Sarbicki <nick.a.sarbicki@gmail.com> - 2015-11-03 08:43 +0000
Re: Regular expressions rurpy@yahoo.com - 2015-11-03 16:22 -0800
Re: Regular expressions Denis McMahon <denismfmcmahon@gmail.com> - 2015-11-03 12:38 +0000
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 05:53 -0600
Re: Regular expressions Joel Goldstick <joel.goldstick@gmail.com> - 2015-11-03 10:34 -0500
Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-03 11:10 -0500
Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-04 03:20 +1100
Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 14:35 +1100
Re: Regular expressions Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2015-11-04 12:41 +0100
Re: Regular expressions Grant Edwards <invalid@invalid.invalid> - 2015-11-03 14:56 +0000
Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-02 20:51 -0700
Re: Regular expressions rurpy@yahoo.com - 2015-11-02 20:23 -0800
Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-02 21:33 -0700
Re: Regular expressions Robin Koch <robin.koch@t-online.de> - 2015-11-03 23:58 +0100
Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-03 10:25 +0100
Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 05:50 -0600
Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-03 15:00 +0100
Re: Regular expressions Jussi Piitulainen <harvesting@makes.email.invalid> - 2015-11-03 17:12 +0200
Irregular last line in a text file, was Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-03 16:35 +0100
Re: Irregular last line in a text file, was Re: Regular expressions Jussi Piitulainen <harvesting@makes.email.invalid> - 2015-11-03 18:42 +0200
Re: Irregular last line in a text file, was Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 10:56 -0600
Re: Irregular last line in a text file, was Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 14:39 +1100
Re: Irregular last line in a text file, was Re: Regular expressions Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2015-11-04 10:07 +0000
Re: Irregular last line in a text file, was Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-04 09:33 -0600
Re: Irregular last line in a text file, was Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-03 18:44 +0100
Re: Irregular last line in a text file, was Re: Regular expressions Ian Kelly <ian.g.kelly@gmail.com> - 2015-11-03 11:33 -0700
Re: Irregular last line in a text file, was Re: Regular expressions Ian Kelly <ian.g.kelly@gmail.com> - 2015-11-03 11:39 -0700
Re: Irregular last line in a text file, was Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 13:45 -0600
Re: Irregular last line in a text file, was Re: Regular expressions Grant Edwards <invalid@invalid.invalid> - 2015-11-03 22:15 +0000
csiph-web