Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #98200

Re: Regular expressions

Path csiph.com!fu-berlin.de!uni-berlin.de!not-for-mail
From Tim Chase <python.list@tim.thechases.com>
Newsgroups comp.lang.python
Subject Re: Regular expressions
Date Tue, 3 Nov 2015 21:12:08 -0600
Lines 54
Message-ID <mailman.1.1446607194.16136.python-list@python.org> (permalink)
References <662g3blobme52hfoududj27err185v2npm@4ax.com> <mailman.0.1446519578.8789.python-list@python.org> <hp9g3b9hsn06edb0po8bduegjqkmpo4p8n@4ax.com> <mailman.3.1446523111.8789.python-list@python.org> <d39290cf-cb26-470f-a987-2f71e3860f97@googlegroups.com> <mailman.5.1446525488.8789.python-list@python.org> <bb15756d-7181-421d-835e-b2fbfc1c1774@googlegroups.com> <563967A7.4060308@gmail.com>
Mime-Version 1.0
Content-Type text/plain; charset=US-ASCII
Content-Transfer-Encoding 7bit
X-Trace news.uni-berlin.de aDPNVUzJMb558vd0aC3bvg9n+wfWlNjW6sbooq2w0RjQ==
Return-Path <python.list@tim.thechases.com>
X-Original-To python-list@python.org
Delivered-To python-list@mail.python.org
X-Spam-Status OK 0.001
X-Spam-Evidence '*H*': 1.00; '*S*': 0.00; "'')": 0.07; 'expressions': 0.07; 'postgresql': 0.07; '(aka': 0.09; 'grep': 0.09; 'indexes': 0.09; 'input,': 0.09; 'index': 0.13; 'output': 0.13; 'missed': 0.15; '"grep': 0.16; '-tkc': 0.16; '...)': 0.16; 'expression.': 0.16; 'expressions,': 0.16; 'expressions.': 0.16; 'from:addr:python.list': 0.16; 'from:addr:tim.thechases.com': 0.16; 'from:name:tim chase': 0.16; 'received:io': 0.16; 'received:psf.io': 0.16; 'subject:Regular': 0.16; 'subject:expressions': 0.16; 'wrote:': 0.16; 'otherwise,': 0.20; 'default,': 0.22; 'killer': 0.22; 'select': 0.23; '(or': 0.23; 'insert': 0.23; 'header:In-Reply-To:1': 0.24; "i've": 0.25; 'helpful': 0.27; 'supported': 0.27; 'followed': 0.27; 'least': 0.27; "skip:' 10": 0.28; 'values': 0.28; 'regular': 0.29; 'subset': 0.29; 'allows': 0.30; 'creating': 0.30; 'certainly': 0.30; 'e.g.': 0.30; 'query': 0.30; 'fixed': 0.31; 'table': 0.32; 'michael': 0.33; 'know.': 0.34; 'skip:c 30': 0.35; 'but': 0.36; 'should': 0.36; 'lines': 0.36; '(and': 0.36; 'mode': 0.36; 'to:addr:python-list': 0.36; 'subject:: ': 0.37; 'received:10': 0.37; 'charset:us-ascii': 0.37; 'things': 0.38; 'doing': 0.38; 'whatever': 0.39; 'does': 0.39; "didn't": 0.39; 'rather': 0.39; 'to:addr:python.org': 0.40; 'where': 0.40; 'still': 0.40; 'hope': 0.61; 'entire': 0.61; 'default': 0.61; 'research': 0.62; 'skip:n 10': 0.62; 'received:50': 0.66; 'engines.': 0.93; 'contacts': 0.97
X-Sender-Id wwwh|x-authuser|tim@thechases.com
X-Sender-Id wwwh|x-authuser|tim@thechases.com
X-MC-Relay Neutral
X-MailChannels-SenderId wwwh|x-authuser|tim@thechases.com
X-MailChannels-Auth-Id wwwh
X-MC-Loop-Signature 1446606809105:2614896613
X-MC-Ingress-Time 1446606809105
In-Reply-To <563967A7.4060308@gmail.com>
X-Mailer Claws Mail 3.11.1 (GTK+ 2.24.25; x86_64-pc-linux-gnu)
X-AuthUser tim@thechases.com
X-BeenThere python-list@python.org
X-Mailman-Version 2.1.20+
Precedence list
List-Id General discussion list for the Python programming language <python-list.python.org>
List-Unsubscribe <https://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe>
List-Archive <http://mail.python.org/pipermail/python-list/>
List-Post <mailto:python-list@python.org>
List-Help <mailto:python-list-request@python.org?subject=help>
List-Subscribe <https://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe>
Xref csiph.com comp.lang.python:98200

Show key headers only | View raw


On 2015-11-03 19:04, Michael Torrie wrote:
> Grep can use regular expressions (and I do so with it regularly),
> but it's default mode is certainly not regular expressions, and it
> is still very powerful.

I suspect you're thinking of `fgrep` (AKA "grep -F") which uses fixed
strings rather than regular expressions.  By default, `grep` certainly
does use regular expressions:

  tim@linux$ seq 5 | grep "1*"
  tim@bsd$ jot 5 | grep "1*"

will output the entire input, not just lines containing a "1"
followed by an asterisk.

> I've never used regular expressions in a database query language;
> until this moment I didn't know any supported such things in their
> queries.  Good to know.  How you would index for regular
> expressions in queries I don't know.

At least PostgreSQL allows for creating indexes on a particular
regular expression.  E.g. (shooting from the hip so I might have
missed something):

  CREATE TABLE contacts (
   -- ...
   phonenumber VARCHAR(15),
   -- ...
   )
  CREATE INDEX contacts_just_phone_digits_idx
   ON contacts((regexp_replace(phonenumber, '[^0-9]', '')));

  INSERT INTO contacts(..., phonenumber, ...)
   VALUES (..., '800-555-1212', ...)

  SELECT *
  FROM contacts
  WHERE -- should use contacts_just_phone_digits_idx
   regexp_replace(phonenumber, '[^0-9]', '') = '8005551212';

It's not as helpful as one might hope because you're stuck using a
fixed regexp rather than an arbitrary regexp, but if you have a
particular regexp you search for frequently, you can index it.
Otherwise, you'd be doing full table-scans (or at least a full scan
of whatever subset the active non-regexp'ed index yields) which can
be pretty killer on performance.

You'd have to research on other DB engines.

-tkc



Back to comp.lang.python | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-02 20:09 -0500
  Re: Regular expressions MRAB <python@mrabarnett.plus.com> - 2015-11-03 01:19 +0000
    Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-02 22:17 -0500
  Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-02 20:42 -0600
    Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-02 22:17 -0500
      Re: Regular expressions Joel Goldstick <joel.goldstick@gmail.com> - 2015-11-02 22:58 -0500
        Re: Regular expressions rurpy@yahoo.com - 2015-11-02 20:23 -0800
          Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-02 21:38 -0700
            Re: Regular expressions rurpy@yahoo.com - 2015-11-03 16:33 -0800
              Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-03 19:04 -0700
                Re: Regular expressions Dan Sommers <dan@tombstonezero.net> - 2015-11-04 02:55 +0000
                Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 14:23 +1100
                Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-03 20:47 -0700
                Re: Regular expressions Grant Edwards <invalid@invalid.invalid> - 2015-11-04 13:27 +0000
                Re: Regular expressions Nobody <nobody@nowhere.invalid> - 2015-11-04 05:05 +0000
                Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-04 09:57 +0100
                Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 13:28 +1100
                Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-04 20:48 -0600
                Re: Regular expressions Ben Finney <ben+python@benfinney.id.au> - 2015-11-05 14:03 +1100
                Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-05 09:33 +0100
                Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 23:05 +1100
                Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-05 08:00 -0600
                Re: Regular expressions Albert van der Horst <albert@spenarnc.xs4all.nl> - 2015-11-05 13:39 +0000
                Re: Regular expressions Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-11-04 08:00 -0500
                Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-04 08:13 -0700
                Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-04 18:00 -0500
                Re: Regular expressions rurpy@yahoo.com - 2015-11-04 16:24 -0800
                Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 13:24 +1100
                Re: Regular expressions rurpy@yahoo.com - 2015-11-04 21:59 -0800
                Re: Regular expressions Christian Gollwitzer <auriocus@gmx.de> - 2015-11-05 09:18 +0100
                Re: Regular expressions rurpy@yahoo.com - 2015-11-06 11:52 -0800
                Re: Regular expressions Christian Gollwitzer <auriocus@gmx.de> - 2015-11-06 21:36 +0100
                Re: Regular expressions Larry Martell <larry.martell@gmail.com> - 2015-11-06 15:42 -0500
                Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-05 11:34 +1100
                Re: Regular expressions rurpy@yahoo.com - 2015-11-04 22:27 -0800
                Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-04 09:42 -0600
                Re: Regular expressions Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2015-11-05 20:55 +1300
                Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-05 19:06 +1100
                What does “grep” stand for? (was: Regular expressions) Ben Finney <ben+python@benfinney.id.au> - 2015-11-05 05:24 +1100
                Re: What does “grep” stand for? Christian Gollwitzer <auriocus@gmx.de> - 2015-11-04 20:38 +0100
                Re: What does “grep” stand for? Chris Angelico <rosuav@gmail.com> - 2015-11-05 11:42 +1100
                Re: What does “grep” stand for? Christian Gollwitzer <auriocus@gmx.de> - 2015-11-05 08:32 +0100
                Re: What does “grep” stand for? Chris Angelico <rosuav@gmail.com> - 2015-11-05 19:00 +1100
                Re: What does “grep” stand for? Random832 <random832@fastmail.com> - 2015-11-05 10:19 -0500
                Re: What does “grep” stand for? Grant Edwards <invalid@invalid.invalid> - 2015-11-05 18:29 +0000
                Re: What does “grep” stand for? Random832 <random832@fastmail.com> - 2015-11-05 14:56 -0500
                Re: What does “grep” stand for? Grant Edwards <invalid@invalid.invalid> - 2015-11-05 20:19 +0000
                Re: What does “grep” stand for? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-11-05 20:18 -0500
                Re: What does “grep” stand for? Larry Hudson <orgnut@yahoo.com> - 2015-11-05 19:36 -0800
                Re: What does “grep” stand for? Dan Sommers <dan@tombstonezero.net> - 2015-11-06 05:31 +0000
                Re: What does “grep” stand for? William Ray Wing <wrw@mac.com> - 2015-11-06 08:25 -0500
                Re: What does “grep” stand for? Larry Hudson <orgnut@yahoo.com> - 2015-11-06 19:21 -0800
                Re: What does ???grep??? stand for? Grant Edwards <invalid@invalid.invalid> - 2015-11-06 14:15 +0000
                Re: What does ???grep??? stand for? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2015-11-06 20:03 -0500
                Re: What does “grep” stand for? (was: Regular expressions) Tim Chase <python.list@tim.thechases.com> - 2015-11-04 13:05 -0600
                Re: Regular expressions Terry Reedy <tjreedy@udel.edu> - 2015-11-04 18:08 -0500
                Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-04 18:29 -0500
              Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 21:12 -0600
              Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-04 14:26 +1100
              Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 14:48 +1100
                Re: Regular expressions Christian Gollwitzer <auriocus@gmx.de> - 2015-11-04 08:21 +0100
                Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 19:47 +1100
                Re: Regular expressions rurpy@yahoo.com - 2015-11-04 06:43 -0800
                Re: Regular expressions rurpy@yahoo.com - 2015-11-04 06:38 -0800
                Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-05 01:52 +1100
                Re: Regular expressions rurpy@yahoo.com - 2015-11-04 16:13 -0800
                Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-05 11:33 +1100
                Re: Regular expressions rurpy@yahoo.com - 2015-11-04 21:42 -0800
                Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 13:26 +1100
                Re: Regular expressions Ben Finney <ben+python@benfinney.id.au> - 2015-11-05 14:07 +1100
                Re: Regular expressions rurpy@yahoo.com - 2015-11-04 21:54 -0800
                Re: Regular expressions Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2015-11-05 10:14 +0100
                Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-04 18:02 -0500
                Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-05 11:54 +1100
                Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-05 10:07 -0500
                Re: Regular expressions rurpy@yahoo.com - 2015-11-06 12:46 -0800
          Re: Regular expressions Steven D'Aprano <steve@pearwood.info> - 2015-11-03 18:15 +1100
            Re: Regular expressions Nick Sarbicki <nick.a.sarbicki@gmail.com> - 2015-11-03 08:43 +0000
            Re: Regular expressions rurpy@yahoo.com - 2015-11-03 16:22 -0800
      Re: Regular expressions Denis McMahon <denismfmcmahon@gmail.com> - 2015-11-03 12:38 +0000
      Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 05:53 -0600
      Re: Regular expressions Joel Goldstick <joel.goldstick@gmail.com> - 2015-11-03 10:34 -0500
        Re: Regular expressions Seymore4Head <Seymore4Head@Hotmail.invalid> - 2015-11-03 11:10 -0500
          Re: Regular expressions Chris Angelico <rosuav@gmail.com> - 2015-11-04 03:20 +1100
            Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 14:35 +1100
              Re: Regular expressions Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2015-11-04 12:41 +0100
    Re: Regular expressions Grant Edwards <invalid@invalid.invalid> - 2015-11-03 14:56 +0000
  Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-02 20:51 -0700
    Re: Regular expressions rurpy@yahoo.com - 2015-11-02 20:23 -0800
      Re: Regular expressions Michael Torrie <torriem@gmail.com> - 2015-11-02 21:33 -0700
      Re: Regular expressions Robin Koch <robin.koch@t-online.de> - 2015-11-03 23:58 +0100
  Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-03 10:25 +0100
  Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 05:50 -0600
  Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-03 15:00 +0100
    Re: Regular expressions Jussi Piitulainen <harvesting@makes.email.invalid> - 2015-11-03 17:12 +0200
      Irregular last line in a text file, was Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-03 16:35 +0100
        Re: Irregular last line in a text file, was Re: Regular expressions Jussi Piitulainen <harvesting@makes.email.invalid> - 2015-11-03 18:42 +0200
      Re: Irregular last line in a text file, was Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 10:56 -0600
        Re: Irregular last line in a text file, was Re: Regular expressions Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-11-04 14:39 +1100
          Re: Irregular last line in a text file, was Re: Regular expressions Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2015-11-04 10:07 +0000
          Re: Irregular last line in a text file, was Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-04 09:33 -0600
      Re: Irregular last line in a text file, was Re: Regular expressions Peter Otten <__peter__@web.de> - 2015-11-03 18:44 +0100
      Re: Irregular last line in a text file, was Re: Regular expressions Ian Kelly <ian.g.kelly@gmail.com> - 2015-11-03 11:33 -0700
      Re: Irregular last line in a text file, was Re: Regular expressions Ian Kelly <ian.g.kelly@gmail.com> - 2015-11-03 11:39 -0700
      Re: Irregular last line in a text file, was Re: Regular expressions Tim Chase <python.list@tim.thechases.com> - 2015-11-03 13:45 -0600
        Re: Irregular last line in a text file, was Re: Regular expressions Grant Edwards <invalid@invalid.invalid> - 2015-11-03 22:15 +0000

csiph-web