Path: csiph.com!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From: Chris Angelico <rosuav@gmail.com>
Newsgroups: comp.lang.python
Subject: Re: Regular expressions
Date: Wed, 4 Nov 2015 14:26:42 +1100
Lines: 23
Message-ID: <mailman.2.1446607605.16136.python-list@python.org>
References: <662g3blobme52hfoududj27err185v2npm@4ax.com> <mailman.0.1446519578.8789.python-list@python.org> <hp9g3b9hsn06edb0po8bduegjqkmpo4p8n@4ax.com> <mailman.3.1446523111.8789.python-list@python.org> <d39290cf-cb26-470f-a987-2f71e3860f97@googlegroups.com> <mailman.5.1446525488.8789.python-list@python.org> <bb15756d-7181-421d-835e-b2fbfc1c1774@googlegroups.com> <563967A7.4060308@gmail.com> <20151103211208.2a7ec561@bigbox.christie.dr>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
In-Reply-To: <20151103211208.2a7ec561@bigbox.christie.dr>
Precedence: list
Xref: csiph.com comp.lang.python:98202

On Wed, Nov 4, 2015 at 2:12 PM, Tim Chase <python.list@tim.thechases.com> wrote:
> It's not as helpful as one might hope because you're stuck using a
> fixed regexp rather than an arbitrary regexp, but if you have a
> particular regexp you search for frequently, you can index it.
> Otherwise, you'd be doing full table-scans (or at least a full scan
> of whatever subset the active non-regexp'ed index yields) which can
> be pretty killer on performance.

If the regex is anchored at the start of the string and begins with a
literal prefix, an ordinary index can generally save at least some
effort. Otherwise, you're relying on some kind of alternate indexing
style, such as:

http://www.postgresql.org/docs/current/static/pgtrgm.html

which specifically mentions regex searches as being indexable.
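
To give a feel for why trigrams make a regex indexable, here's a toy
sketch in Python. It is not how pg_trgm is implemented (as far as I
know the real thing also pads and case-folds words, and works out the
required trigrams from the regex by itself). The names trigrams,
build_index and search are made up for illustration, and the caller
has to pass in a literal that every match must contain.

```python
import re
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(rows):
    """Map each trigram to the set of row numbers that contain it."""
    index = defaultdict(set)
    for row_id, text in enumerate(rows):
        for tri in trigrams(text):
            index[tri].add(row_id)
    return index


def search(rows, index, pattern, required_literal):
    """Regex search that only runs the regex on candidate rows.

    required_literal is an assumption of this sketch: a string that
    every match of pattern must contain, supplied by the caller.
    """
    needed = trigrams(required_literal)
    if needed:
        # A matching row must contain every trigram of the literal.
        candidates = set.intersection(
            *(index.get(tri, set()) for tri in needed)
        )
    else:
        # Literal shorter than 3 characters: nothing to look up,
        # so fall back to scanning every row.
        candidates = set(range(len(rows)))
    rx = re.compile(pattern)
    # The index only narrows things down; the regex still has the
    # final say on each candidate.
    return [rows[i] for i in sorted(candidates) if rx.search(rows[i])]


rows = [
    "regular expressions",
    "irregular verbs",
    "full table scan",
    "regexp index",
]
index = build_index(rows)
print(search(rows, index, r"\breg(ular|exp)\b", "reg"))
```

The "full table scan" row never gets near the regex engine, while
"irregular verbs" is a false positive from the index that the regex
then rejects.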

Some more info, including 'explain' results:

http://www.depesz.com/2013/04/10/waiting-for-9-3-support-indexing-of-regular-expression-searches-in-contribpg_trgm/

But this kind of thing isn't widely supported across databases.
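
The anchored case, on the other hand, only needs an ordered index,
which pretty much everything has. A rough sketch of the idea, using a
sorted list and bisect as a stand-in for the index: find the range of
keys that share the literal prefix, then run the regex only on those.
The function names are made up, the prefix is passed in by hand (a
real query planner would extract it from the pattern), and plain code
point ordering is assumed.

```python
import bisect
import re


def prefix_range(sorted_keys, prefix):
    """Return (lo, hi) such that sorted_keys[lo:hi] all start with prefix.

    Assumes prefix is non-empty and its last character is not the
    highest possible code point.
    """
    lo = bisect.bisect_left(sorted_keys, prefix)
    # Smallest string that sorts after everything starting with
    # prefix: same string with its last character bumped by one.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    hi = bisect.bisect_left(sorted_keys, upper)
    return lo, hi


def anchored_search(sorted_keys, prefix, pattern):
    """Apply pattern only to keys inside the prefix range."""
    rx = re.compile(pattern)
    lo, hi = prefix_range(sorted_keys, prefix)
    return [key for key in sorted_keys[lo:hi] if rx.match(key)]


keys = sorted(["apple", "apricot", "banana", "bandana", "band", "cherry"])
print(anchored_search(keys, "ban", r"^ban[a-z]*a$"))
```

Two binary searches replace the scan, and the regex only has to look
at the three keys starting with "ban".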

ChrisA