

Groups > comp.lang.python > #7738 > unrolled thread

Composing regex from a list

Started by: TheSaint <nobody@nowhere.net.no>
First post: 2011-06-16 20:48 +0800
Last post: 2011-06-16 15:25 +0200
Articles: 4 (3 participants)



Contents

  Composing regex from a list TheSaint <nobody@nowhere.net.no> - 2011-06-16 20:48 +0800
    Re: Composing regex from a list Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-06-16 13:09 +0000
      Re: Composing regex from a list TheSaint <nobody@nowhere.net.no> - 2011-06-17 19:45 +0800
    Re: Composing regex from a list Vlastimil Brom <vlastimil.brom@gmail.com> - 2011-06-16 15:25 +0200

#7738 — Composing regex from a list

From: TheSaint <nobody@nowhere.net.no>
Date: 2011-06-16 20:48 +0800
Subject: Composing regex from a list
Message-ID: <itcu3f$b1k$1@speranza.aioe.org>
Hello,
Is it possible to compile a regex by supplying a list?

lst = ['good', 'brilliant', 'solid']

re.compile(r'^'(any_of_lst))

without going into a *for* loop?

-- 
goto /dev/null



#7740

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2011-06-16 13:09 +0000
Message-ID: <4dfa007f$0$30002$c3e8da3$5496439d@news.astraweb.com>
In reply to: #7738
On Thu, 16 Jun 2011 20:48:46 +0800, TheSaint wrote:

> Hello,
> Is it possible to compile a regex by supplying a list?
> 
> lst = ['good', 'brilliant', 'solid']
> 
> re.compile(r'^'(any_of_lst))
> 
> without going into a *for* loop?


How about this?


import re

def compile_alternatives(*args):
    # Wrap each alternative in its own group, then join the groups with '|'.
    alternatives = '|'.join('(' + s + ')' for s in args)
    return re.compile(alternatives)


>>> x = compile_alternatives('spam', 'ham', 'cheese')
>>> x.search('fried egg and spam').group()
'spam'
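
As a small illustration of what the parentheses in the function above do (a sketch only, not part of the original post): each alternative becomes its own capturing group, so the match object can report which alternative matched. The sample text is made up for the example.

```python
import re

def compile_alternatives(*args):
    # Same idea as above: one capturing group per alternative.
    return re.compile('|'.join('(' + s + ')' for s in args))

x = compile_alternatives('spam', 'ham', 'cheese')
m = x.search('toast with ham')

print(m.group())     # ham
print(m.groups())    # (None, 'ham', None)
print(m.lastindex)   # 2, i.e. the second alternative matched
```

If the group numbers are not needed, non-capturing parentheses `(?:...)` avoid creating them.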



-- 
Steven



#7817

From: TheSaint <nobody@nowhere.net.no>
Date: 2011-06-17 19:45 +0800
Message-ID: <itfeof$ddp$1@speranza.aioe.org>
In reply to: #7740
Steven D'Aprano wrote:

> def compile_alternatives(*args):

Thank you all for these good points. It seems to me that, explicitly or
implicitly, it will take some looping to concatenate the list elements into a
string.
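
To make that point concrete, here is a minimal sketch (added for illustration, using the list from the original question) comparing a hand-written loop with str.join, which performs the same iteration internally. The variable names are hypothetical.

```python
lst = ['good', 'brilliant', 'solid']

# Explicit loop: build the alternation piece by piece.
explicit = ''
for word in lst:
    if explicit:
        explicit += '|'
    explicit += word

# Implicit loop: str.join iterates over the list for us.
implicit = '|'.join(lst)

print(explicit)              # good|brilliant|solid
print(explicit == implicit)  # True
```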

I will look at PyPI later.

-- 
goto /dev/null



#7741

From: Vlastimil Brom <vlastimil.brom@gmail.com>
Date: 2011-06-16 15:25 +0200
Message-ID: <mailman.15.1308230734.1164.python-list@python.org>
In reply to: #7738
2011/6/16 TheSaint <nobody@nowhere.net.no>:
> Hello,
> Is it possible to compile a regex by supplying a list?
>
> lst = ['good', 'brilliant', 'solid']
> re.compile(r'^'(any_of_lst))
>
> without going into a *for* loop?
>

In simple cases, you can just join the list of alternatives on "|" and
incorporate it into the pattern, e.g. inside non-capturing parentheses,
(?: ...); cf.:
>>>
>>> lst= ['good', 'brilliant', 'solid']
>>> import re
>>> re.findall(r"^(?:"+"|".join(lst)+")", u"solid sample text; brilliant QWERT")
[u'solid']
>>>

[I am using findall just to show the result directly; it is not commonly
combined with a leading ^.]
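
For comparison, a short sketch (an illustration added here, assuming the same list and sample text as above) of the more usual way to use a pattern that starts with ^: compile it once and call match(), which returns a match object or None.

```python
import re

lst = ['good', 'brilliant', 'solid']
pattern = re.compile(r"^(?:" + "|".join(lst) + ")")
text = "solid sample text; brilliant QWERT"

# Without re.MULTILINE, ^ matches only at the start of the string,
# so findall can return at most one hit here.
found = pattern.findall(text)

# match() tries the pattern at position 0 and returns a match object or None.
m = pattern.match(text)
first_word = m.group() if m else None

print(found)       # ['solid']
print(first_word)  # solid
```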

However, if the alternative "words" can contain metacharacters like
[ ] | . ? * + ..., you have to call re.escape(...) on each of them
first.
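
A minimal sketch of that escaping step, added for illustration. The helper name compile_any_of, its anchored parameter, and the sample words are hypothetical; only re.escape, re.compile and str.join come from the standard library.

```python
import re

def compile_any_of(words, anchored=True):
    """Hypothetical helper: match any of the literal strings in words."""
    # re.escape neutralises metacharacters such as '.', '+' and '|'.
    body = "|".join(re.escape(w) for w in words)
    prefix = "^" if anchored else ""
    return re.compile(prefix + "(?:" + body + ")")

pattern = compile_any_of(["a.b", "c++", "x|y"])

print(pattern.match("c++ is a language").group())  # c++
print(pattern.match("axb is not matched"))         # None
```

Without the escaping, "a.b" would also match "axb", and "c++" would be rejected as an invalid pattern.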

Or you can use a newer regex implementation with more features:
http://pypi.python.org/pypi/regex

It was just provisionally enhanced with an option for exactly this use case;
cf. "Additional features: Named lists" on the above page. In this case:

>>> import regex # http://pypi.python.org/pypi/regex
>>> regex.findall(r"^\L<options>", u"solid sample text; brilliant QWERT", options=lst)
[u'solid']
>>>

hth,
  vbr


