


Re: Python's re module and genealogy problem

References <bvr01iFu926U1@mid.individual.net>
Date 2014-06-11 20:09 +0200
Subject Re: Python's re module and genealogy problem
From Vlastimil Brom <vlastimil.brom@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.11018.1402510564.18130.python-list@python.org>



2014-06-11 14:23 GMT+02:00 BrJohan <brjohan@gmail.com>:
> For some genealogical purposes I consider using Python's re module.
>...
>
> Now, my problem: Is there a way to decide whether any two - or more - of
> those regular expressions will match the same string?
>
> Or, stated a little differently:
>
> Can it, for a pair of regular expressions be decided whether at least one
> string matching both of those regular expressions, can be constructed?
> --
> https://mail.python.org/mailman/listinfo/python-list

Hi,
I guess you could reuse one of the available generators for strings
matching a given regular expression, see e.g.:
http://stackoverflow.com/questions/492716/reversing-a-regular-expression-in-python/
for example a pyparsing recipe:
http://stackoverflow.com/questions/492716/reversing-a-regular-expression-in-python/5006339#5006339

which might be general enough for your needs. Of course, you cannot
use unbounded quantifiers (the generated set would be infinite),
backreferences, etc.

Then you can test for identical strings in the generated outputs,
e.g. by collecting them with set(...) and using its intersection method.
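
A minimal sketch of the set-intersection idea, under some assumptions: it
does not use the pyparsing recipe linked above, but a simpler brute-force
stand-in that enumerates every string over a small alphabet up to a maximum
length and keeps those accepted by re.fullmatch. The helper name
matching_strings, the alphabet, the length limit and the two surname
patterns are all made up for illustration.

```python
import itertools
import re


def matching_strings(pattern, alphabet, max_len):
    """Return all strings over `alphabet` (length <= max_len) that fully match `pattern`.

    Brute force: the cost grows as len(alphabet) ** max_len, so keep both small.
    """
    compiled = re.compile(pattern)
    found = set()
    for length in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            candidate = "".join(chars)
            if compiled.fullmatch(candidate):
                found.add(candidate)
    return found


# Hypothetical surname patterns, only for illustration.
first = matching_strings(r"Sm[iy]the?", "Smithye", 6)
second = matching_strings(r"Smy?the?", "Smithye", 6)

# Strings accepted by both patterns.
common = first.intersection(second)
print(sorted(common))
```

A non-empty intersection proves that the two patterns can match the same
string. An empty one only shows that no common string exists within the
chosen alphabet and length limit.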

You might also check the much more powerful regex library:
https://pypi.python.org/pypi/regex

which, among other features, also supports the fuzzy matching mentioned earlier, cf.

>>> import regex
>>> regex.findall(r"\bSm(?:ith){e<3}\b", "Smith Smithe Smyth Smythe Smijth")
['Smith', 'Smithe', 'Smyth', 'Smythe', 'Smijth']
>>>
(but, of course, you will have to be careful with this feature in
order to reduce false positives)

hth,
   vbr



Thread

Python's re module and genealogy problem BrJohan <brjohan@gmail.com> - 2014-06-11 14:23 +0200
  Re: Python's re module and genealogy problem Robert Kern <robert.kern@gmail.com> - 2014-06-11 14:26 +0100
    Re: Python's re module and genealogy problem Mark H Harris <harrismh777@gmail.com> - 2014-06-11 09:08 -0500
  Re: Python's re module and genealogy problem Thomas Rachel <nutznetz-0c1b6768-bfa9-48d5-a470-7603bd3aa915@spamschutz.glglgl.de> - 2014-06-11 15:55 +0200
  Re: Python's re module and genealogy problem Michael Torrie <torriem@gmail.com> - 2014-06-11 09:34 -0600
  Re: Python's re module and genealogy problem Nick Cash <nick.cash@npcinternational.com> - 2014-06-11 16:21 +0000
  Re: Python's re module and genealogy problem Simon Ward <simon@bleah.co.uk> - 2014-06-11 18:21 +0100
  Re: Python's re module and genealogy problem Vlastimil Brom <vlastimil.brom@gmail.com> - 2014-06-11 20:09 +0200
  Re: Python's re module and genealogy problem BrJohan <brjohan@gmail.com> - 2014-06-13 17:17 +0200
    Re: Python's re module and genealogy problem Peter Otten <__peter__@web.de> - 2014-06-13 18:26 +0200
    Re: Python's re module and genealogy problem Dan Sommers <dan@tombstonezero.net> - 2014-06-14 05:14 +0000
  Re: Python's re module and genealogy problem Tony the Tiger <tony@tiger.invalid> - 2014-06-14 08:35 +0000
