Re: Python's re module and genealogy problem

Date 2014-06-11 09:34 -0600
From Michael Torrie <torriem@gmail.com>
Subject Re: Python's re module and genealogy problem
References <bvr01iFu926U1@mid.individual.net>
Newsgroups comp.lang.python
Message-ID <mailman.11012.1402500909.18130.python-list@python.org> (permalink)


On 06/11/2014 06:23 AM, BrJohan wrote:
> For some genealogical purposes I consider using Python's re module.
> 
> Rather many names can be spelled in a number of similar ways, and in 
> order to match names even if they are spelled differently, I will build 
> regular expressions, each of which is supposed to match a number of 
> similar names.
> 
> I guess that there will be a few hundred such regular expressions 
> covering most popular names.
> 
> Now, my problem: Is there a way to decide whether any two - or more - of 
> those regular expressions will match the same string?
> 
> Or, stated a little differently:
> 
> Can it, for a pair of regular expressions be decided whether at least 
> one string matching both of those regular expressions, can be constructed?
> 
> If it is possible to make such a decision, then how? Anyone aware of an 
> algorithm for this?

You might want to search for fuzzy matching algorithms. One long-standing
example is Soundex, a phonetic algorithm that generates a fuzzy
fingerprint for a word so that differences in spelling are hidden.
Unfortunately, such an algorithm is language dependent. The general
problem you are trying to solve, matching names across spelling
variants, is one of those very hard problems in computing and math.
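
To give an idea of what such a fingerprint looks like, here is a minimal
sketch of the common American Soundex variant. The function name
soundex and the table name _CODES are my own, not anything from the
standard library, and the sketch assumes names written in plain ASCII
letters: accented or non-Latin characters are simply dropped, which is
exactly the language dependence mentioned above.

```python
# Consonant groups of American Soundex. Vowels and h, w, y carry no digit.
_CODES = {}
for digit, letters in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for ch in letters:
        _CODES[ch] = str(digit)


def soundex(name):
    """Return a 4-character Soundex code, or "" if name has no ASCII letters."""
    # Assumption: only a-z are considered; anything else is ignored.
    letters = [ch for ch in name.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""

    result = letters[0].upper()
    prev = _CODES.get(letters[0], "")  # the first letter's code also counts
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code after it is kept

    # Pad with zeros and truncate to one letter plus three digits.
    return (result + "000")[:4]


if __name__ == "__main__":
    for n in ("Johansson", "Johanson", "Johansen", "Robert", "Rupert"):
        print(n, soundex(n))
```

Names with the same code can then be grouped with a dict keyed on
soundex(name), which sidesteps the question of whether two hand-written
regular expressions overlap. Whether the grouping is good enough for
non-English names is something you would have to check on your own data.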

Thread

Python's re module and genealogy problem BrJohan <brjohan@gmail.com> - 2014-06-11 14:23 +0200
  Re: Python's re module and genealogy problem Robert Kern <robert.kern@gmail.com> - 2014-06-11 14:26 +0100
    Re: Python's re module and genealogy problem Mark H Harris <harrismh777@gmail.com> - 2014-06-11 09:08 -0500
  Re: Python's re module and genealogy problem Thomas Rachel <nutznetz-0c1b6768-bfa9-48d5-a470-7603bd3aa915@spamschutz.glglgl.de> - 2014-06-11 15:55 +0200
  Re: Python's re module and genealogy problem Michael Torrie <torriem@gmail.com> - 2014-06-11 09:34 -0600
  Re: Python's re module and genealogy problem Nick Cash <nick.cash@npcinternational.com> - 2014-06-11 16:21 +0000
  Re: Python's re module and genealogy problem Simon Ward <simon@bleah.co.uk> - 2014-06-11 18:21 +0100
  Re: Python's re module and genealogy problem Vlastimil Brom <vlastimil.brom@gmail.com> - 2014-06-11 20:09 +0200
  Re: Python's re module and genealogy problem BrJohan <brjohan@gmail.com> - 2014-06-13 17:17 +0200
    Re: Python's re module and genealogy problem Peter Otten <__peter__@web.de> - 2014-06-13 18:26 +0200
    Re: Python's re module and genealogy problem Dan Sommers <dan@tombstonezero.net> - 2014-06-14 05:14 +0000
  Re: Python's re module and genealogy problem Tony the Tiger <tony@tiger.invalid> - 2014-06-14 08:35 +0000