Date: Tue, 07 Aug 2012 18:12:06 +0100
From: MRAB <regex@mrabarnett.plus.com>
To: python-list@python.org
Subject: Re: Q on regex
References: <1344336754.13108.2@numa-i>
In-Reply-To: <1344336754.13108.2@numa-i>
Reply-To: python-list@python.org
Newsgroups: comp.lang.python
Message-ID: <mailman.3064.1344359523.4697.python-list@python.org>

On 07/08/2012 11:52, Helmut Jarausch wrote:
 > Hi Matthew,
 >
 > How can I fix the code below to match 'Hellmuth' instead of ' Hellmut'?
 >
 > A negative lookbehind in front of the pattern doesn't help since it
 > counts as an error. One would need a means to mix a required match
 > with a fuzzy match.
 >
 >
 > #!/usr/bin/python3
 > import regex
 >
 > Author= regex.compile(r'(?:Helmut){e<=2}')
 >
 > R=Author.search('Jarausch Hellmuth')
 > if R :
 >     print("matching string : |{0}|".format(R.group()))
 >     # matches  ' Hellmut'
 > else :
 >     print("nothing matched")
 >
There are two ways you could do it.

One way is to put word boundaries outside the fuzzy match:

     Author = regex.compile(r'\b(?:Helmut){e<=2}\b')

The other way is to use the 'ENHANCEMATCH' flag (or '(?e)' in the
pattern), which tells the matcher to try to 'enhance' a match it has
found by reducing the number of errors:

     Author = regex.compile(r'(?e)(?:Helmut){e<=2}')
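
To see why the unanchored pattern is allowed to stop at ' Hellmut', it
may help to count the errors by hand. The sketch below uses only the
standard library and assumes that the errors counted by {e<=2} are
insertions, deletions and substitutions, i.e. ordinary Levenshtein
distance. The names edit_distance and error_counts are made up for this
illustration and are not part of the regex module.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest insertions, deletions and
    substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]  # turning a[:i] into '' takes i deletions
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute, or keep if equal
        prev = curr
    return prev[-1]


def error_counts(pattern, candidates):
    """Map each candidate string to its distance from the pattern text."""
    return {c: edit_distance(pattern, c) for c in candidates}


if __name__ == "__main__":
    counts = error_counts("Helmut", [" Hellmut", "Hellmut", "Hellmuth"])
    for candidate, errors in counts.items():
        print("|{0}| needs {1} error(s)".format(candidate, errors))
```

This only shows which candidates are admissible under a limit of two
errors. Which of them the regex module actually returns depends on
where the search starts and on the flags in use, so it is worth
checking the result against your own data.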

 >
 > Many thanks for a hint,
 > Helmut.
 >
 > P.S. I've tried to initiate a discussion on adding your module to the
 > current standard library on C.L.P.
 >
The problem with adding it to the standard library is that any releases
would be tied to Python's releases and I would have much less leeway in
making changes. There are a number of other modules which remain
outside the standard library for just that reason.