Re: Groups in regular expressions don't repeat as expected

Date 2011-04-20 13:34 -0700
From John Nagle <nagle@animats.com>
Newsgroups comp.lang.python
Subject Re: Groups in regular expressions don't repeat as expected
References <4daf31e3$0$10596$742ec2ed@news.sonic.net> <918q69FjfgU2@mid.individual.net>
Message-ID <4daf4344$0$10519$742ec2ed@news.sonic.net> (permalink)
Organization Sonic.Net



On 4/20/2011 12:23 PM, Neil Cerutti wrote:
> On 2011-04-20, John Nagle <nagle@animats.com> wrote:
>> Here's something that surprised me about Python regular expressions.
>>
>> >>> krex = re.compile(r"^([a-z])+$")
>> >>> s = "abcdef"
>> >>> ms = krex.match(s)
>> >>> ms.groups()
>> ('f',)
>>
>> The parentheses indicate a capturing group within the
>> regular expression, and the "+" indicates that the
>> group can appear one or more times.  The regular
>> expression matches that way.  But instead of returning
>> a captured group for each character, it returns only the
>> last one.
>>
>> The documentation in fact says that, at
>>
>> http://docs.python.org/library/re.html
>>
>> "If a group is contained in a part of the pattern that matched multiple
>> times, the last match is returned."
>>
>> That's kind of lame, though. I'd expect that there would be some way
>> to retrieve all matches.
>
> .findall
>

     findall() does something a bit different. It returns a list of
non-overlapping matches of the entire pattern, not the repeats of
a group within a single match.

     Consider a regular expression for matching domain names:

 >>> kre = re.compile(r'^([a-zA-Z0-9\-]+)(?:\.([a-zA-Z0-9\-]+))+$')
 >>> s = 'www.example.com'
 >>> ms = kre.match(s)
 >>> ms.groups()
('www', 'com')
 >>> msall = kre.findall(s)
 >>> msall
[('www', 'com')]

This is just a simple example, but it illustrates an unnecessary
limitation: the middle label, 'example', is lost from both results.
The matcher can do the repeated matching; you just can't get the
results out.
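
One possible workaround, as a minimal sketch using only the standard
re module: validate the whole string with the full pattern first, then
run findall() with just the label sub-pattern to recover every repeat.
The names LABEL, DOMAIN and domain_labels are made up for this
illustration, and the approach assumes the repeated piece can be
matched on its own, which happens to hold here because labels never
contain dots.

```python
import re

# Hypothetical names for illustration; the sub-pattern is the one from
# the domain-name example above.
LABEL = r'[a-zA-Z0-9\-]+'
DOMAIN = re.compile(r'^' + LABEL + r'(?:\.' + LABEL + r')+$')

def domain_labels(s):
    """Return every dot-separated label of s, or None if s does not match."""
    if DOMAIN.match(s) is None:
        return None
    # The full pattern has already validated s, so each run of label
    # characters corresponds to one repetition of the group.
    return re.findall(LABEL, s)

print(domain_labels('www.example.com'))  # ['www', 'example', 'com']
print(domain_labels('localhost'))        # None (pattern requires a dot)
```

This costs a second pass over the string and does not generalize to
patterns whose repeated group only makes sense in context, so it is a
workaround rather than a substitute for real access to the repeated
captures.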

				John Nagle



Thread

Groups in regular expressions don't repeat as expected John Nagle <nagle@animats.com> - 2011-04-20 12:20 -0700
  Re: Groups in regular expressions don't repeat as expected Neil Cerutti <neilc@norwich.edu> - 2011-04-20 19:23 +0000
    Re: Groups in regular expressions don't repeat as expected John Nagle <nagle@animats.com> - 2011-04-20 13:34 -0700
      Re: Groups in regular expressions don't repeat as expected Neil Cerutti <neilc@norwich.edu> - 2011-04-21 13:16 +0000
        Re: Groups in regular expressions don't repeat as expected John Nagle <nagle@animats.com> - 2011-04-24 12:43 -0700
  Re: Groups in regular expressions don't repeat as expected MRAB <python@mrabarnett.plus.com> - 2011-04-20 21:03 +0100
  Re: Groups in regular expressions don't repeat as expected Vlastimil Brom <vlastimil.brom@gmail.com> - 2011-04-21 15:57 +0200
  Re: Groups in regular expressions don't repeat as expected Vlastimil Brom <vlastimil.brom@gmail.com> - 2011-04-21 20:36 +0200
