Re: Groups in regular expressions don't repeat as expected

Date 2011-04-24 12:43 -0700
From John Nagle <nagle@animats.com>
Newsgroups comp.lang.python
Subject Re: Groups in regular expressions don't repeat as expected
References <4daf31e3$0$10596$742ec2ed@news.sonic.net> <918q69FjfgU2@mid.individual.net> <4daf4344$0$10519$742ec2ed@news.sonic.net> <91ap1kF1pjU2@mid.individual.net>
Message-ID <4db47d66$0$10524$742ec2ed@news.sonic.net> (permalink)
Organization Sonic.Net


On 4/21/2011 6:16 AM, Neil Cerutti wrote:
> On 2011-04-20, John Nagle <nagle@animats.com> wrote:
>>       Findall does something a bit different. It returns a list of
>> matches of the entire pattern, not repeats of groups within
>> the pattern.
>>
>>       Consider a regular expression for matching domain names:
>>
>> >>> kre = re.compile(r'^([a-zA-Z0-9\-]+)(?:\.([a-zA-Z0-9\-]+))+$')
>> >>> s = 'www.example.com'
>> >>> ms = kre.match(s)
>> >>> ms.groups()
>> ('www', 'com')
>> >>> msall = kre.findall(s)
>> >>> msall
>> [('www', 'com')]
>>
>> This is just a simple example.  But it illustrates an unnecessary
>> limitation.  The matcher can do the repeated matching; you just can't
>> get the results out.
>
> Thanks for the further explanation.
>
> Assuming a fake API that returned multiple group matches as a
> tuple:
>
> >>> print(re.match(r"^([a-z])+$", "abcdef").groups())
> (('a', 'b', 'c', 'd', 'e', 'f'),)
>
> I was thinking of applying findall something like this, but you
> have to make multiple calls:
>
> >>> s = "abcdef"
> >>> m = re.match(r"^[a-z]+$", s)
> >>> if m:
> ...   print(re.findall(r"[a-z]", m.group()))
> ...
> ['a', 'b', 'c', 'd', 'e', 'f']
>
> I can see that getting really annoying. Is there a better way to
> make multiple group matches accessible without adding a third
> element type as a group element?

     The most elegant solution would be to have a regular expression
function that returned a tree of tuples or lists.  Then you could
express an entire language syntax as a regular expression and
get out a parse tree.
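
     To make the shape of such a result concrete, here is a minimal
sketch using only the existing re module.  Everything in it is a
hypothetical illustration: the "name=value,value;..." input format,
the ENTRY and WHOLE patterns, and the capture_tree function are
assumptions made up for this example, and re provides nothing like
capture_tree itself.  It validates the whole string first, then
re-scans it to build the nested list.

```python
import re

# Hypothetical illustration only: the re module has no such function.
# Input looks like "a=1,2;b=3": entries separated by ';', values by ','.
ENTRY = r'[a-z]+=[0-9]+(?:,[0-9]+)*'
WHOLE = re.compile(r'(?:%s)(?:;%s)*' % (ENTRY, ENTRY))

def capture_tree(s):
    """Return [[name, [value, ...]], ...] for s, or None if s does not match."""
    # First pass: check that the entire string fits the grammar.
    if WHOLE.fullmatch(s) is None:
        return None
    tree = []
    # Second pass: walk each entry and split it into its nested parts.
    for entry in re.finditer(ENTRY, s):
        name, values = entry.group().split('=')
        tree.append([name, values.split(',')])
    return tree

print(capture_tree('a=1,2;b=3'))   # [['a', ['1', '2']], ['b', ['3']]]
```

The second pass repeats scanning that the matcher already did during
the first, which is exactly the duplication a tree-returning match
function would avoid.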

     Since the regular expression engine already does that work and
then discards the results, this seems a reasonable extension.
I'm not suggesting extending regular expression matching itself,
just the way the results are stored.
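
     For the domain name case, the discarded results can be seen, and
worked around, with the current re module.  This is only a sketch:
the LABEL and DOMAIN names and the domain_labels helper are
hypothetical, introduced for this example.  The repeated group keeps
just its last repeat, so the helper matches once to validate and then
scans again to recover every label.

```python
import re

# Hypothetical helper (not part of the re module): validate the whole
# string with the full pattern, then re-scan it to recover every label.
LABEL = r'[a-zA-Z0-9\-]+'
DOMAIN = re.compile(r'^(%s)(?:\.(%s))+$' % (LABEL, LABEL))

def domain_labels(s):
    """Return all dot-separated labels of s, or None if s does not match."""
    if DOMAIN.match(s) is None:
        return None
    return re.findall(LABEL, s)

# The repeated group only remembers its final repeat; 'example' is lost.
print(DOMAIN.match('www.example.com').groups())   # ('www', 'com')

# The second scan recovers what the first match threw away.
print(domain_labels('www.example.com'))           # ['www', 'example', 'com']
```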

				John Nagle


