Re: one more question on regex

From Vlastimil Brom <vlastimil.brom@gmail.com>
Newsgroups comp.lang.python
Subject Re: one more question on regex
Date 2016-01-22 21:10 +0100
Message-ID <mailman.173.1453493453.15297.python-list@python.org>
References <n7ti39$7rt$1@gioia.aioe.org> <n7tj3j$9ra$1@gioia.aioe.org>



2016-01-22 16:50 GMT+01:00 mg <noOne@nowhere.com>:
> On Fri, 22 Jan 2016 15:32:57 +0000, mg wrote:
>
>> python 3.4.3
>>
>> >>> import re
>> >>> re.search('(ab){2}','abzzabab')
>> <_sre.SRE_Match object; span=(4, 8), match='abab'>
>>
>> >>> re.findall('(ab){2}','abzzabab')
>> ['ab']
>>
>> Why for search() the match is 'abab' and for findall the match is 'ab'?
>
> finditer seems to be consistent with search:
> regex = re.compile('(ab){2}')
>
> for match in regex.finditer('abzzababab'):
>   print ("%s: %s" % (match.start(), match.span() ))
> ...
> 4: (4, 8)

Hi,
as was already pointed out, findall "collects" the contents of the
capturing groups (if any are present) rather than the whole matched text.

For a repeated capturing group, only its last capture is kept and the
earlier ones are discarded; cf.:

>>> re.findall('(?i)(a)x(b)+','axbB')
[('a', 'B')]
>>>
(if the pattern contains multiple capturing groups, a tuple of the
captured parts is collected for each match)

or, using your example, with the parts of the string differentiated by
upper/lower case:
>>> re.findall('(?i)(ab){2}','aBzzAbAB')
['AB']
>>>
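
If what you actually want is the whole matched text, two common options are
a non-capturing group, (?:...), or finditer together with group(0). The
following is only a minimal sketch reusing the string from the example
above; the variable names (text, whole_matches, pairs) are made up for
illustration:

```python
import re

text = 'aBzzAbAB'

# Capturing group: findall returns only the last repetition captured by the group.
print(re.findall('(?i)(ab){2}', text))      # ['AB']

# Non-capturing group: the pattern has no groups, so findall returns whole matches.
whole_matches = re.findall('(?i)(?:ab){2}', text)
print(whole_matches)                        # ['AbAB']

# finditer yields match objects, so the whole match and the group are both available.
pairs = [(m.group(0), m.group(1)) for m in re.finditer('(?i)(ab){2}', text)]
print(pairs)                                # [('AbAB', 'AB')]
```

The non-capturing form is probably the simplest if the group is only there
for the repetition; finditer is handy when you need both the whole match and
the captured part.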

hth,
   vbr


