| From | Vlastimil Brom <vlastimil.brom@gmail.com> |
|---|---|
| Newsgroups | comp.lang.python |
| Subject | Re: one more question on regex |
| Date | 2016-01-22 21:10 +0100 |
| Message-ID | <mailman.173.1453493453.15297.python-list@python.org> |
| References | <n7ti39$7rt$1@gioia.aioe.org> <n7tj3j$9ra$1@gioia.aioe.org> |
2016-01-22 16:50 GMT+01:00 mg <noOne@nowhere.com>:
> On Fri, 22 Jan 2016 15:32:57 +0000, mg wrote:
>
>> python 3.4.3
>>
>> import re
>> re.search('(ab){2}','abzzabab')
>> <_sre.SRE_Match object; span=(4, 8), match='abab'>
>>
>>>>> re.findall('(ab){2}','abzzabab')
>> ['ab']
>>
>> Why for search() the match is 'abab' and for findall the match is 'ab'?
>
> finditer seems to be consistent with search:
> regex = re.compile('(ab){2}')
>
> for match in regex.finditer('abzzababab'):
>     print("%s: %s" % (match.start(), match.span()))
> ...
> 4: (4, 8)
>
> --
> https://mail.python.org/mailman/listinfo/python-list
Hi,
as was already pointed out, findall "collects" the contents of the
capturing groups (if any are present) rather than the whole matched text.
When a capturing group is repeated, only its last capture is kept and
the earlier ones are discarded; cf.:
>>> re.findall('(?i)(a)x(b)+','axbB')
[('a', 'B')]
>>>
(for multiple capturing groups in the pattern, a tuple of the captured
parts is collected for each match)
or, with your example, using upper/lower case to differentiate the
parts of the string:
>>> re.findall('(?i)(ab){2}','aBzzAbAB')
['AB']
>>>
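If the whole matched text is what you want from findall, one option is to
make the repeated group non-capturing, or to wrap the repetition in a
single outer group; finditer also gives access to both the full match and
the last capture. A minimal sketch using your original string (the
variable names are only illustrative):

```python
import re

text = 'abzzabab'

# Capturing group: findall returns only the last repetition of the group.
last_only = re.findall(r'(ab){2}', text)

# Non-capturing group: the pattern has no groups, so findall returns
# the whole matched text.
whole = re.findall(r'(?:ab){2}', text)

# One outer capturing group around the repetition has the same effect.
outer = re.findall(r'((?:ab){2})', text)

# finditer yields match objects: group(0) is the whole match,
# group(1) is the last capture of the repeated group.
pairs = [(m.span(), m.group(0), m.group(1))
         for m in re.finditer(r'(ab){2}', text)]

print(last_only)  # ['ab']
print(whole)      # ['abab']
print(outer)      # ['abab']
print(pairs)      # [((4, 8), 'abab', 'ab')]
```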
hth,
vbr