From: Neil Cerutti
Newsgroups: comp.lang.python
Subject: Re: Groups in regular expressions don't repeat as expected
Date: 20 Apr 2011 19:23:53 GMT
Organization: Norwich University
Message-ID: <918q69FjfgU2@mid.individual.net>
References: <4daf31e3$0$10596$742ec2ed@news.sonic.net>

On 2011-04-20, John Nagle wrote:
> Here's something that surprised me about Python regular expressions.
>
> >>> krex = re.compile(r"^([a-z])+$")
> >>> s = "abcdef"
> >>> ms = krex.match(s)
> >>> ms.groups()
> ('f',)
>
> The parentheses indicate a capturing group within the
> regular expression, and the "+" indicates that the
> group can appear one or more times. The regular
> expression matches that way. But instead of returning
> a captured group for each character, it returns only the
> last one.
>
> The documentation in fact says that, at
>
> http://docs.python.org/library/re.html
>
> "If a group is contained in a part of the pattern that matched multiple
> times, the last match is returned."
>
> That's kind of lame, though. I'd expect that there would be some way
> to retrieve all matches.

.findall

-- 
Neil Cerutti
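
A minimal sketch of what the one-word reply above seems to point at, assuming the goal is one item per matched character. The names last_only, all_chars and letters are illustrative only and do not come from the thread. The idea is to keep the anchored pattern for validating the whole string, and to use a separate, unanchored pattern with re.findall to collect every match.

```python
import re

s = "abcdef"

# The pattern from the quoted post: the repeated group keeps only
# the text from its final iteration.
krex = re.compile(r"^([a-z])+$")
last_only = krex.match(s).groups()

# Without the anchors and the "+", findall scans the string and
# returns every non-overlapping match as a list.
all_chars = re.findall(r"[a-z]", s)


def letters(text):
    """Hypothetical helper: validate with the anchored pattern,
    then collect each letter with findall. Returns None when the
    string as a whole does not match."""
    if krex.match(text) is None:
        return None
    return re.findall(r"[a-z]", text)


print(last_only)
print(all_chars)
print(letters("ab1"))
```

Splitting validation from collection is one way to get both behaviours; findall on its own would silently skip any characters that do not match.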