From: Vlastimil Brom
To: python-list@python.org
Newsgroups: comp.lang.python
Date: Thu, 21 Apr 2011 15:57:22 +0200
Subject: Re: Groups in regular expressions don't repeat as expected
In-Reply-To: <4daf31e3$0$10596$742ec2ed@news.sonic.net>

2011/4/20 John Nagle :
> Here's something that surprised me about Python regular expressions.
>
> >>> krex = re.compile(r"^([a-z])+$")
> >>> s = "abcdef"
> >>> ms = krex.match(s)
> >>> ms.groups()
> ('f',)
>
>...
> "If a group is contained in a part of the pattern that matched multiple
> times, the last match is returned."
>
> That's kind of lame, though. I'd expect that there would be some way
> to retrieve all matches.
>
>                                        John Nagle

Hi,
do you mean something like:

>>> import regex
>>> ms = regex.match(r"^([a-z])+$", "abcdef")
>>> ms.captures(1)
['a', 'b', 'c', 'd', 'e', 'f']
>>>
>>> help(ms.captures)
Help on built-in function captures:

captures(...)
    captures([group1, ...]) --> list of strings or tuple of list of strings.
    Return the captures of one or more subgroups of the match.  If there is a
    single argument, the result is a list of strings; if there are multiple
    arguments, the result is a tuple of lists with one item per argument; if
    there are no arguments, the captures of the whole match is returned.  Group
    0 is the whole match.

>>>

cf. http://pypi.python.org/pypi/regex

hth,
 vbr
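
For comparison, here is a rough sketch of getting a similar list with only the standard re module, in case installing the third-party regex module is not an option. The helper name all_letters is made up for illustration. The approach works here only because every repetition of the group consumes exactly one character matching [a-z]; it is not a general replacement for captures().

```python
import re

# Pattern from the original question: one group, repeated.
PATTERN = r"^([a-z])+$"


def all_letters(s):
    """Return every repetition of the group in PATTERN, or None if s does not match.

    Hypothetical helper: first validate the whole string, then collect the
    individual repetitions with a second, simpler pattern.
    """
    if re.match(PATTERN, s) is None:
        return None
    # Each repetition of ([a-z]) is a single letter, so findall on the
    # inner pattern recovers all of them in order.
    return re.findall(r"[a-z]", s)


# The stdlib match object keeps only the last repetition of the group.
print(re.match(PATTERN, "abcdef").groups())

# The helper returns all of them.
print(all_letters("abcdef"))
```

For patterns whose repeated group is more complicated than a single character class, the second step would need its own carefully chosen pattern, which is the bookkeeping that captures() takes care of.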