From: Ian Kelly
Date: Sun, 6 Jul 2014 13:26:44 -0600
Subject: Re: How to write this repeat matching?
Newsgroups: comp.lang.python

On Sun, Jul 6, 2014 at 12:57 PM, wrote:
> I write the following code:
>
> .......
> import re
>
> line = "abcdb"
>
> matchObj = re.match( 'a[bcd]*b', line)
>
> if matchObj:
>     print "matchObj.group() : ", matchObj.group()
>     print "matchObj.group(0) : ", matchObj.group()
>     print "matchObj.group(1) : ", matchObj.group(1)
>     print "matchObj.group(2) : ", matchObj.group(2)
> else:
>     print "No match!!"
> .........
>
> In which I have used its match pattern, but the result is not 'abcb'

You're never going to get a match of 'abcb' on that string, because
'abcb' is not found anywhere in that string. There are two possible
matches for the given pattern over that string: 'abcdb' and 'ab'. The
first one matches the [bcd]* three times, and the second one matches
it zero times. Because the matching is greedy, you get the result
that matches three times. It cannot match one, two or four times
because then there would be no 'b' following the [bcd]* portion as
required by the pattern.

> Only matchObj.group(0): abcdb
>
> displays. All other group(s) have no content.

Calling match.group(0) is equivalent to calling match.group without
arguments. In that case it returns the matched string of the entire
regular expression.
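
To make the greedy behaviour concrete, here is a small sketch in Python 3
syntax (the quoted code uses Python 2 print statements). The non-greedy
pattern 'a[bcd]*?b' is not part of the original question; it is added
here only as an assumption-free way to show the other possible match,
'ab'.

```python
import re

line = "abcdb"

# Greedy: [bcd]* takes as many characters as it can while still
# leaving a 'b' for the end of the pattern.
greedy = re.match("a[bcd]*b", line)
print(greedy.group())      # whole match
print(greedy.group(0))     # same thing, group 0 is the whole match

# Non-greedy variant (illustration only, not in the original post):
# [bcd]*? takes as few characters as it can.
lazy = re.match("a[bcd]*?b", line)
print(lazy.group())
```

The greedy pattern yields 'abcdb' and the non-greedy one yields 'ab';
neither can produce 'abcb', since that substring does not occur in the
input.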
match.group(1) and match.group(2) will return the value of the first
and second matching group respectively, but the pattern does not have
any matching groups. If you want a matching group, then enclose the
part that you want it to match in parentheses. For example, if you
change the pattern to:

    matchObj = re.match('a([bcd]*)b', line)

then the value of matchObj.group(1) will be 'bcd'.
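
A short sketch of the difference, again in Python 3 syntax. The variable
names plain and grouped are just for this illustration. Note that with
no groups in the pattern, asking for group 1 does not return an empty
value; the re module raises IndexError.

```python
import re

line = "abcdb"

# Original pattern: no parentheses, so there are no groups to ask for.
plain = re.match("a[bcd]*b", line)
print(plain.groups())          # empty tuple
try:
    plain.group(1)
except IndexError as exc:
    print("group(1) failed:", exc)

# Parenthesized pattern: the [bcd]* portion becomes group 1.
grouped = re.match("a([bcd]*)b", line)
print(grouped.group(0))        # the entire match
print(grouped.group(1))        # only the part inside the parentheses
```

There is still no group 2 in the parenthesized pattern, so
grouped.group(2) would raise IndexError as well.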