From: Devin Jeanpierre
Date: Tue, 3 Jan 2012 06:56:44 -0500
Subject: Re: Regular expression : non capturing groups are faster ?
To: python-list@python.org
Newsgroups: comp.lang.python

> The second assertion sounds more likely. It seems very odd that Python and
> Perl implementations are divergent on this point. Any opinion ?

The Python documentation oversimplifies. What it means to say is that,
while one might expect capturing groups to do extra work proportional to
the size of the capture, they do not. They don't do anything other than
mark down where to extract submatches, and the extra work done is pretty
much negligible. (That is, the work done for submatch extraction is
polynomial, apparently quadratic, in the number of capturing groups,
which is almost always very small, and the coefficient is small.)

The Perl documentation is technically correct, but if the HOWTO said the
same thing, it would give the wrong impression. You shouldn't care about
capturing vs. non-capturing except with regard to how it affects your
group numbering scheme.
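
A quick way to check this on a given interpreter is to time both forms and
to look at how the group numbers shift. The sketch below is only an
illustration, not a rigorous benchmark: the sample text, the patterns, and
the helper names time_scan and numbering_demo are made up for this example.
It makes no claim about which form wins; timings depend on the Python
version and the machine.

```python
import re
import timeit

# Illustrative input and patterns (assumptions, not taken from the thread).
TEXT = "2012-01-03 " * 1000
CAPTURING = re.compile(r"(\d+)-(\d+)-(\d+)")
NON_CAPTURING = re.compile(r"(?:\d+)-(?:\d+)-(?:\d+)")


def time_scan(pattern, text=TEXT, number=200):
    """Return seconds spent scanning `text` with `pattern`, `number` times."""
    # finditer yields a match object per hit for either pattern, so both
    # runs produce the same kind of result and only the groups differ.
    return timeit.timeit(
        lambda: sum(1 for _ in pattern.finditer(text)), number=number
    )


def numbering_demo():
    """Show that a non-capturing group does not consume a group number."""
    mixed = re.compile(r"(?:\d+)-(\d+)-(\d+)")
    match = mixed.match("2012-01-03")
    # The year is matched but not captured, so group 1 is the month.
    return match.groups()


if __name__ == "__main__":
    print("capturing     :", time_scan(CAPTURING))
    print("non-capturing :", time_scan(NON_CAPTURING))
    print("groups        :", numbering_demo())
```

The numbering part is the one that matters in practice: replacing a
capturing group with (?:...) renumbers every group after it.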

-- Devin

On Tue, Jan 3, 2012 at 6:14 AM, candide wrote:
> Excerpt from the Regular Expression HOWTO
> (http://docs.python.org/howto/regex.html#non-capturing-and-named-groups) :
>
> -----------------------------------------------
> It should be mentioned that there’s no performance difference in searching
> between capturing and non-capturing groups; neither form is any faster than
> the other.
> -----------------------------------------------
>
> Now from the Perl Regular Expression tutorial
> (http://perldoc.perl.org/perlretut.html#Non-capturing-groupings) :
>
> -----------------------------------------------
> Because there is no extraction, non-capturing groupings are faster than
> capturing groupings.
> -----------------------------------------------
>
> The second assertion sounds more likely. It seems very odd that Python and
> Perl implementations are divergent on this point. Any opinion ?