Re: Regular expression : non capturing groups are faster ?

From Devin Jeanpierre <jeanpierreda@gmail.com>
Date Tue, 3 Jan 2012 06:56:44 -0500
Newsgroups comp.lang.python


> The second assertion sounds more likely. It seems very odd that Python and
> Perl implementations are divergent on this point. Any opinion ?

The Python documentation oversimplifies. What it means to say is that,
while one might expect capturing groups to do extra work proportional
to the size of the capture, they do not. All they do is mark down
where to extract the submatches, and the extra work is pretty much
negligible. (That is, the work done for submatch extraction is
polynomial, apparently quadratic, in the number of capturing groups,
which is almost always very small, and the coefficient is small.)
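
A rough way to check this on your own machine is to time two patterns
that differ only in the kind of group. This is a minimal sketch: the
patterns, the test string, and the helper names (time_search, compare)
are made up for illustration, and the numbers will vary between Python
versions and machines.

```python
import re
import timeit

# Hypothetical patterns for illustration: identical except for the group type.
CAPTURING = re.compile(r"(\w+)@(\w+)\.(\w+)")
NON_CAPTURING = re.compile(r"(?:\w+)@(?:\w+)\.(?:\w+)")

# Hypothetical input: a long run of non-matching text, then one match at the end.
TEXT = "padding " * 1000 + "user@example.com"


def time_search(pattern, number=200):
    """Return the total seconds taken by `number` searches of TEXT."""
    return timeit.timeit(lambda: pattern.search(TEXT), number=number)


def compare(number=200):
    """Time both patterns and return the results keyed by name."""
    return {
        "capturing": time_search(CAPTURING, number),
        "non_capturing": time_search(NON_CAPTURING, number),
    }


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name}: {seconds:.4f}s")
```

Both patterns find the same match; only the bookkeeping for the
submatch positions differs, so any gap between the two timings is an
estimate of that bookkeeping cost for this particular input.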

The Perl documentation is technically correct, but if the HOWTO said
the same thing, it would give the wrong impression. You shouldn't care
about capturing vs. non-capturing groups except with regard to how
they affect your group numbering scheme.
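
To illustrate the numbering point, here is a small sketch (the date
string and patterns are made up for the example). Turning the first
group into a non-capturing one shifts the numbers of the groups after
it:

```python
import re

DATE = "2012-01-03"  # hypothetical input for illustration

# Three capturing groups: year is group 1, month is group 2, day is group 3.
capturing = re.match(r"(\d{4})-(\d{2})-(\d{2})", DATE)

# The year group is now non-capturing, so month becomes group 1 and day group 2.
mixed = re.match(r"(?:\d{4})-(\d{2})-(\d{2})", DATE)

print(capturing.groups())  # ('2012', '01', '03')
print(mixed.groups())      # ('01', '03')
print(capturing.group(2), mixed.group(1))  # 01 01
```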

-- Devin

On Tue, Jan 3, 2012 at 6:14 AM, candide <candide@free.invalid> wrote:
> Excerpt from the Regular Expression HOWTO
> (http://docs.python.org/howto/regex.html#non-capturing-and-named-groups) :
>
>
> -----------------------------------------------
> It should be mentioned that there’s no performance difference in searching
> between capturing and non-capturing groups; neither form is any faster than
> the other.
> -----------------------------------------------
>
>
> Now from the Perl Regular Expression tutorial
> (http://perldoc.perl.org/perlretut.html#Non-capturing-groupings) :
>
>
> -----------------------------------------------
> Because there is no extraction, non-capturing groupings are faster than
> capturing groupings.
> -----------------------------------------------
>
>
> The second assertion sounds more likely. It seems very odd that Python and
> Perl implementations are divergent on this point. Any opinion ?
