From: "Octavian Rasnita"
Subject: Re: Regular expression : non capturing groups are faster ?
Date: Tue, 3 Jan 2012 13:59:27 +0200
Newsgroups: comp.lang.python

From: "candide"
Subject: Regular expression : non capturing groups are faster ?

Excerpt from the Regular Expression HOWTO
(http://docs.python.org/howto/regex.html#non-capturing-and-named-groups) :

-----------------------------------------------
It should be mentioned that there's no performance difference in
searching between capturing and non-capturing groups; neither form is
any faster than the other.
-----------------------------------------------

Now from the Perl Regular Expression tutorial
(http://perldoc.perl.org/perlretut.html#Non-capturing-groupings) :

-----------------------------------------------
Because there is no extraction, non-capturing groupings are faster than
capturing groupings.
-----------------------------------------------

The second assertion sounds more likely. It seems very odd that the Python
and Perl implementations diverge on this point. Any opinion ?

--
**

At least in Perl's case, it is true. I tested, and using (?:...) is much
faster than (...).
Of course, it still takes only a few seconds for a dozen million matches...

Octavian
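
For the Python side of the question, the HOWTO's claim can be checked the same way, by timing both forms. What follows is only a rough sketch: the test string, the two patterns, and the helper names count_matches and time_pattern are made up for illustration, and the numbers will depend on the machine and the Python version.

```python
import re
import timeit

# Hypothetical test data and patterns, chosen only for illustration.
TEXT = "foo=1, bar=22, baz=333 " * 200
CAPTURING = re.compile(r"(\w+)=(\d+)")
NON_CAPTURING = re.compile(r"(?:\w+)=(?:\d+)")


def count_matches(pattern, text=TEXT):
    """Count non-overlapping matches by walking the text with finditer()."""
    return sum(1 for _ in pattern.finditer(text))


def time_pattern(pattern, repeat=5, number=200):
    """Return the best time (in seconds) for `number` full scans of TEXT."""
    timings = timeit.repeat(lambda: count_matches(pattern),
                            repeat=repeat, number=number)
    return min(timings)


if __name__ == "__main__":
    # Both patterns must find the same matches, otherwise the
    # timing comparison would be meaningless.
    assert count_matches(CAPTURING) == count_matches(NON_CAPTURING)
    print("capturing:     %.4f s" % time_pattern(CAPTURING))
    print("non-capturing: %.4f s" % time_pattern(NON_CAPTURING))
```

Taking the minimum over several repeats reduces noise from other processes. A single pattern pair like this one says nothing general; other patterns and inputs may behave differently.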