From: Chris Angelico
Newsgroups: comp.lang.python
To: python-list@python.org
Subject: Re: RE Module Performance
Date: Fri, 12 Jul 2013 17:37:04 +1000
In-Reply-To: <51DF4345.5020606@Gmail.com>

On Fri, Jul 12, 2013 at 9:44 AM, Devyn Collier Johnson wrote:
> I recently saw an email in this mailing list about the RE module being made
> slower. I no longer have that email. However, I have viewed the source for the
> RE module, but I did not see any code that would slow down the script for no
> valid reason. Can anyone explain what that user meant or if I missed that
> part of the module?
>
> Can the RE module be optimized in any way or at least the "re.sub" portion?

There was a post by Steven D'Aprano [1] in which he referred to it
busy-waiting just to make regular expressions slower than the
alternatives, but his tongue was firmly in his cheek at the time.

As to real performance questions, a variety of alternatives have been
proposed, including, I think, the regex module [2], which is supposed
to outperform 're' by quite a margin. Since I tend towards other
solutions, though, I can't quote personal results or hard figures.
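If you want hard figures for your own workload, something along these
lines should give a rough comparison. It is only a sketch: the sample
text, the pattern, and the helper names (with_re_sub, with_str_replace,
time_it) are made up for illustration, and it times re.sub against
plain str.replace rather than against the regex module.

```python
import re
import timeit

# Made-up sample data; substitute text that resembles your real input.
TEXT = "the quick brown fox jumps over the lazy dog " * 1000

# Compile once, outside the timed call, so only the substitution is measured.
PATTERN = re.compile(r"fox")


def with_re_sub(text=TEXT):
    """Replace every 'fox' using the stdlib re module."""
    return PATTERN.sub("cat", text)


def with_str_replace(text=TEXT):
    """Replace every 'fox' using a plain string method (no regex)."""
    return text.replace("fox", "cat")


def time_it(func, number=200):
    """Return the best of three timings, in seconds, for `number` calls."""
    return min(timeit.repeat(func, number=number, repeat=3))


if __name__ == "__main__":
    # Both approaches must agree before their timings mean anything.
    assert with_re_sub() == with_str_replace()
    for func in (with_re_sub, with_str_replace):
        print(f"{func.__name__}: {time_it(func):.4f} s")
```

As far as I know, the regex module [2] aims to be a drop-in replacement,
so changing the import to "import regex as re" ought to time it the same
way, assuming it is installed. Numbers will vary a lot with the pattern
and the input.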
If re.sub can be optimized and you can see a way to do so, post a
patch to the bug tracker; if it improves performance and doesn't have
any ridiculous costs to it, it'll probably be accepted.

[1] http://mail.python.org/pipermail/python-list/2013-July/651818.html
[2] https://pypi.python.org/pypi/regex

ChrisA