Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #48664
| From | André Malo <ndparker@gmail.com> |
|---|---|
| Newsgroups | comp.lang.python |
| Subject | Re: Why is regex so slow? |
| Date | 2013-06-18 21:59 +0200 |
| Organization | TIMTOWTDI |
| Message-ID | <2281418.s73s1ze9gg@news.perlig.de> (permalink) |
| References | <kpq2r9$gg6$1@panix2.panix.com> <CANc-5UyouN5EQw_GDecaAR+inyAB26g=e2pZ=zBuspXdmxJh+Q@mail.gmail.com> <B2AEBA54-6098-47D6-B7E0-77F0B7D8D722@panix.com> <mailman.3545.1371576060.3114.python-list@python.org> <kpq7q8$n9o$1@news.albasani.net> |
* Johannes Bauer wrote:
> The pre-check version is about 42% faster in my case (0.75 sec vs. 1.3
> sec). Curious. This is Python 3.2.3 on Linux x86_64.
A lot of time is spent with dict lookups (timings at my box, Python 3.2.3)
in your inner loop (1500000 times...)
#!/usr/bin/python3
import re
pattern = re.compile(r'ENQUEUEING: /listen/(.*)')
count = 0
for line in open('error.log'):
#if 'ENQ' not in line:
# continue
m = pattern.search(line)
if m:
count += 1
print(count)
runs ~ 1.39 s
replacing some dict lookups with index lookups:
#!/usr/bin/python3
def main():
import re
pattern = re.compile(r'ENQUEUEING: /listen/(.*)')
count = 0
for line in open('error.log'):
#if 'ENQ' not in line:
# continue
m = pattern.search(line)
if m:
count += 1
print(count)
main()
runs ~ 1.15s
and further:
#!/usr/bin/python3
def main():
import re
pattern = re.compile(r'ENQUEUEING: /listen/(.*)')
search = pattern.search
count = 0
for line in open('error.log'):
#if 'ENQ' not in line:
# continue
m = search(line)
if m:
count += 1
print(count)
main()
runs ~ 1.08 s
and for reference:
#!/usr/bin/python3
def main():
import re
pattern = re.compile(r'ENQUEUEING: /listen/(.*)')
search = pattern.search
count = 0
for line in open('error.log'):
if 'ENQ' not in line:
continue
m = search(line)
if m:
count += 1
print(count)
main()
runs ~ 0.71 s
The final difference is probably just the difference between a hardcoded
string search and a generic NFA.
nd
Back to comp.lang.python | Previous | Next — Previous in thread | Next in thread | Find similar | Unroll thread
Why is regex so slow? roy@panix.com (Roy Smith) - 2013-06-18 12:45 -0400
Re: Why is regex so slow? Skip Montanaro <skip@pobox.com> - 2013-06-18 12:01 -0500
Re: Why is regex so slow? Roy Smith <roy@panix.com> - 2013-06-18 13:08 -0400
Re: Why is regex so slow? Chris Angelico <rosuav@gmail.com> - 2013-06-19 03:20 +1000
Re: Why is regex so slow? Johannes Bauer <dfnsonfsduifb@gmx.de> - 2013-06-18 20:10 +0200
Re: Why is regex so slow? Roy Smith <roy@panix.com> - 2013-06-18 12:40 -0700
Re: Why is regex so slow? André Malo <ndparker@gmail.com> - 2013-06-18 21:59 +0200
Re: Why is regex so slow? André Malo <ndparker@gmail.com> - 2013-06-18 22:13 +0200
Re: Why is regex so slow? MRAB <python@mrabarnett.plus.com> - 2013-06-18 18:31 +0100
Re: Why is regex so slow? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-06-18 18:34 +0100
Re: Why is regex so slow? roy@panix.com (Roy Smith) - 2013-06-18 15:21 -0400
Re: Why is regex so slow? MRAB <python@mrabarnett.plus.com> - 2013-06-18 20:49 +0100
Re: Why is regex so slow? Rick Johnson <rantingrickjohnson@gmail.com> - 2013-06-18 12:21 -0700
Re: Why is regex so slow? Antoine Pitrou <solipsis@pitrou.net> - 2013-06-18 20:05 +0000
Re: Why is regex so slow? Roy Smith <roy@panix.com> - 2013-06-18 13:23 -0700
Re: Why is regex so slow? Duncan Booth <duncan.booth@invalid.invalid> - 2013-06-19 13:21 +0000
Re: Why is regex so slow? Roy Smith <roy@panix.com> - 2013-06-19 12:55 -0700
Re: Why is regex so slow? Grant Edwards <invalid@invalid.invalid> - 2013-06-18 20:30 +0000
Re: Why is regex so slow? Terry Reedy <tjreedy@udel.edu> - 2013-06-18 17:29 -0400
Re: Why is regex so slow? Johannes Bauer <dfnsonfsduifb@gmx.de> - 2013-06-19 10:29 +0200
Re: Why is regex so slow? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-19 01:51 +0000
Re: Why is regex so slow? Dave Angel <davea@davea.name> - 2013-06-18 22:11 -0400
Re: Why is regex so slow? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-06-19 03:16 +0000
csiph-web