| Date | 2013-06-18 22:11 -0400 |
|---|---|
| From | Dave Angel <davea@davea.name> |
| Subject | Re: Why is regex so slow? |
| References | <kpq2r9$gg6$1@panix2.panix.com> <51c10e9e$0$29973$c3e8da3$5496439d@news.astraweb.com> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.3566.1371607883.3114.python-list@python.org> |
On 06/18/2013 09:51 PM, Steven D'Aprano wrote:
<SNIP>
>
> Even if the regex engine is just as efficient at doing simple character
> matching as `in`, and it probably isn't, your regex tries to match all
> eleven characters of "ENQUEUEING" while the `in` test only has to match
> three, "ENQ".
>
The rest of your post was valid, and useful, but there's a misconception
in this paragraph; I hope you don't mind me pointing it out.
In general, for simple substring searches, you can search for a longer
string faster than you can search for a shorter one. I'd expect
    if "ENQUEUING" in bigbuffer
to be faster than
    if "ENQ" in bigbuffer
assuming that every occurrence of ENQ is actually the start of the whole
word. If CPython's implementation doesn't show the speed difference,
maybe there's some room for optimization.
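One way to check whether a given interpreter shows the difference is to time both searches. The following is only a rough sketch: the buffer contents, its size, and the helper name time_search are made up for illustration, and the outcome will depend on the Python version and the data.

```python
import timeit

# Hypothetical test data: a long run of filler with the word at the very end,
# so every search has to scan the whole buffer before it succeeds.
bigbuffer = "x" * 10**6 + "ENQUEUING"


def time_search(needle, haystack, repeat=5, number=20):
    """Return the best-of-`repeat` time (seconds) for `number` `in` tests."""
    timer = timeit.Timer(lambda: needle in haystack)
    return min(timer.repeat(repeat=repeat, number=number))


if __name__ == "__main__":
    for needle in ("ENQ", "ENQUEUING"):
        print(needle, time_search(needle, bigbuffer))
```

Taking the minimum of several repeats reduces noise from other processes; no particular result is claimed here.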
See Boyer-Moore if you want a peek at the algorithm.
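To make the idea concrete, here is a minimal sketch of the Horspool simplification of Boyer-Moore (not the full algorithm, and not what CPython actually does). It counts character comparisons so the effect of pattern length is visible. The function name horspool_find and the test buffer are invented for this illustration.

```python
def horspool_find(text, pattern):
    """Return (index, comparisons); index is -1 if pattern is absent."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0, 0
    # Shift table: for each character in the pattern (except the last
    # position), how far the window may slide when that character is the
    # last one in the current window.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    comparisons = 0
    pos = 0
    while pos <= n - m:
        # Compare the window against the pattern from right to left.
        j = m - 1
        while j >= 0:
            comparisons += 1
            if text[pos + j] != pattern[j]:
                break
            j -= 1
        if j < 0:
            return pos, comparisons
        # Characters that never occur in the pattern allow a full-length skip.
        pos += shift.get(text[pos + m - 1], m)
    return -1, comparisons


if __name__ == "__main__":
    bigbuffer = "x" * 10000 + "ENQUEUING"  # hypothetical data
    for needle in ("ENQ", "ENQUEUING"):
        print(needle, horspool_find(bigbuffer, needle))
```

On a mismatch the window can slide by up to the full pattern length, so with this kind of data the longer pattern examines far fewer characters than the shorter one.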
When I was writing a simple search program, I could typically search
for a 4-character string faster than REP SCASB could match a
one-character string. And that's a single instruction (with prefix).
--
DaveA