| Started by | forman.simon@gmail.com |
|---|---|
| First post | 2014-02-13 10:37 -0800 |
| Last post | 2014-02-14 09:06 -0500 |
| Articles | 20 on this page of 45 — 16 participants |
A curious bit of code... forman.simon@gmail.com - 2014-02-13 10:37 -0800
Re: A curious bit of code... Roy Smith <roy@panix.com> - 2014-02-13 13:45 -0500
Re: A curious bit of code... Ethan Furman <ethan@stoneleaf.us> - 2014-02-13 10:45 -0800
Re: A curious bit of code... Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-02-13 19:09 +0000
Re: A curious bit of code... Alain Ketterlin <alain@dpt-info.u-strasbg.fr> - 2014-02-13 20:05 +0100
Re: A curious bit of code... Neil Cerutti <neilc@norwich.edu> - 2014-02-13 19:17 +0000
Re: A curious bit of code... Ethan Furman <ethan@stoneleaf.us> - 2014-02-13 11:20 -0800
Re: A curious bit of code... Roy Smith <roy@panix.com> - 2014-02-13 14:28 -0500
Re: A curious bit of code... Ethan Furman <ethan@stoneleaf.us> - 2014-02-13 11:25 -0800
Re: A curious bit of code... Neil Cerutti <neilc@norwich.edu> - 2014-02-13 19:25 +0000
Re: A curious bit of code... Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-02-13 19:32 +0000
Re: A curious bit of code... Peter Otten <__peter__@web.de> - 2014-02-13 20:43 +0100
Re: A curious bit of code... Marko Rauhamaa <marko@pacujo.net> - 2014-02-13 21:56 +0200
Re: A curious bit of code... Ethan Furman <ethan@stoneleaf.us> - 2014-02-13 11:23 -0800
Re: A curious bit of code... Neil Cerutti <neilc@norwich.edu> - 2014-02-13 19:51 +0000
Re: A curious bit of code... Ethan Furman <ethan@stoneleaf.us> - 2014-02-13 11:59 -0800
Re: A curious bit of code... Zachary Ware <zachary.ware+pylist@gmail.com> - 2014-02-13 13:59 -0600
Re: A curious bit of code... Chris Angelico <rosuav@gmail.com> - 2014-02-14 07:29 +1100
Re: A curious bit of code... Tim Chase <python.list@tim.thechases.com> - 2014-02-13 14:39 -0600
Re: A curious bit of code... Emile van Sebille <emile@fenx.com> - 2014-02-13 12:55 -0800
Re: A curious bit of code... Roy Smith <roy@panix.com> - 2014-02-13 16:24 -0500
Re: A curious bit of code... Ethan Furman <ethan@stoneleaf.us> - 2014-02-13 16:23 -0800
Re: A curious bit of code... Chris Angelico <rosuav@gmail.com> - 2014-02-14 08:01 +1100
Re: A curious bit of code... Neil Cerutti <neilc@norwich.edu> - 2014-02-13 21:01 +0000
Re: A curious bit of code... Peter Otten <__peter__@web.de> - 2014-02-13 22:06 +0100
Re: A curious bit of code... Chris Angelico <rosuav@gmail.com> - 2014-02-14 08:10 +1100
Re: A curious bit of code... Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-02-13 21:14 +0000
Re: A curious bit of code... Zachary Ware <zachary.ware+pylist@gmail.com> - 2014-02-13 15:20 -0600
Re: A curious bit of code... Zachary Ware <zachary.ware+pylist@gmail.com> - 2014-02-13 15:19 -0600
Re: A curious bit of code... Emile van Sebille <emile@fenx.com> - 2014-02-13 13:23 -0800
Re: A curious bit of code... Chris Angelico <rosuav@gmail.com> - 2014-02-14 08:31 +1100
Re: A curious bit of code... Roy Smith <roy@panix.com> - 2014-02-13 16:38 -0500
Re: A curious bit of code... Zachary Ware <zachary.ware+pylist@gmail.com> - 2014-02-13 15:47 -0600
Re: A curious bit of code... Serhiy Storchaka <storchaka@gmail.com> - 2014-02-13 23:49 +0200
Re: A curious bit of code... Roy Smith <roy@panix.com> - 2014-02-13 16:51 -0500
Re: A curious bit of code... Ethan Furman <ethan@stoneleaf.us> - 2014-02-13 13:33 -0800
Re: A curious bit of code... Chris Angelico <rosuav@gmail.com> - 2014-02-14 09:13 +1100
Re: A curious bit of code... Ethan Furman <ethan@stoneleaf.us> - 2014-02-13 14:26 -0800
Re: A curious bit of code... Terry Reedy <tjreedy@udel.edu> - 2014-02-13 19:29 -0500
Re: A curious bit of code... forman.simon@gmail.com - 2014-02-13 18:45 -0800
Re: A curious bit of code... Ned Batchelder <ned@nedbatchelder.com> - 2014-02-13 22:26 -0500
Re: A curious bit of code... forman.simon@gmail.com - 2014-02-14 12:04 -0800
Re: A curious bit of code... Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-02-14 21:01 +0000
Re: A curious bit of code... Dave Angel <davea@davea.name> - 2014-02-14 07:19 -0500
Re: A curious bit of code... Roy Smith <roy@panix.com> - 2014-02-14 09:06 -0500
| From | forman.simon@gmail.com |
|---|---|
| Date | 2014-02-13 10:37 -0800 |
| Subject | A curious bit of code... |
| Message-ID | <4cc09129-43ee-4205-a24c-03f92b594abc@googlegroups.com> |
I ran across this and I thought there must be a better way of doing it, but then after further consideration I wasn't so sure.
if key[:1] + key[-1:] == '<>': ...
Some possibilities that occurred to me:
if key.startswith('<') and key.endswith('>'): ...
and:
if (key[:1], key[-1:]) == ('<', '>'): ...
I haven't run these through a profiler yet, but it seems like the original might be the fastest after all?
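Before timing anything, it may help to confirm that the three spellings agree on awkward inputs such as the empty string or a single character. The following is a minimal sketch; the function names and the sample list are made up for illustration and are not part of the code I ran across.

```python
# Hypothetical wrappers around the three spellings from the post.
def concat_check(key):
    return key[:1] + key[-1:] == '<>'

def methods_check(key):
    return key.startswith('<') and key.endswith('>')

def tuple_check(key):
    return (key[:1], key[-1:]) == ('<', '>')

# Sample keys chosen to cover empty, one-character and unbalanced cases.
samples = ['<alpha>', '<>', '<', '>', '', 'alpha', '<alpha', 'alpha>']

for key in samples:
    results = {check(key) for check in (concat_check, methods_check, tuple_check)}
    # All three spellings should give the same answer for every sample.
    assert len(results) == 1, key
    print(repr(key), results.pop())
```

On these samples the three forms behave identically, so the choice between them comes down to readability and speed rather than correctness.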
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2014-02-13 13:45 -0500 |
| Message-ID | <roy-16074E.13454313022014@news.panix.com> |
| In reply to | #66200 |
In article <4cc09129-43ee-4205-a24c-03f92b594abc@googlegroups.com>,
forman.simon@gmail.com wrote:
> I ran across this and I thought there must be a better way of doing it, but
> then after further consideration I wasn't so sure.
>
> if key[:1] + key[-1:] == '<>': ...
>
>
> Some possibilities that occurred to me:
>
> if key.startswith('<') and key.endswith('>'): ...
>
> and:
>
> if (key[:1], key[-1:]) == ('<', '>'): ...
if re.match(r'^<.*>$', key):
sheesh.
(if you care how fast it is, pre-compile the pattern)
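A minimal sketch of the pre-compiled form follows. The names `ANGLE_RE`, `STRICT_RE` and the two helper functions are hypothetical. The stricter variant is an assumption added here to show one difference from the string-method test: `.` does not match a newline by default, and `$` also matches just before a trailing newline.

```python
import re

# The pattern from the post, compiled once at import time.
ANGLE_RE = re.compile(r'^<.*>$')

# Hypothetical stricter variant: DOTALL lets '.' cross newlines and
# \Z anchors at the very end of the string.
STRICT_RE = re.compile(r'<.*>\Z', re.DOTALL)

def regex_check(key):
    return ANGLE_RE.match(key) is not None

def strict_regex_check(key):
    return STRICT_RE.match(key) is not None

for key in ['<alpha>', '', '<', '<a>\n', '<a\nb>']:
    print(repr(key), regex_check(key), strict_regex_check(key))
```

With the original pattern, `'<a>\n'` is accepted and `'<a\nb>'` is rejected, which is the opposite of what `startswith`/`endswith` would report for those two keys. Whether that matters depends on what the keys can contain.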
| From | Ethan Furman <ethan@stoneleaf.us> |
|---|---|
| Date | 2014-02-13 10:45 -0800 |
| Message-ID | <mailman.6853.1392318396.18130.python-list@python.org> |
| In reply to | #66200 |
On 02/13/2014 10:37 AM, forman.simon@gmail.com wrote:
> I ran across this and I thought there must be a better way of doing it, but then after further consideration I wasn't so sure.
>
> if key[:1] + key[-1:] == '<>': ...
>
>
> Some possibilities that occurred to me:
>
> if key.startswith('<') and key.endswith('>'): ...
>
> and:
>
> if (key[:1], key[-1:]) == ('<', '>'): ...
>
>
> I haven't run these through a profiler yet, but it seems like the original might be the fastest after all?
Unless that line of code is a bottleneck, don't worry about speed, go for readability. In which case I'd go with the
second option, then the first, and definitely avoid the third.
--
~Ethan~
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2014-02-13 19:09 +0000 |
| Message-ID | <mailman.6854.1392318616.18130.python-list@python.org> |
| In reply to | #66200 |
On 13/02/2014 18:37, forman.simon@gmail.com wrote:
> I ran across this and I thought there must be a better way of doing it, but then after further consideration I wasn't so sure.
>
> if key[:1] + key[-1:] == '<>': ...
>
>
> Some possibilities that occurred to me:
>
> if key.startswith('<') and key.endswith('>'): ...
>
> and:
>
> if (key[:1], key[-1:]) == ('<', '>'): ...
>
>
> I haven't run these through a profiler yet, but it seems like the original might be the fastest after all?
>
All I can say is that if you're worried about the speed of a single line
of code like the above then you've got problems. Having said that, I
suspect that using an index to extract a single character has to be
faster than using a slice, but I haven't run these through a profiler yet :)
--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.
Mark Lawrence
---
This email is free from viruses and malware because avast! Antivirus protection is active.
http://www.avast.com
| From | Alain Ketterlin <alain@dpt-info.u-strasbg.fr> |
|---|---|
| Date | 2014-02-13 20:05 +0100 |
| Message-ID | <87zjlu4yvk.fsf@dpt-info.u-strasbg.fr> |
| In reply to | #66200 |
forman.simon@gmail.com writes:
> I ran across this and I thought there must be a better way of doing
> it, but then after further consideration I wasn't so sure.
>
> if key[:1] + key[-1:] == '<>': ...
>
> Some possibilities that occurred to me:
>
> if key.startswith('<') and key.endswith('>'): ...
>
> and:
>
> if (key[:1], key[-1:]) == ('<', '>'): ...
I would do: if key[0] == '<' and key[-1] == '>' ...
-- Alain.
| From | Neil Cerutti <neilc@norwich.edu> |
|---|---|
| Date | 2014-02-13 19:17 +0000 |
| Message-ID | <mailman.6857.1392319077.18130.python-list@python.org> |
| In reply to | #66200 |
On 2014-02-13, forman.simon@gmail.com <forman.simon@gmail.com>
wrote:
> I ran across this and I thought there must be a better way of
> doing it, but then after further consideration I wasn't so
> sure.
>
> if key[:1] + key[-1:] == '<>': ...
>
> Some possibilities that occurred to me:
>
> if key.startswith('<') and key.endswith('>'): ...
>
> and:
>
> if (key[:1], key[-1:]) == ('<', '>'): ...
>
>
> I haven't run these through a profiler yet, but it seems like
> the original might be the fastest after all?
I think the following would occur to someone first:
if key[0] == '<' and key[-1] == '>':
...
It is wrong to avoid the obvious. Needlessly ornate or clever
code will only irritate the person who has to read it later; most
likely yourself.
--
Neil Cerutti
| From | Ethan Furman <ethan@stoneleaf.us> |
|---|---|
| Date | 2014-02-13 11:20 -0800 |
| Message-ID | <mailman.6859.1392319225.18130.python-list@python.org> |
| In reply to | #66200 |
On 02/13/2014 11:09 AM, Mark Lawrence wrote:
>
> All I can say is that if you're worried about the speed of a single line of code like the above then you've got
> problems. Having said that, I suspect that using an index to extract a single character has to be faster than using a
> slice, but I haven't run these through a profiler yet :)
The problem with using indices in the code sample is that if the string is 0 or 1 characters long you'll get an
exception instead of a False.
--
~Ethan~
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2014-02-13 14:28 -0500 |
| Message-ID | <roy-BC0641.14284113022014@news.panix.com> |
| In reply to | #66211 |
In article <mailman.6859.1392319225.18130.python-list@python.org>,
Ethan Furman <ethan@stoneleaf.us> wrote:
> On 02/13/2014 11:09 AM, Mark Lawrence wrote:
> >
> > All I can say is that if you're worried about the speed of a single line of
> > code like the above then you've got
> > problems. Having said that, I suspect that using an index to extract a
> > single character has to be faster than using a
> > slice, but I haven't run these through a profiler yet :)
>
> The problem with using indices in the code sample is that if the string is 0
> or 1 characters long you'll get an
> exception instead of a False.
My re.match() solution handles those edge cases just fine.
| From | Ethan Furman <ethan@stoneleaf.us> |
|---|---|
| Date | 2014-02-13 11:25 -0800 |
| Message-ID | <mailman.6860.1392319505.18130.python-list@python.org> |
| In reply to | #66200 |
On 02/13/2014 11:20 AM, Ethan Furman wrote:
> On 02/13/2014 11:09 AM, Mark Lawrence wrote:
>>
>> All I can say is that if you're worried about the speed of a single
>> line of code like the above then you've got problems. Having said
>> that, I suspect that using an index to extract a single character
>> has to be faster than using a slice, but I haven't run these through
>> a profiler yet :)
>
> The problem with using indices in the code sample is that if the
> string is 0 or 1 characters long you'll get an exception instead
> of a False.
Oops, make that zero characters. ;)
--
~Ethan~
| From | Neil Cerutti <neilc@norwich.edu> |
|---|---|
| Date | 2014-02-13 19:25 +0000 |
| Message-ID | <mailman.6862.1392319553.18130.python-list@python.org> |
| In reply to | #66200 |
On 2014-02-13, Ethan Furman <ethan@stoneleaf.us> wrote:
> On 02/13/2014 11:09 AM, Mark Lawrence wrote:
>> All I can say is that if you're worried about the speed of a
>> single line of code like the above then you've got problems.
>> Having said that, I suspect that using an index to extract a
>> single character has to be faster than using a slice, but I
>> haven't run these through a profiler yet :)
>
> The problem with using indices in the code sample is that if
> the string is 0 or 1 characters long you'll get an exception
> instead of a False.
There will be an exception only if it is zero-length. But good
point! That's a pretty sneaky way to avoid checking for a
zero-length string. Is it a popular idiom?
--
Neil Cerutti
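The zero-length versus one-character distinction can be checked directly. This is a small sketch; `index_check` is a hypothetical name for the indexed comparison under discussion.

```python
def index_check(key):
    # Plain indexing: fine for any non-empty string.
    return key[0] == '<' and key[-1] == '>'

print(index_check('<alpha>'))  # True
# For a one-character string, key[0] and key[-1] are the same character,
# so the test is simply False and nothing is raised.
print(index_check('<'))        # False

try:
    index_check('')
except IndexError as exc:
    # Only the empty string triggers the exception.
    print('empty string:', exc)
```

The slice forms `key[:1]` and `key[-1:]` return an empty string in that last case instead of raising, which is what makes them safe without an explicit length check.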
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2014-02-13 19:32 +0000 |
| Message-ID | <mailman.6863.1392319960.18130.python-list@python.org> |
| In reply to | #66200 |
On 13/02/2014 19:25, Neil Cerutti wrote:
> On 2014-02-13, Ethan Furman <ethan@stoneleaf.us> wrote:
>> On 02/13/2014 11:09 AM, Mark Lawrence wrote:
>>> All I can say is that if you're worried about the speed of a
>>> single line of code like the above then you've got problems.
>>> Having said that, I suspect that using an index to extract a
>>> single character has to be faster than using a slice, but I
>>> haven't run these through a profiler yet :)
>>
>> The problem with using indices in the code sample is that if
>> the string is 0 or 1 characters long you'll get an exception
>> instead of a False.
>
> There will be an exception only if it is zero-length. But good
> point! That's a pretty sneaky way to avoid checking for a
> zero-length string. Is it a popular idiom?
>
I hope not.
--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.
Mark Lawrence
| From | Peter Otten <__peter__@web.de> |
|---|---|
| Date | 2014-02-13 20:43 +0100 |
| Message-ID | <mailman.6864.1392320644.18130.python-list@python.org> |
| In reply to | #66200 |
forman.simon@gmail.com wrote:
> I ran across this and I thought there must be a better way of doing it,
> but then after further consideration I wasn't so sure.
>
> if key[:1] + key[-1:] == '<>': ...
>
>
> Some possibilities that occurred to me:
>
> if key.startswith('<') and key.endswith('>'): ...
>
> and:
>
> if (key[:1], key[-1:]) == ('<', '>'): ...
>
>
> I haven't run these through a profiler yet, but it seems like the original
> might be the fastest after all?
$ python -m timeit -s 's = "<alpha>"' 's[:1]+s[-1:] == "<>"'
1000000 loops, best of 3: 0.37 usec per loop
$ python -m timeit -s 's = "<alpha>"' 's[:1] == "<" and s[-1:] == ">"'
1000000 loops, best of 3: 0.329 usec per loop
$ python -m timeit -s 's = "<alpha>"' 's.startswith("<") and s.endswith(">")'
1000000 loops, best of 3: 0.713 usec per loop
The first is too clever for my taste.
The second is fast and easy to understand. It might attract "improvements"
replacing the slice with an index, but I trust you will catch that with your
unit tests ;)
Personally, I'm willing to spend the few extra milliseconds and use the
foolproof third.
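The same comparison can be run from a script instead of the shell. The sketch below uses the standard `timeit` module; the `best_usec` helper, the dictionary of statements and the loop counts are assumptions chosen for illustration, and the numbers it prints will differ by machine and Python version.

```python
import timeit

SETUP = 's = "<alpha>"'

# The three statements timed above, keyed by a short label.
STATEMENTS = {
    'concat': 's[:1] + s[-1:] == "<>"',
    'slices': 's[:1] == "<" and s[-1:] == ">"',
    'methods': 's.startswith("<") and s.endswith(">")',
}

def best_usec(stmt, setup=SETUP, number=100_000, repeat=3):
    """Return the best per-loop time in microseconds over `repeat` runs."""
    timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(timings) / number * 1e6

for name, stmt in STATEMENTS.items():
    print(f'{name:8s} {best_usec(stmt):.3f} usec per loop')
```

Taking the minimum of several repeats mirrors the "best of 3" figure that the command-line tool reports.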
| From | Marko Rauhamaa <marko@pacujo.net> |
|---|---|
| Date | 2014-02-13 21:56 +0200 |
| Message-ID | <87bnya7pnx.fsf@elektro.pacujo.net> |
| In reply to | #66218 |
Peter Otten <__peter__@web.de>:
> Personally, I'm willing to spend the few extra milliseconds and use
> the foolproof third.
Speaking of foolproof, what is this "key?" Is it an XML start tag,
maybe? Then, how does your test fare with, say,
<start comparison=">
">
which is equivalent to
<start comparison=">">
Marko
| From | Ethan Furman <ethan@stoneleaf.us> |
|---|---|
| Date | 2014-02-13 11:23 -0800 |
| Message-ID | <mailman.6865.1392320801.18130.python-list@python.org> |
| In reply to | #66200 |
On 02/13/2014 11:17 AM, Neil Cerutti wrote:
> On 2014-02-13, forman.simon@gmail.com <forman.simon@gmail.com>
> wrote:
>> I ran across this and I thought there must be a better way of
>> doing it, but then after further consideration I wasn't so
>> sure.
>>
>> if key[:1] + key[-1:] == '<>': ...
>>
>> Some possibilities that occurred to me:
>>
>> if key.startswith('<') and key.endswith('>'): ...
>>
>> and:
>>
>> if (key[:1], key[-1:]) == ('<', '>'): ...
>>
>>
>> I haven't run these through a profiler yet, but it seems like
>> the original might be the fastest after all?
>
> I think the following would occur to someone first:
>
> if key[0] == '<' and key[-1] == '>':
> ...
>
> It is wrong to avoid the obvious. Needlessly ornate or clever
> code will only irritate the person who has to read it later; most
> likely yourself.
Not when the obvious is wrong:
--> key = ''
--> if key[0] == '<' and key[-1] == '>':
... print "good key!"
... else:
... print "bad key"
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: string index out of range
--
~Ethan~
| From | Neil Cerutti <neilc@norwich.edu> |
|---|---|
| Date | 2014-02-13 19:51 +0000 |
| Message-ID | <mailman.6867.1392321113.18130.python-list@python.org> |
| In reply to | #66200 |
On 2014-02-13, Peter Otten <__peter__@web.de> wrote:
> forman.simon@gmail.com wrote:
> The first is too clever for my taste.
>
> The second is fast and easy to understand. It might attract
> "improvements" replacing the slice with an index, but I trust
> you will catch that with your unit tests ;)
It's easy to forget exactly why startswith and endswith even
exist.
--
Neil Cerutti
| From | Ethan Furman <ethan@stoneleaf.us> |
|---|---|
| Date | 2014-02-13 11:59 -0800 |
| Message-ID | <mailman.6868.1392321537.18130.python-list@python.org> |
| In reply to | #66200 |
On 02/13/2014 11:43 AM, Peter Otten wrote:
> forman.simon@gmail.com wrote:
>
>> I ran across this and I thought there must be a better way of doing it,
>> but then after further consideration I wasn't so sure.
>>
>> if key[:1] + key[-1:] == '<>': ...
>>
>>
>> Some possibilities that occurred to me:
>>
>> if key.startswith('<') and key.endswith('>'): ...
>>
>> and:
>>
>> if (key[:1], key[-1:]) == ('<', '>'): ...
>>
>>
>> I haven't run these through a profiler yet, but it seems like the original
>> might be the fastest after all?
>
> $ python -m timeit -s 's = "<alpha>"' 's[:1]+s[-1:] == "<>"'
> 1000000 loops, best of 3: 0.37 usec per loop
>
> $ python -m timeit -s 's = "<alpha>"' 's[:1] == "<" and s[-1:] == ">"'
> 1000000 loops, best of 3: 0.329 usec per loop
>
> $ python -m timeit -s 's = "<alpha>"' 's.startswith("<") and s.endswith(">")'
> 1000000 loops, best of 3: 0.713 usec per loop
>
> The first is too clever for my taste.
>
> The second is fast and easy to understand. It might attract "improvements"
> replacing the slice with an index, but I trust you will catch that with your
> unit tests ;)
>
> Personally, I'm willing to spend the few extra milliseconds and use the
> foolproof third.
For completeness:
# the slowest method from Peter
$ python -m timeit -s 's = "<alpha>"' 's.startswith("<") and s.endswith(">")'
1000000 loops, best of 3: 0.309 usec per loop
# the re method from Roy
$ python -m timeit -s "import re;pattern=re.compile(r'^<.*>$');s = '<alpha>'" "pattern.match(s)"
1000000 loops, best of 3: 0.466 usec per loop
--
~Ethan~
| From | Zachary Ware <zachary.ware+pylist@gmail.com> |
|---|---|
| Date | 2014-02-13 13:59 -0600 |
| Message-ID | <mailman.6870.1392322018.18130.python-list@python.org> |
| In reply to | #66200 |
On Thu, Feb 13, 2014 at 12:37 PM, <forman.simon@gmail.com> wrote:
> I ran across this and I thought there must be a better way of doing it, but then after further consideration I wasn't so sure.
>
> if key[:1] + key[-1:] == '<>': ...
>
>
> Some possibilities that occurred to me:
>
> if key.startswith('<') and key.endswith('>'): ...
>
> and:
>
> if (key[:1], key[-1:]) == ('<', '>'): ...
>
>
> I haven't run these through a profiler yet, but it seems like the original might be the fastest after all?
In a fit of curiosity, I did some timings:
'and'ed indexing:
C:\tmp>py -m timeit -s "key = '<test>'" "key[0] == '<' and key[-1] == '>'"
1000000 loops, best of 3: 0.35 usec per loop
C:\tmp>py -m timeit -s "key = '<test'" "key[0] == '<' and key[-1] == '>'"
1000000 loops, best of 3: 0.398 usec per loop
C:\tmp>py -m timeit -s "key = 'test>'" "key[0] == '<' and key[-1] == '>'"
1000000 loops, best of 3: 0.188 usec per loop
C:\tmp>py -m timeit -s "key = 'test'" "key[0] == '<' and key[-1] == '>'"
10000000 loops, best of 3: 0.211 usec per loop
C:\tmp>py -m timeit -s "key = ''" "key[0] == '<' and key[-1] == '>'"
Traceback (most recent call last):
File "P:\Python34\lib\timeit.py", line 292, in main
x = t.timeit(number)
File "P:\Python34\lib\timeit.py", line 178, in timeit
timing = self.inner(it, self.timer)
File "<timeit-src>", line 6, in inner
key[0] == '<' and key[-1] == '>'
IndexError: string index out of range
Slice concatenation:
C:\tmp>py -m timeit -s "key = '<test>'" "key[:1] + key[-1:] == '<>'"
1000000 loops, best of 3: 0.649 usec per loop
C:\tmp>py -m timeit -s "key = '<test'" "key[:1] + key[-1:] == '<>'"
1000000 loops, best of 3: 0.7 usec per loop
C:\tmp>py -m timeit -s "key = 'test>'" "key[:1] + key[-1:] == '<>'"
1000000 loops, best of 3: 0.663 usec per loop
C:\tmp>py -m timeit -s "key = 'test'" "key[:1] + key[-1:] == '<>'"
1000000 loops, best of 3: 0.665 usec per loop
C:\tmp>py -m timeit -s "key = ''" "key[:1] + key[-1:] == '<>'"
1000000 loops, best of 3: 0.456 usec per loop
String methods:
C:\tmp>py -m timeit -s "key = '<test>'" "key.startswith('<') and key.endswith('>')"
1000000 loops, best of 3: 1.03 usec per loop
C:\tmp>py -m timeit -s "key = '<test'" "key.startswith('<') and key.endswith('>')"
1000000 loops, best of 3: 1.02 usec per loop
C:\tmp>py -m timeit -s "key = 'test>'" "key.startswith('<') and key.endswith('>')"
1000000 loops, best of 3: 0.504 usec per loop
C:\tmp>py -m timeit -s "key = 'test'" "key.startswith('<') and key.endswith('>')"
1000000 loops, best of 3: 0.502 usec per loop
C:\tmp>py -m timeit -s "key = ''" "key.startswith('<') and key.endswith('>')"
1000000 loops, best of 3: 0.49 usec per loop
Tuple comparison:
C:\tmp>py -m timeit -s "key = '<test>'" "(key[:1], key[-1:]) == ('<', '>')"
1000000 loops, best of 3: 0.629 usec per loop
C:\tmp>py -m timeit -s "key = '<test'" "(key[:1], key[-1:]) == ('<', '>')"
1000000 loops, best of 3: 0.689 usec per loop
C:\tmp>py -m timeit -s "key = 'test>'" "(key[:1], key[-1:]) == ('<', '>')"
1000000 loops, best of 3: 0.676 usec per loop
C:\tmp>py -m timeit -s "key = 'test'" "(key[:1], key[-1:]) == ('<', '>')"
1000000 loops, best of 3: 0.675 usec per loop
C:\tmp>py -m timeit -s "key = ''" "(key[:1], key[-1:]) == ('<', '>')"
1000000 loops, best of 3: 0.608 usec per loop
re.match():
C:\tmp>py -m timeit -s "import re;key = '<test>'" "re.match(r'^<.*>$', key)"
100000 loops, best of 3: 3.39 usec per loop
C:\tmp>py -m timeit -s "import re;key = '<test'" "re.match(r'^<.*>$', key)"
100000 loops, best of 3: 3.27 usec per loop
C:\tmp>py -m timeit -s "import re;key = 'test>'" "re.match(r'^<.*>$', key)"
100000 loops, best of 3: 2.94 usec per loop
C:\tmp>py -m timeit -s "import re;key = 'test'" "re.match(r'^<.*>$', key)"
100000 loops, best of 3: 2.97 usec per loop
C:\tmp>py -m timeit -s "import re;key = ''" "re.match(r'^<.*>$', key)"
100000 loops, best of 3: 2.97 usec per loop
Pre-compiled re:
C:\tmp>py -m timeit -s "import re;r = re.compile(r'^<.*>$');key = '<test>'" "r.match(key)"
1000000 loops, best of 3: 0.932 usec per loop
C:\tmp>py -m timeit -s "import re;r = re.compile(r'^<.*>$');key = '<test'" "r.match(key)"
1000000 loops, best of 3: 0.79 usec per loop
C:\tmp>py -m timeit -s "import re;r = re.compile(r'^<.*>$');key = 'test>'" "r.match(key)"
1000000 loops, best of 3: 0.718 usec per loop
C:\tmp>py -m timeit -s "import re;r = re.compile(r'^<.*>$');key = 'test'" "r.match(key)"
1000000 loops, best of 3: 0.755 usec per loop
C:\tmp>py -m timeit -s "import re;r = re.compile(r'^<.*>$');key = ''" "r.match(key)"
1000000 loops, best of 3: 0.731 usec per loop
Pre-compiled re with pre-fetched method:
C:\tmp>py -m timeit -s "import re;m = re.compile(r'^<.*>$').match;key = '<test>'" "m(key)"
1000000 loops, best of 3: 0.777 usec per loop
C:\tmp>py -m timeit -s "import re;m = re.compile(r'^<.*>$').match;key = '<test'" "m(key)"
1000000 loops, best of 3: 0.65 usec per loop
C:\tmp>py -m timeit -s "import re;m = re.compile(r'^<.*>$').match;key = 'test>'" "m(key)"
1000000 loops, best of 3: 0.652 usec per loop
C:\tmp>py -m timeit -s "import re;m = re.compile(r'^<.*>$').match;key = 'test'" "m(key)"
1000000 loops, best of 3: 0.576 usec per loop
C:\tmp>py -m timeit -s "import re;m = re.compile(r'^<.*>$').match;key = ''" "m(key)"
1000000 loops, best of 3: 0.58 usec per loop
And the winner is:
C:\tmp>py -m timeit -s "key = '<test>'" "key and key[0] == '<' and key[-1] == '>'"
1000000 loops, best of 3: 0.388 usec per loop
C:\tmp>py -m timeit -s "key = '<test'" "key and key[0] == '<' and key[-1] == '>'"
1000000 loops, best of 3: 0.413 usec per loop
C:\tmp>py -m timeit -s "key = 'test>'" "key and key[0] == '<' and key[-1] == '>'"
1000000 loops, best of 3: 0.219 usec per loop
C:\tmp>py -m timeit -s "key = 'test'" "key and key[0] == '<' and key[-1] == '>'"
1000000 loops, best of 3: 0.215 usec per loop
C:\tmp>py -m timeit -s "key = ''" "key and key[0] == '<' and key[-1] == '>'"
10000000 loops, best of 3: 0.0481 usec per loop
So, the moral of the story? Use short-circuit logic wherever you can,
don't use re for simple stuff (because while it may be very fast, it's
dominated by attribute lookup and function call overhead), and unless
you expect to be doing this test many many millions of times in a very
short space of time, go for readability over performance.
--
Zach
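The short-circuit "winner" is easy to wrap in a helper. This is only a sketch, and `is_bracketed` is a made-up name. One detail worth knowing: the bare expression `key and ...` evaluates to `''` rather than `False` for an empty key, so the sketch wraps the first operand in `bool()` to always return a real boolean.

```python
def is_bracketed(key):
    # bool(key) short-circuits on the empty string, so the indexing
    # on the right is only reached when at least one character exists.
    return bool(key) and key[0] == '<' and key[-1] == '>'

for key in ['<test>', '<test', 'test>', 'test', '<', '']:
    print(repr(key), is_bracketed(key))

# Without bool(), the empty string itself falls out of the expression.
key = ''
print(repr(key and key[0] == '<' and key[-1] == '>'))  # ''
```

In an `if` statement the difference is invisible, since `''` is falsy, but it can surprise callers that store or compare the result.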
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2014-02-14 07:29 +1100 |
| Message-ID | <mailman.6875.1392323389.18130.python-list@python.org> |
| In reply to | #66200 |
On Fri, Feb 14, 2014 at 6:32 AM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
>> There will be an exception only if it is zero-length. But good
>> point! That's a pretty sneaky way to avoid checking for a
>> zero-length string. Is it a popular idiom?
>>
>
> I hope not.
The use of slicing rather than indexing to avoid problems when the
string's too short? I don't know about popular, but I've certainly
used it a good bit. For the specific case of string comparisons you
can use startswith/endswith, but slicing works with other types as
well.
Also worth noting:
Python 2.7.4 (default, Apr 6 2013, 19:54:46) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> s1,s2=b"asdf",u"asdf"
>>> s1[:1],s2[:1]
('a', u'a')
>>> s1[0],s2[0]
('a', u'a')
Python 3.4.0b2 (v3.4.0b2:ba32913eb13e, Jan 5 2014, 16:23:43) [MSC v.1600 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> s1,s2=b"asdf",u"asdf"
>>> s1[:1],s2[:1]
(b'a', 'a')
>>> s1[0],s2[0]
(97, 'a')
When you slice, you get back the same type as you started with. (Also
true of lists, tuples, and probably everything else that can be
sliced.) When you index, you might not; strings are a special case
(since Python lacks a "character" type), and if your code has to run
on Py2 and Py3, byte strings stop being that special case in Py3. So
if you're working with a byte string, it might be worth slicing rather
than indexing. (Though you can still use startswith/endswith, if they
suit your purpose.)
ChrisA
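Applied to the angle-bracket test, the byte-string point looks roughly like this. The sketch assumes Python 3, and `is_bracketed_bytes` is a hypothetical helper name.

```python
data = b'<asdf>'

# On Python 3, indexing a bytes object yields an int, slicing yields bytes.
print(data[0])    # 60, the code point of '<'
print(data[:1])   # b'<'

# So an indexed comparison against a one-byte literal is always False...
print(data[0] == b'<')   # False

def is_bracketed_bytes(value):
    # ...while the sliced comparison behaves the same way it does for str.
    return value[:1] == b'<' and value[-1:] == b'>'

print(is_bracketed_bytes(data))   # True
print(is_bracketed_bytes(b''))    # False

# startswith/endswith also work on bytes, given bytes arguments.
print(data.startswith(b'<') and data.endswith(b'>'))   # True
```

The indexed form fails silently here rather than raising, which is arguably worse than the `IndexError` on an empty string.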
| From | Tim Chase <python.list@tim.thechases.com> |
|---|---|
| Date | 2014-02-13 14:39 -0600 |
| Message-ID | <mailman.6877.1392323932.18130.python-list@python.org> |
| In reply to | #66200 |
On 2014-02-13 10:37, forman.simon@gmail.com wrote:
> I ran across this and I thought there must be a better way of doing
> it, but then after further consideration I wasn't so sure.
>
> Some possibilities that occurred to me:
>
> if key.startswith('<') and key.endswith('>'): ...
This is my favorite because it doesn't break on the empty string like
some of your alternatives. Your k[0] and k[-1] assume there's at
least one character in the string, otherwise an IndexError is raised.
-tkc
| From | Emile van Sebille <emile@fenx.com> |
|---|---|
| Date | 2014-02-13 12:55 -0800 |
| Message-ID | <mailman.6880.1392324956.18130.python-list@python.org> |
| In reply to | #66200 |
On 2/13/2014 11:59 AM, Zachary Ware wrote:
> In a fit of curiosity, I did some timings:
Snip of lots of TMTOWTDT/TIMTOWTDI/whatever... timed examples :)
But I didn't see this one:
s[::len(s)-1]
Emile
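For the curious, here is a small sketch of how that extended slice behaves on short inputs. The name `stride_check` is made up for illustration.

```python
def stride_check(s):
    # A step of len(s) - 1 picks out the first and last characters.
    return s[::len(s) - 1] == '<>'

print(stride_check('<test>'))  # True
print(stride_check('<>'))      # True: step 1 returns the whole string
print(stride_check(''))        # False: step -1 on '' gives ''

try:
    stride_check('<')
except ValueError as exc:
    # A one-character string makes the step zero, which slicing rejects.
    print('one-character string:', exc)
```

So this spelling moves the failure case from the empty string to the one-character string, and raises `ValueError` instead of `IndexError`.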