Groups > comp.lang.python > #66200 > unrolled thread

A curious bit of code...

Started by: forman.simon@gmail.com
First post: 2014-02-13 10:37 -0800
Last post: 2014-02-14 09:06 -0500
Articles: 20 on this page of 45; 16 participants

Contents

  A curious bit of code... forman.simon@gmail.com - 2014-02-13 10:37 -0800
    Re: A curious bit of code... Roy Smith <roy@panix.com> - 2014-02-13 13:45 -0500
    Re: A curious bit of code... Ethan Furman <ethan@stoneleaf.us> - 2014-02-13 10:45 -0800
    Re: A curious bit of code... Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-02-13 19:09 +0000
    Re: A curious bit of code... Alain Ketterlin <alain@dpt-info.u-strasbg.fr> - 2014-02-13 20:05 +0100
    Re: A curious bit of code... Neil Cerutti <neilc@norwich.edu> - 2014-02-13 19:17 +0000
    Re: A curious bit of code... Ethan Furman <ethan@stoneleaf.us> - 2014-02-13 11:20 -0800
      Re: A curious bit of code... Roy Smith <roy@panix.com> - 2014-02-13 14:28 -0500
    Re: A curious bit of code... Ethan Furman <ethan@stoneleaf.us> - 2014-02-13 11:25 -0800
    Re: A curious bit of code... Neil Cerutti <neilc@norwich.edu> - 2014-02-13 19:25 +0000
    Re: A curious bit of code... Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-02-13 19:32 +0000
    Re: A curious bit of code... Peter Otten <__peter__@web.de> - 2014-02-13 20:43 +0100
      Re: A curious bit of code... Marko Rauhamaa <marko@pacujo.net> - 2014-02-13 21:56 +0200
    Re: A curious bit of code... Ethan Furman <ethan@stoneleaf.us> - 2014-02-13 11:23 -0800
    Re: A curious bit of code... Neil Cerutti <neilc@norwich.edu> - 2014-02-13 19:51 +0000
    Re: A curious bit of code... Ethan Furman <ethan@stoneleaf.us> - 2014-02-13 11:59 -0800
    Re: A curious bit of code... Zachary Ware <zachary.ware+pylist@gmail.com> - 2014-02-13 13:59 -0600
    Re: A curious bit of code... Chris Angelico <rosuav@gmail.com> - 2014-02-14 07:29 +1100
    Re: A curious bit of code... Tim Chase <python.list@tim.thechases.com> - 2014-02-13 14:39 -0600
    Re: A curious bit of code... Emile van Sebille <emile@fenx.com> - 2014-02-13 12:55 -0800
      Re: A curious bit of code... Roy Smith <roy@panix.com> - 2014-02-13 16:24 -0500
        Re: A curious bit of code... Ethan Furman <ethan@stoneleaf.us> - 2014-02-13 16:23 -0800
    Re: A curious bit of code... Chris Angelico <rosuav@gmail.com> - 2014-02-14 08:01 +1100
    Re: A curious bit of code... Neil Cerutti <neilc@norwich.edu> - 2014-02-13 21:01 +0000
    Re: A curious bit of code... Peter Otten <__peter__@web.de> - 2014-02-13 22:06 +0100
    Re: A curious bit of code... Chris Angelico <rosuav@gmail.com> - 2014-02-14 08:10 +1100
    Re: A curious bit of code... Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-02-13 21:14 +0000
    Re: A curious bit of code... Zachary Ware <zachary.ware+pylist@gmail.com> - 2014-02-13 15:20 -0600
    Re: A curious bit of code... Zachary Ware <zachary.ware+pylist@gmail.com> - 2014-02-13 15:19 -0600
    Re: A curious bit of code... Emile van Sebille <emile@fenx.com> - 2014-02-13 13:23 -0800
    Re: A curious bit of code... Chris Angelico <rosuav@gmail.com> - 2014-02-14 08:31 +1100
      Re: A curious bit of code... Roy Smith <roy@panix.com> - 2014-02-13 16:38 -0500
    Re: A curious bit of code... Zachary Ware <zachary.ware+pylist@gmail.com> - 2014-02-13 15:47 -0600
    Re: A curious bit of code... Serhiy Storchaka <storchaka@gmail.com> - 2014-02-13 23:49 +0200
      Re: A curious bit of code... Roy Smith <roy@panix.com> - 2014-02-13 16:51 -0500
    Re: A curious bit of code... Ethan Furman <ethan@stoneleaf.us> - 2014-02-13 13:33 -0800
    Re: A curious bit of code... Chris Angelico <rosuav@gmail.com> - 2014-02-14 09:13 +1100
    Re: A curious bit of code... Ethan Furman <ethan@stoneleaf.us> - 2014-02-13 14:26 -0800
    Re: A curious bit of code... Terry Reedy <tjreedy@udel.edu> - 2014-02-13 19:29 -0500
    Re: A curious bit of code... forman.simon@gmail.com - 2014-02-13 18:45 -0800
      Re: A curious bit of code... Ned Batchelder <ned@nedbatchelder.com> - 2014-02-13 22:26 -0500
        Re: A curious bit of code... forman.simon@gmail.com - 2014-02-14 12:04 -0800
          Re: A curious bit of code... Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-02-14 21:01 +0000
    Re: A curious bit of code... Dave Angel <davea@davea.name> - 2014-02-14 07:19 -0500
      Re: A curious bit of code... Roy Smith <roy@panix.com> - 2014-02-14 09:06 -0500



#66200 — A curious bit of code...

From: forman.simon@gmail.com
Date: 2014-02-13 10:37 -0800
Subject: A curious bit of code...
Message-ID: <4cc09129-43ee-4205-a24c-03f92b594abc@googlegroups.com>
I ran across this and I thought there must be a better way of doing it, but then after further consideration I wasn't so sure.

  if key[:1] + key[-1:] == '<>': ...


Some possibilities that occurred to me:

  if key.startswith('<') and key.endswith('>'): ...

and:

  if (key[:1], key[-1:]) == ('<', '>'): ...


I haven't run these through a profiler yet, but it seems like the original might be the fastest after all?
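
A minimal sketch, not part of the original post, that runs the three variants over a few edge cases. The function names and the sample list are hypothetical and exist only for illustration. Because slices and `startswith`/`endswith` never raise on short strings, the three forms should agree on every input here.

```python
def concat_check(key):
    # Original form: slices never raise, even on an empty string.
    return key[:1] + key[-1:] == '<>'

def methods_check(key):
    # String-method form.
    return key.startswith('<') and key.endswith('>')

def tuple_check(key):
    # Tuple-comparison form.
    return (key[:1], key[-1:]) == ('<', '>')

# Hypothetical sample keys, including empty and one-character strings.
SAMPLES = ['<alpha>', '<>', '<', '>', '', 'alpha', '<alpha', 'alpha>']

for key in SAMPLES:
    results = (concat_check(key), methods_check(key), tuple_check(key))
    print(repr(key), results)
```

The choice between them is therefore about readability and speed, not about correctness on short strings.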



#66201

From: Roy Smith <roy@panix.com>
Date: 2014-02-13 13:45 -0500
Message-ID: <roy-16074E.13454313022014@news.panix.com>
In reply to: #66200
In article <4cc09129-43ee-4205-a24c-03f92b594abc@googlegroups.com>,
 forman.simon@gmail.com wrote:

> I ran across this and I thought there must be a better way of doing it, but 
> then after further consideration I wasn't so sure.
> 
>   if key[:1] + key[-1:] == '<>': ...
> 
> 
> Some possibilities that occurred to me:
> 
>   if key.startswith('<') and key.endswith('>'): ...
> 
> and:
> 
>   if (key[:1], key[-1:]) == ('<', '>'): ...

if re.match(r'^<.*>$', key):

sheesh.

(if you care how fast it is, pre-compile the pattern)
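
A short sketch of the pre-compiled variant, assuming Python 3. The names `ANGLE_RE` and `regex_check` are hypothetical. It also shows two inputs where the regex may not behave like the slice-based tests, because of how `$` and `.` treat newlines.

```python
import re

# Compiled once at import time, as the post suggests.
ANGLE_RE = re.compile(r'^<.*>$')

def regex_check(key):
    # match() returns a match object or None; convert to a plain bool.
    return ANGLE_RE.match(key) is not None

print(regex_check('<alpha>'))    # True
print(regex_check(''))           # False
print(regex_check('<'))          # False
print(regex_check('<alpha>\n'))  # True: '$' also matches just before a trailing newline
print(regex_check('<al\npha>'))  # False: '.' does not match a newline without re.DOTALL
```

If the regex has to agree with the slice-based checks on such inputs, `re.fullmatch(r'<.*>', key, re.DOTALL)` (Python 3.4+) is one option.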



#66204

From: Ethan Furman <ethan@stoneleaf.us>
Date: 2014-02-13 10:45 -0800
Message-ID: <mailman.6853.1392318396.18130.python-list@python.org>
In reply to: #66200
On 02/13/2014 10:37 AM, forman.simon@gmail.com wrote:
> I ran across this and I thought there must be a better way of doing it, but then after further consideration I wasn't so sure.
>
>    if key[:1] + key[-1:] == '<>': ...
>
>
> Some possibilities that occurred to me:
>
>    if key.startswith('<') and key.endswith('>'): ...
>
> and:
>
>    if (key[:1], key[-1:]) == ('<', '>'): ...
>
>
> I haven't run these through a profiler yet, but it seems like the original might be the fastest after all?

Unless that line of code is a bottleneck, don't worry about speed, go for readability.  In which case I'd go with the 
second option, then the first, and definitely avoid the third.

--
~Ethan~



#66205

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2014-02-13 19:09 +0000
Message-ID: <mailman.6854.1392318616.18130.python-list@python.org>
In reply to: #66200
On 13/02/2014 18:37, forman.simon@gmail.com wrote:
> I ran across this and I thought there must be a better way of doing it, but then after further consideration I wasn't so sure.
>
>    if key[:1] + key[-1:] == '<>': ...
>
>
> Some possibilities that occurred to me:
>
>    if key.startswith('<') and key.endswith('>'): ...
>
> and:
>
>    if (key[:1], key[-1:]) == ('<', '>'): ...
>
>
> I haven't run these through a profiler yet, but it seems like the original might be the fastest after all?
>

All I can say is that if you're worried about the speed of a single line 
of code like the above then you've got problems.  Having said that, I 
suspect that using an index to extract a single character has to be 
faster than using a slice, but I haven't run these through a profiler yet :)

-- 
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

---
This email is free from viruses and malware because avast! Antivirus protection is active.
http://www.avast.com



#66208

From: Alain Ketterlin <alain@dpt-info.u-strasbg.fr>
Date: 2014-02-13 20:05 +0100
Message-ID: <87zjlu4yvk.fsf@dpt-info.u-strasbg.fr>
In reply to: #66200
forman.simon@gmail.com writes:

> I ran across this and I thought there must be a better way of doing
> it, but then after further consideration I wasn't so sure.
>
>   if key[:1] + key[-1:] == '<>': ...
>
> Some possibilities that occurred to me:
>
>   if key.startswith('<') and key.endswith('>'): ...
>
> and:
>
>   if (key[:1], key[-1:]) == ('<', '>'): ...

I would do: if key[0] == '<' and key[-1] == '>' ...

-- Alain.



#66209

From: Neil Cerutti <neilc@norwich.edu>
Date: 2014-02-13 19:17 +0000
Message-ID: <mailman.6857.1392319077.18130.python-list@python.org>
In reply to: #66200
On 2014-02-13, forman.simon@gmail.com <forman.simon@gmail.com>
wrote:
> I ran across this and I thought there must be a better way of
> doing it, but then after further consideration I wasn't so
> sure.
>
>   if key[:1] + key[-1:] == '<>': ...
>
> Some possibilities that occurred to me:
>
>   if key.startswith('<') and key.endswith('>'): ...
>
> and:
>
>   if (key[:1], key[-1:]) == ('<', '>'): ...
>
>
> I haven't run these through a profiler yet, but it seems like
> the original might be the fastest after all?

I think the following would occur to someone first:

if key[0] == '<' and key[-1] == '>':
    ...

It is wrong to avoid the obvious. Needlessly ornate or clever
code will only irritate the person who has to read it later; most
likely yourself.

-- 
Neil Cerutti



#66211

From: Ethan Furman <ethan@stoneleaf.us>
Date: 2014-02-13 11:20 -0800
Message-ID: <mailman.6859.1392319225.18130.python-list@python.org>
In reply to: #66200
On 02/13/2014 11:09 AM, Mark Lawrence wrote:
>
> All I can say is that if you're worried about the speed of a single line of code like the above then you've got
> problems.  Having said that, I suspect that using an index to extract a single character has to be faster than using a
> slice, but I haven't run these through a profiler yet :)

The problem with using indices in the code sample is that if the string is 0 or 1 characters long you'll get an 
exception instead of a False.

--
~Ethan~



#66216

From: Roy Smith <roy@panix.com>
Date: 2014-02-13 14:28 -0500
Message-ID: <roy-BC0641.14284113022014@news.panix.com>
In reply to: #66211
In article <mailman.6859.1392319225.18130.python-list@python.org>,
 Ethan Furman <ethan@stoneleaf.us> wrote:

> On 02/13/2014 11:09 AM, Mark Lawrence wrote:
> >
> > All I can say is that if you're worried about the speed of a single line of 
> > code like the above then you've got
> > problems.  Having said that, I suspect that using an index to extract a 
> > single character has to be faster than using a
> > slice, but I haven't run these through a profiler yet :)
> 
> The problem with using indices in the code sample is that if the string is 0 
> or 1 characters long you'll get an 
> exception instead of a False.

My re.match() solution handles those edge cases just fine.



#66212

From: Ethan Furman <ethan@stoneleaf.us>
Date: 2014-02-13 11:25 -0800
Message-ID: <mailman.6860.1392319505.18130.python-list@python.org>
In reply to: #66200
On 02/13/2014 11:20 AM, Ethan Furman wrote:
> On 02/13/2014 11:09 AM, Mark Lawrence wrote:
>>
>> All I can say is that if you're worried about the speed of a single
>> line of code like the above then you've got problems.  Having said
>>  that, I suspect that using an index to extract a single character
>> has to be faster than using a slice, but I haven't run these through
>>  a profiler yet :)
>
> The problem with using indices in the code sample is that if the
> string is 0 or 1 characters long you'll get an exception instead
>  of a False.

Oops, make that zero characters.  ;)

--
~Ethan~



#66214

From: Neil Cerutti <neilc@norwich.edu>
Date: 2014-02-13 19:25 +0000
Message-ID: <mailman.6862.1392319553.18130.python-list@python.org>
In reply to: #66200
On 2014-02-13, Ethan Furman <ethan@stoneleaf.us> wrote:
> On 02/13/2014 11:09 AM, Mark Lawrence wrote:
>> All I can say is that if you're worried about the speed of a
>> single line of code like the above then you've got problems.
>> Having said that, I suspect that using an index to extract a
>> single character has to be faster than using a slice, but I
>> haven't run these through a profiler yet :)
>
> The problem with using indices in the code sample is that if
> the string is 0 or 1 characters long you'll get an exception
> instead of a False.

There will be an exception only if it is zero-length. But good
point! That's a pretty sneaky way to avoid checking for a
zero-length string. Is it a popular idiom?

-- 
Neil Cerutti
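
To make the difference concrete, here is a small sketch (not from the thread; the helper names are hypothetical) contrasting indexing and slicing on short strings.

```python
def first_char_by_index(key):
    # Indexing raises IndexError when key is empty.
    return key[0]

def first_char_by_slice(key):
    # Slicing returns an empty string when key is empty.
    return key[:1]

print(repr(first_char_by_slice('')))   # ''
print(repr(first_char_by_slice('<')))  # '<'

try:
    first_char_by_index('')
except IndexError as exc:
    print('IndexError:', exc)

# A one-character string is safe with indexing: key[0] and key[-1]
# both refer to the same character, so no exception is raised.
key = '<'
print(key[0] == '<' and key[-1] == '>')  # False
```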



#66217

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2014-02-13 19:32 +0000
Message-ID: <mailman.6863.1392319960.18130.python-list@python.org>
In reply to: #66200
On 13/02/2014 19:25, Neil Cerutti wrote:
> On 2014-02-13, Ethan Furman <ethan@stoneleaf.us> wrote:
>> On 02/13/2014 11:09 AM, Mark Lawrence wrote:
>>> All I can say is that if you're worried about the speed of a
>>> single line of code like the above then you've got problems.
>>> Having said that, I suspect that using an index to extract a
>>> single character has to be faster than using a slice, but I
>>> haven't run these through a profiler yet :)
>>
>> The problem with using indices in the code sample is that if
>> the string is 0 or 1 characters long you'll get an exception
>> instead of a False.
>
> There will be an exception only if it is zero-length. But good
> point! That's a pretty sneaky way to avoid checking for a
> zero-length string. Is it a popular idiom?
>

I hope not.

-- 
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence



#66218

From: Peter Otten <__peter__@web.de>
Date: 2014-02-13 20:43 +0100
Message-ID: <mailman.6864.1392320644.18130.python-list@python.org>
In reply to: #66200
forman.simon@gmail.com wrote:

> I ran across this and I thought there must be a better way of doing it,
> but then after further consideration I wasn't so sure.
> 
>   if key[:1] + key[-1:] == '<>': ...
> 
> 
> Some possibilities that occurred to me:
> 
>   if key.startswith('<') and key.endswith('>'): ...
> 
> and:
> 
>   if (key[:1], key[-1:]) == ('<', '>'): ...
> 
> 
> I haven't run these through a profiler yet, but it seems like the original
> might be the fastest after all?

$ python -m timeit -s 's = "<alpha>"' 's[:1]+s[-1:] == "<>"'
1000000 loops, best of 3: 0.37 usec per loop

$ python -m timeit -s 's = "<alpha>"' 's[:1] == "<" and s[-1:] == ">"'
1000000 loops, best of 3: 0.329 usec per loop

$ python -m timeit -s 's = "<alpha>"' 's.startswith("<") and s.endswith(">")'
1000000 loops, best of 3: 0.713 usec per loop

The first is too clever for my taste.

The second is fast and easy to understand. It might attract "improvements" 
replacing the slice with an index, but I trust you will catch that with your 
unit tests ;)

Personally, I'm willing to spend the few extra milliseconds and use the 
foolproof third.
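
The same comparison can be run from a script instead of the shell. The following is only a sketch, assuming Python 3.6+; `best_time` is a hypothetical helper, and the numbers it prints will vary by machine and interpreter version, so none are shown here.

```python
import timeit

SETUP = 's = "<alpha>"'
STATEMENTS = [
    's[:1] + s[-1:] == "<>"',
    's[:1] == "<" and s[-1:] == ">"',
    's.startswith("<") and s.endswith(">")',
]

def best_time(stmt, number=100_000, repeat=3):
    # Take the minimum of several repeats, as the timeit CLI does for
    # "best of 3", and convert the total to seconds per loop.
    totals = timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat)
    return min(totals) / number

for stmt in STATEMENTS:
    print(f'{best_time(stmt) * 1e6:.3f} usec per loop  {stmt}')
```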



#66224

From: Marko Rauhamaa <marko@pacujo.net>
Date: 2014-02-13 21:56 +0200
Message-ID: <87bnya7pnx.fsf@elektro.pacujo.net>
In reply to: #66218
Peter Otten <__peter__@web.de>:

> Personally, I'm willing to spend the few extra milliseconds and use
> the foolproof third.

Speaking of foolproof, what is this "key?" Is it an XML start tag,
maybe? Then, how does your test fare with, say,

    <start comparison=">
    ">

which is equivalent to

    <start comparison="&gt;">


Marko



#66219

From: Ethan Furman <ethan@stoneleaf.us>
Date: 2014-02-13 11:23 -0800
Message-ID: <mailman.6865.1392320801.18130.python-list@python.org>
In reply to: #66200
On 02/13/2014 11:17 AM, Neil Cerutti wrote:
> On 2014-02-13, forman.simon@gmail.com <forman.simon@gmail.com>
> wrote:
>> I ran across this and I thought there must be a better way of
>> doing it, but then after further consideration I wasn't so
>> sure.
>>
>>    if key[:1] + key[-1:] == '<>': ...
>>
>> Some possibilities that occurred to me:
>>
>>    if key.startswith('<') and key.endswith('>'): ...
>>
>> and:
>>
>>    if (key[:1], key[-1:]) == ('<', '>'): ...
>>
>>
>> I haven't run these through a profiler yet, but it seems like
>> the original might be the fastest after all?
>
> I think the following would occur to someone first:
>
> if key[0] == '<' and key[-1] == '>':
>      ...
>
> It is wrong to avoid the obvious. Needlessly ornate or clever
> code will only irritate the person who has to read it later; most
> likely yourself.

Not when the obvious is wrong:

--> key = ''
--> if key[0] == '<' and key[-1] == '>':
...   print "good key!"
... else:
...   print "bad key"
...
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
IndexError: string index out of range

--
~Ethan~



#66223

From: Neil Cerutti <neilc@norwich.edu>
Date: 2014-02-13 19:51 +0000
Message-ID: <mailman.6867.1392321113.18130.python-list@python.org>
In reply to: #66200
On 2014-02-13, Peter Otten <__peter__@web.de> wrote:
> forman.simon@gmail.com wrote:
> The first is too clever for my taste.
>
> The second is fast and easy to understand. It might attract
> "improvements" replacing the slice with an index, but I trust
> you will catch that with your unit tests ;)

It's easy to forget exactly why startswith and endswith even exist.

-- 
Neil Cerutti



#66225

From: Ethan Furman <ethan@stoneleaf.us>
Date: 2014-02-13 11:59 -0800
Message-ID: <mailman.6868.1392321537.18130.python-list@python.org>
In reply to: #66200
On 02/13/2014 11:43 AM, Peter Otten wrote:
> forman.simon@gmail.com wrote:
>
>> I ran across this and I thought there must be a better way of doing it,
>> but then after further consideration I wasn't so sure.
>>
>>    if key[:1] + key[-1:] == '<>': ...
>>
>>
>> Some possibilities that occurred to me:
>>
>>    if key.startswith('<') and key.endswith('>'): ...
>>
>> and:
>>
>>    if (key[:1], key[-1:]) == ('<', '>'): ...
>>
>>
>> I haven't run these through a profiler yet, but it seems like the original
>> might be the fastest after all?
>
> $ python -m timeit -s 's = "<alpha>"' 's[:1]+s[-1:] == "<>"'
> 1000000 loops, best of 3: 0.37 usec per loop
>
> $ python -m timeit -s 's = "<alpha>"' 's[:1] == "<" and s[-1:] == ">"'
> 1000000 loops, best of 3: 0.329 usec per loop
>
> $ python -m timeit -s 's = "<alpha>"' 's.startswith("<") and s.endswith(">")'
> 1000000 loops, best of 3: 0.713 usec per loop
>
> The first is too clever for my taste.
>
> The second is fast and easy to understand. It might attract "improvements"
> replacing the slice with an index, but I trust you will catch that with your
> unit tests ;)
>
> Personally, I'm willing to spend the few extra milliseconds and use the
> foolproof third.

For completeness:

# the slowest method from Peter
$ python -m timeit -s 's = "<alpha>"' 's.startswith("<") and s.endswith(">")'
1000000 loops, best of 3: 0.309 usec per loop

# the re method from Roy
$ python -m timeit -s "import re;pattern=re.compile(r'^<.*>$');s = '<alpha>'" "pattern.match(s)"
1000000 loops, best of 3: 0.466 usec per loop

--
~Ethan~



#66228

From: Zachary Ware <zachary.ware+pylist@gmail.com>
Date: 2014-02-13 13:59 -0600
Message-ID: <mailman.6870.1392322018.18130.python-list@python.org>
In reply to: #66200
On Thu, Feb 13, 2014 at 12:37 PM,  <forman.simon@gmail.com> wrote:
> I ran across this and I thought there must be a better way of doing it, but then after further consideration I wasn't so sure.
>
>   if key[:1] + key[-1:] == '<>': ...
>
>
> Some possibilities that occurred to me:
>
>   if key.startswith('<') and key.endswith('>'): ...
>
> and:
>
>   if (key[:1], key[-1:]) == ('<', '>'): ...
>
>
> I haven't run these through a profiler yet, but it seems like the original might be the fastest after all?

In a fit of curiosity, I did some timings:

'and'ed indexing:

C:\tmp>py -m timeit -s "key = '<test>'" "key[0] == '<' and key[-1] == '>'"
1000000 loops, best of 3: 0.35 usec per loop

C:\tmp>py -m timeit -s "key = '<test'" "key[0] == '<' and key[-1] == '>'"
1000000 loops, best of 3: 0.398 usec per loop

C:\tmp>py -m timeit -s "key = 'test>'" "key[0] == '<' and key[-1] == '>'"
1000000 loops, best of 3: 0.188 usec per loop

C:\tmp>py -m timeit -s "key = 'test'" "key[0] == '<' and key[-1] == '>'"
10000000 loops, best of 3: 0.211 usec per loop

C:\tmp>py -m timeit -s "key = ''" "key[0] == '<' and key[-1] == '>'"
Traceback (most recent call last):
  File "P:\Python34\lib\timeit.py", line 292, in main
    x = t.timeit(number)
  File "P:\Python34\lib\timeit.py", line 178, in timeit
    timing = self.inner(it, self.timer)
  File "<timeit-src>", line 6, in inner
    key[0] == '<' and key[-1] == '>'
IndexError: string index out of range


Slice concatenation:

C:\tmp>py -m timeit -s "key = '<test>'" "key[:1] + key[-1:] == '<>'"
1000000 loops, best of 3: 0.649 usec per loop

C:\tmp>py -m timeit -s "key = '<test'" "key[:1] + key[-1:] == '<>'"
1000000 loops, best of 3: 0.7 usec per loop

C:\tmp>py -m timeit -s "key = 'test>'" "key[:1] + key[-1:] == '<>'"
1000000 loops, best of 3: 0.663 usec per loop

C:\tmp>py -m timeit -s "key = 'test'" "key[:1] + key[-1:] == '<>'"
1000000 loops, best of 3: 0.665 usec per loop

C:\tmp>py -m timeit -s "key = ''" "key[:1] + key[-1:] == '<>'"
1000000 loops, best of 3: 0.456 usec per loop


String methods:

C:\tmp>py -m timeit -s "key = '<test>'" "key.startswith('<') and key.endswith('>')"
1000000 loops, best of 3: 1.03 usec per loop

C:\tmp>py -m timeit -s "key = '<test'" "key.startswith('<') and key.endswith('>')"
1000000 loops, best of 3: 1.02 usec per loop

C:\tmp>py -m timeit -s "key = 'test>'" "key.startswith('<') and key.endswith('>')"
1000000 loops, best of 3: 0.504 usec per loop

C:\tmp>py -m timeit -s "key = 'test'" "key.startswith('<') and key.endswith('>')"
1000000 loops, best of 3: 0.502 usec per loop

C:\tmp>py -m timeit -s "key = ''" "key.startswith('<') and key.endswith('>')"
1000000 loops, best of 3: 0.49 usec per loop


Tuple comparison:

C:\tmp>py -m timeit -s "key = '<test>'" "(key[:1], key[-1:]) == ('<', '>')"
1000000 loops, best of 3: 0.629 usec per loop

C:\tmp>py -m timeit -s "key = '<test'" "(key[:1], key[-1:]) == ('<', '>')"
1000000 loops, best of 3: 0.689 usec per loop

C:\tmp>py -m timeit -s "key = 'test>'" "(key[:1], key[-1:]) == ('<', '>')"
1000000 loops, best of 3: 0.676 usec per loop

C:\tmp>py -m timeit -s "key = 'test'" "(key[:1], key[-1:]) == ('<', '>')"
1000000 loops, best of 3: 0.675 usec per loop

C:\tmp>py -m timeit -s "key = ''" "(key[:1], key[-1:]) == ('<', '>')"
1000000 loops, best of 3: 0.608 usec per loop


re.match():

C:\tmp>py -m timeit -s "import re;key = '<test>'" "re.match(r'^<.*>$', key)"
100000 loops, best of 3: 3.39 usec per loop

C:\tmp>py -m timeit -s "import re;key = '<test'" "re.match(r'^<.*>$', key)"
100000 loops, best of 3: 3.27 usec per loop

C:\tmp>py -m timeit -s "import re;key = 'test>'" "re.match(r'^<.*>$', key)"
100000 loops, best of 3: 2.94 usec per loop

C:\tmp>py -m timeit -s "import re;key = 'test'" "re.match(r'^<.*>$', key)"
100000 loops, best of 3: 2.97 usec per loop

C:\tmp>py -m timeit -s "import re;key = ''" "re.match(r'^<.*>$', key)"
100000 loops, best of 3: 2.97 usec per loop


Pre-compiled re:

C:\tmp>py -m timeit -s "import re;r = re.compile(r'^<.*>$');key = '<test>'" "r.match(key)"
1000000 loops, best of 3: 0.932 usec per loop

C:\tmp>py -m timeit -s "import re;r = re.compile(r'^<.*>$');key = '<test'" "r.match(key)"
1000000 loops, best of 3: 0.79 usec per loop

C:\tmp>py -m timeit -s "import re;r = re.compile(r'^<.*>$');key = 'test>'" "r.match(key)"
1000000 loops, best of 3: 0.718 usec per loop

C:\tmp>py -m timeit -s "import re;r = re.compile(r'^<.*>$');key = 'test'" "r.match(key)"
1000000 loops, best of 3: 0.755 usec per loop

C:\tmp>py -m timeit -s "import re;r = re.compile(r'^<.*>$');key = ''" "r.match(key)"
1000000 loops, best of 3: 0.731 usec per loop


Pre-compiled re with pre-fetched method:

C:\tmp>py -m timeit -s "import re;m = re.compile(r'^<.*>$').match;key = '<test>'" "m(key)"
1000000 loops, best of 3: 0.777 usec per loop

C:\tmp>py -m timeit -s "import re;m = re.compile(r'^<.*>$').match;key = '<test'" "m(key)"
1000000 loops, best of 3: 0.65 usec per loop

C:\tmp>py -m timeit -s "import re;m = re.compile(r'^<.*>$').match;key = 'test>'" "m(key)"
1000000 loops, best of 3: 0.652 usec per loop

C:\tmp>py -m timeit -s "import re;m = re.compile(r'^<.*>$').match;key = 'test'" "m(key)"
1000000 loops, best of 3: 0.576 usec per loop

C:\tmp>py -m timeit -s "import re;m = re.compile(r'^<.*>$').match;key = ''" "m(key)"
1000000 loops, best of 3: 0.58 usec per loop


And the winner is:

C:\tmp>py -m timeit -s "key = '<test>'" "key and key[0] == '<' and key[-1] == '>'"
1000000 loops, best of 3: 0.388 usec per loop

C:\tmp>py -m timeit -s "key = '<test'" "key and key[0] == '<' and key[-1] == '>'"
1000000 loops, best of 3: 0.413 usec per loop

C:\tmp>py -m timeit -s "key = 'test>'" "key and key[0] == '<' and key[-1] == '>'"
1000000 loops, best of 3: 0.219 usec per loop

C:\tmp>py -m timeit -s "key = 'test'" "key and key[0] == '<' and key[-1] == '>'"
1000000 loops, best of 3: 0.215 usec per loop

C:\tmp>py -m timeit -s "key = ''" "key and key[0] == '<' and key[-1] == '>'"
10000000 loops, best of 3: 0.0481 usec per loop


So, the moral of the story?  Use short-circuit logic wherever you can,
don't use re for simple stuff (because while it may be very fast, it's
dominated by attribute lookup and function call overhead), and unless
you expect to be doing this test many many millions of times in a very
short space of time, go for readability over performance.

-- 
Zach
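
The short-circuit form can be wrapped in a function. This is a sketch only; `is_bracketed` is a hypothetical name. One detail: `key and ...` evaluates to the empty string itself when `key` is empty, so the sketch wraps the expression in `bool()` to return a real `False`.

```python
def is_bracketed(key):
    # `key and ...` short-circuits on an empty string, so key[0] is never
    # evaluated for ''. bool() turns the falsy '' into a real False.
    return bool(key and key[0] == '<' and key[-1] == '>')

print(is_bracketed('<test>'))  # True
print(is_bracketed('<test'))   # False
print(is_bracketed('<'))       # False
print(is_bracketed(''))        # False

# Without bool(), the empty key is returned unchanged:
print(repr('' and ''[0] == '<'))  # ''
```

Inside an `if` statement the `bool()` call is unnecessary, since `''` is already falsy.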



#66232

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-02-14 07:29 +1100
Message-ID: <mailman.6875.1392323389.18130.python-list@python.org>
In reply to: #66200
On Fri, Feb 14, 2014 at 6:32 AM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
>> There will be an exception only if it is zero-length. But good
>> point! That's a pretty sneaky way to avoid checking for a
>> zero-length string. Is it a popular idiom?
>>
>
> I hope not.

The use of slicing rather than indexing to avoid problems when the
string's too short? I don't know about popular, but I've certainly
used it a good bit. For the specific case of string comparisons you
can use startswith/endswith, but slicing works with other types as
well.

Also worth noting:

Python 2.7.4 (default, Apr  6 2013, 19:54:46) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> s1,s2=b"asdf",u"asdf"
>>> s1[:1],s2[:1]
('a', u'a')
>>> s1[0],s2[0]
('a', u'a')

Python 3.4.0b2 (v3.4.0b2:ba32913eb13e, Jan  5 2014, 16:23:43) [MSC v.1600 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> s1,s2=b"asdf",u"asdf"
>>> s1[:1],s2[:1]
(b'a', 'a')
>>> s1[0],s2[0]
(97, 'a')

When you slice, you get back the same type as you started with. (Also
true of lists, tuples, and probably everything else that can be
sliced.) When you index, you might not; strings are a special case
(since Python lacks a "character" type), and if your code has to run
on Py2 and Py3, byte strings stop being that special case in Py3. So
if you're working with a byte string, it might be worth slicing rather
than indexing. (Though you can still use startswith/endswith, if they
suit your purpose.)

ChrisA
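
As a sketch of that point (Python 3 only; the function name `is_bracketed_any` is hypothetical), slicing lets one check work for both `str` and `bytes`, while indexing a `bytes` object yields an integer that never equals a one-byte string.

```python
def is_bracketed_any(key):
    # Slices keep the type of `key`, so build the comparison targets
    # from the same type: '<' / '>' for str, b'<' / b'>' for bytes.
    if isinstance(key, str):
        opener, closer = '<', '>'
    else:
        opener, closer = b'<', b'>'
    return key[:1] == opener and key[-1:] == closer

print(is_bracketed_any('<test>'))   # True
print(is_bracketed_any(b'<test>'))  # True
print(is_bracketed_any(b''))        # False

# Indexing bytes on Python 3 gives the int 60, not b'<':
print(b'<test>'[0] == b'<')         # False
```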



#66234

From: Tim Chase <python.list@tim.thechases.com>
Date: 2014-02-13 14:39 -0600
Message-ID: <mailman.6877.1392323932.18130.python-list@python.org>
In reply to: #66200
On 2014-02-13 10:37, forman.simon@gmail.com wrote:
> I ran across this and I thought there must be a better way of doing
> it, but then after further consideration I wasn't so sure.
> 
> Some possibilities that occurred to me:
> 
>   if key.startswith('<') and key.endswith('>'): ...

This is my favorite because it doesn't break on the empty string like
some of your alternatives.  Your k[0] and k[-1] assume there's at
least one character in the string, otherwise an IndexError is raised.

-tkc





#66238

From: Emile van Sebille <emile@fenx.com>
Date: 2014-02-13 12:55 -0800
Message-ID: <mailman.6880.1392324956.18130.python-list@python.org>
In reply to: #66200
On 2/13/2014 11:59 AM, Zachary Ware wrote:
> In a fit of curiosity, I did some timings:

Snip of lots of TMTOWTDT/TIMTOWTDI/whatever... timed examples :)

But I didn't see this one:

s[::len(s)-1]

Emile
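
For the curious, a quick sketch of how that extended slice behaves on short inputs (the wrapper name `ends_via_step` is hypothetical). It works for strings of two or more characters, but a one-character string makes the step zero, which raises `ValueError`.

```python
def ends_via_step(s):
    # For len(s) >= 2 the step len(s) - 1 selects index 0 and index len(s) - 1.
    return s[::len(s) - 1]

print(ends_via_step('<test>'))  # <>
print(ends_via_step('<>'))      # <>
print(repr(ends_via_step('')))  # '' (the step is -1, which reverses an empty string)

try:
    ends_via_step('<')
except ValueError as exc:
    print('ValueError:', exc)   # slice step cannot be zero
```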





