Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #12596 > unrolled thread
| Started by | Bart Kastermans <bkasterm@gmail.com> |
|---|---|
| First post | 2011-09-01 18:35 -0600 |
| Last post | 2011-09-02 20:15 -0600 |
| Articles | 5 — 3 participants |
Back to article view | Back to comp.lang.python
List comprehension timing difference. Bart Kastermans <bkasterm@gmail.com> - 2011-09-01 18:35 -0600
Re: List comprehension timing difference. MRAB <python@mrabarnett.plus.com> - 2011-09-02 02:12 +0100
Re: List comprehension timing difference. Bart Kastermans <bkasterm@gmail.com> - 2011-09-02 07:54 -0600
Re: List comprehension timing difference. ting@thsu.org - 2011-09-02 07:50 -0700
Re: List comprehension timing difference. Bart Kastermans <bkasterm@gmail.com> - 2011-09-02 20:15 -0600
| From | Bart Kastermans <bkasterm@gmail.com> |
|---|---|
| Date | 2011-09-01 18:35 -0600 |
| Subject | List comprehension timing difference. |
| Message-ID | <87zkinkcht.fsf@gmail.com> |
In the following code I create the graph with vertices
sgb-words.txt (the file of 5 letter words from the
stanford graphbase), and an edge if two words differ
by one letter. The two methods I wrote seem to me to
likely perform the same computations, the list comprehension
is faster though (281 seconds VS 305 seconds on my dell mini).
Is the right interpretation of this timing difference
that the comprehension is performed in the lower level
C code?
As this time I have no other conjecture about the cause.
---------------------------------------------------------
import time
import copy
data = map (lambda x: x.strip(), open('sgb-words.txt').readlines())
def d (w1, w2):
count = 0
for idx in range(0,5):
if w1[idx] != w2[idx]:
count += 1
return count
print "creating graph"
t0 = time.clock ()
graph = [[a,b] for a in data for b in data if d(a,b) ==1 and a < b]
t1 = time.clock ()
print "took " + str (t1 - t0) + " seconds."
t0 = time.clock ()
graph2 = []
for i in range (0, len(data)):
for j in range(0,len(data)):
if d(data[i],data[j]) == 1 and i < j:
graph2.append ([i,j])
t1 = time.clock ()
print "took " + str (t1 - t0) + " seconds."
[toc] | [next] | [standalone]
| From | MRAB <python@mrabarnett.plus.com> |
|---|---|
| Date | 2011-09-02 02:12 +0100 |
| Message-ID | <mailman.676.1314925972.27778.python-list@python.org> |
| In reply to | #12596 |
On 02/09/2011 01:35, Bart Kastermans wrote:
>
> In the following code I create the graph with vertices
> sgb-words.txt (the file of 5 letter words from the
> stanford graphbase), and an edge if two words differ
> by one letter. The two methods I wrote seem to me to
> likely perform the same computations, the list comprehension
> is faster though (281 seconds VS 305 seconds on my dell mini).
>
> Is the right interpretation of this timing difference
> that the comprehension is performed in the lower level
> C code?
>
> As this time I have no other conjecture about the cause.
>
> ---------------------------------------------------------
> import time
> import copy
>
> data = map (lambda x: x.strip(), open('sgb-words.txt').readlines())
>
> def d (w1, w2):
> count = 0
> for idx in range(0,5):
> if w1[idx] != w2[idx]:
> count += 1
> return count
>
> print "creating graph"
> t0 = time.clock ()
> graph = [[a,b] for a in data for b in data if d(a,b) ==1 and a< b]
> t1 = time.clock ()
> print "took " + str (t1 - t0) + " seconds."
>
> t0 = time.clock ()
> graph2 = []
> for i in range (0, len(data)):
> for j in range(0,len(data)):
> if d(data[i],data[j]) == 1 and i< j:
> graph2.append ([i,j])
> t1 = time.clock ()
> print "took " + str (t1 - t0) + " seconds."
Are they actually equivalent? Does graph == graph2?
The first version (list comprehension) creates a list of pairs of
values:
[a, b]
whereas the second version (for loops) creates a list of pairs of
indexes:
[i, j]
The second version has subscripting ("data[i]" and "data[j]"), which
will slow it down.
[toc] | [prev] | [next] | [standalone]
| From | Bart Kastermans <bkasterm@gmail.com> |
|---|---|
| Date | 2011-09-02 07:54 -0600 |
| Message-ID | <87sjofjbh1.fsf@gmail.com> |
| In reply to | #12598 |
MRAB <python@mrabarnett.plus.com> writes:
> On 02/09/2011 01:35, Bart Kastermans wrote:
>> graph = [[a,b] for a in data for b in data if d(a,b) ==1 and a< b]
>> graph2 = []
>> for i in range (0, len(data)):
>> for j in range(0,len(data)):
>> if d(data[i],data[j]) == 1 and i< j:
>> graph2.append ([i,j])
>
> Are they actually equivalent? Does graph == graph2?
>
> The first version (list comprehension) creates a list of pairs of
> values:
>
> [a, b]
>
> whereas the second version (for loops) creates a list of pairs of
> indexes:
>
> [i, j]
>
> The second version has subscripting ("data[i]" and "data[j]"), which
> will slow it down.
You are absolutely right. I had changed the code from the
equivalent:
graph2 = []
for i in range (0, len(data)):
for j in range(0,len(data)):
if d(data[i],data[j]) == 1 and i < j:
graph2.append ([data[i],data[j]])
But then also tried the equivalent
for a in data:
for b in data:
if d(a,b) == 1 and a < b:
graph2.append([a,b])
Which does away with the indexing, and is just about exactly as
fast as the list comprehension.
That'll teach me; I almost didn't ask the question thinking it might
be silly. And it was, but I thought it for the wrong reason. I tell my
students there are no stupid questions, I should listen to myself
more when I do. Thanks!
[toc] | [prev] | [next] | [standalone]
| From | ting@thsu.org |
|---|---|
| Date | 2011-09-02 07:50 -0700 |
| Message-ID | <07906043-d732-43ca-8313-bee4746eca75@fi7g2000vbb.googlegroups.com> |
| In reply to | #12642 |
On Sep 2, 9:54 am, Bart Kastermans <bkast...@gmail.com> wrote: > if d(a,b) == 1 and a < b: It will probably be faster if you reverse the evaluation order of that expression. if a<b and d(a,b)==1: That way the d() function is called less than half the time. Of course this assumes that a<b is a faster evaluation than d(a,b), but I think that's true for your example. -- // T.Hsu
[toc] | [prev] | [next] | [standalone]
| From | Bart Kastermans <bkasterm@gmail.com> |
|---|---|
| Date | 2011-09-02 20:15 -0600 |
| Message-ID | <87bov29xsg.fsf@gmail.com> |
| In reply to | #12643 |
ting@thsu.org writes: > On Sep 2, 9:54 am, Bart Kastermans <bkast...@gmail.com> wrote: >> if d(a,b) == 1 and a < b: > > It will probably be faster if you reverse the evaluation order of that > expression. > > if a<b and d(a,b)==1: > > That way the d() function is called less than half the time. Of course > this assumes that a<b is a faster evaluation than d(a,b), but I think > that's true for your example. > -- > // T.Hsu Indeed makes quite a difference, goes from 275 seconds down to 153 seconds.
[toc] | [prev] | [standalone]
Back to top | Article view | comp.lang.python
csiph-web