Re: List comprehension timing difference.

Path	csiph.com!x330-a1.tempe.blueboxinc.net!usenet.pasdenom.info!news.dougwise.org!nntpfeed.proxad.net!proxad.net!feeder1-2.proxad.net!news.wiretrip.org!newsfeed.xs4all.nl!newsfeed6.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail
Return-Path	<python@mrabarnett.plus.com>
X-Original-To	python-list@python.org
Delivered-To	python-list@mail.python.org
X-Spam-Status	OK 0.002
X-Spam-Evidence	'H': 1.00; 'S': 0.00; 'from:addr:python': 0.09; 'graph': 0.09; 'def': 0.15; '(0,': 0.16; '(lambda': 0.16; 'bart': 0.16; 'code?': 0.16; 'from:addr:mrabarnett.plus.com': 0.16; 'from:name:mrab': 0.16; 'message-id:@mrabarnett.plus.com': 0.16; 'received:84.92': 0.16; 'received:84.92.122': 0.16; 'received:84.92.122.60': 0.16; 'received:84.93': 0.16; 'received:84.93.230': 0.16; 'reply-to:addr:python-list': 0.16; 'stanford': 0.16; 'subject:timing': 0.16; 'subscripting': 0.16; 'values:': 0.16; 'wrote:': 0.16; 'wrote': 0.20; 'seconds': 0.21; 'header:In-Reply-To:1': 0.22; 'differ': 0.23; 'code': 0.25; 'subject:List': 0.25; '(the': 0.28; 'received:84': 0.28; 'import': 0.28; 'second': 0.29; 'print': 0.29; 'str': 0.30; 'seem': 0.31; 'version': 0.32; 'list': 0.32; 'does': 0.32; 'actually': 0.33; 'to:addr:python-list': 0.33; 'difference': 0.34; 'header:User- Agent:1': 0.34; 'creates': 0.34; 'letter.': 0.34; 'reply- to:addr:python.org': 0.34; '(for': 0.35; 'file': 0.36; 'perform': 0.37; 'two': 0.37; 'subject:: ': 0.39; 'skip:- 50': 0.39; 'data': 0.39; 'to:addr:python.org': 0.39; 'skip:o 30': 0.63; 'lower': 0.64; 'header:Reply-To:1': 0.71; 'reply-to:no real name:2**0': 0.71
X-IronPort-Anti-Spam-Filtered	true
X-IronPort-Anti-Spam-Result	AsIGACEtYE5UXebj/2dsb2JhbABCmSyPJ3eBQAEBBThAEQsIEAkUAg8JAwIBAgENOBMIAQGHcLplhl8Ei2xJjACLfw
Date	Fri, 02 Sep 2011 02:12:47 +0100
From	MRAB <python@mrabarnett.plus.com>
User-Agent	Mozilla/5.0 (Windows NT 5.1; rv:6.0.1) Gecko/20110830 Thunderbird/6.0.1
MIME-Version	1.0
To	python-list@python.org
Subject	Re: List comprehension timing difference.
References	<87zkinkcht.fsf@gmail.com>
In-Reply-To	<87zkinkcht.fsf@gmail.com>
Content-Type	text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding	7bit
X-BeenThere	python-list@python.org
X-Mailman-Version	2.1.12
Precedence	list
Reply-To	python-list@python.org
List-Id	General discussion list for the Python programming language <python-list.python.org>
List-Unsubscribe	<http://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe>
List-Archive	<http://mail.python.org/pipermail/python-list>
List-Post	<mailto:python-list@python.org>
List-Help	<mailto:python-list-request@python.org?subject=help>
List-Subscribe	<http://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe>
Newsgroups	comp.lang.python
Message-ID	<mailman.676.1314925972.27778.python-list@python.org> (permalink)
Lines	57
NNTP-Posting-Host	2001:888:2000:d::a6
X-Trace	1314925972 news.xs4all.nl 2430 [2001:888:2000:d::a6]:44738
X-Complaints-To	abuse@xs4all.nl
Xref	x330-a1.tempe.blueboxinc.net comp.lang.python:12598

Show key headers only | View raw

On 02/09/2011 01:35, Bart Kastermans wrote:
>
> In the following code I create the graph with vertices
> sgb-words.txt (the file of 5 letter words from the
> stanford graphbase), and an edge if two words differ
> by one letter.  The two methods I wrote seem to me to
> likely perform the same computations, the list comprehension
> is faster though (281 seconds VS 305 seconds on my dell mini).
>
> Is the right interpretation of this timing difference
> that the comprehension is performed in the lower level
> C code?
>
> As this time I have no other conjecture about the cause.
>
> ---------------------------------------------------------
> import time
> import copy
>
> data = map (lambda x: x.strip(), open('sgb-words.txt').readlines())
>
> def d (w1, w2):
>      count = 0
>      for idx in range(0,5):
>          if w1[idx] != w2[idx]:
>              count += 1
>      return count
>
> print "creating graph"
> t0 = time.clock ()
> graph = [[a,b] for a in data for b in data if d(a,b) ==1 and a<  b]
> t1 = time.clock ()
> print "took " + str (t1 - t0) + " seconds."
>
> t0 = time.clock ()
> graph2 = []
> for i in range (0, len(data)):
>      for j in range(0,len(data)):
>          if d(data[i],data[j]) == 1 and i<  j:
>              graph2.append ([i,j])
> t1 = time.clock ()
> print "took " + str (t1 - t0) + " seconds."

Are they actually equivalent? Does graph == graph2?

The first version (list comprehension) creates a list of pairs of
values:

     [a, b]

whereas the second version (for loops) creates a list of pairs of
indexes:

     [i, j]

The second version has subscripting ("data[i]" and "data[j]"), which
will slow it down.

Thread

List comprehension timing difference. Bart Kastermans <bkasterm@gmail.com> - 2011-09-01 18:35 -0600
  Re: List comprehension timing difference. MRAB <python@mrabarnett.plus.com> - 2011-09-02 02:12 +0100
    Re: List comprehension timing difference. Bart Kastermans <bkasterm@gmail.com> - 2011-09-02 07:54 -0600
      Re: List comprehension timing difference. ting@thsu.org - 2011-09-02 07:50 -0700
        Re: List comprehension timing difference. Bart Kastermans <bkasterm@gmail.com> - 2011-09-02 20:15 -0600

csiph-web