Path: csiph.com!x330-a1.tempe.blueboxinc.net!usenet.pasdenom.info!news.dougwise.org!nntpfeed.proxad.net!proxad.net!feeder1-2.proxad.net!news.wiretrip.org!newsfeed.xs4all.nl!newsfeed6.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail Return-Path: X-Original-To: python-list@python.org Delivered-To: python-list@mail.python.org X-Spam-Status: OK 0.002 X-Spam-Evidence: '*H*': 1.00; '*S*': 0.00; 'from:addr:python': 0.09; 'graph': 0.09; 'def': 0.15; '(0,': 0.16; '(lambda': 0.16; 'bart': 0.16; 'code?': 0.16; 'from:addr:mrabarnett.plus.com': 0.16; 'from:name:mrab': 0.16; 'message-id:@mrabarnett.plus.com': 0.16; 'received:84.92': 0.16; 'received:84.92.122': 0.16; 'received:84.92.122.60': 0.16; 'received:84.93': 0.16; 'received:84.93.230': 0.16; 'reply-to:addr:python-list': 0.16; 'stanford': 0.16; 'subject:timing': 0.16; 'subscripting': 0.16; 'values:': 0.16; 'wrote:': 0.16; 'wrote': 0.20; 'seconds': 0.21; 'header:In-Reply-To:1': 0.22; 'differ': 0.23; 'code': 0.25; 'subject:List': 0.25; '(the': 0.28; 'received:84': 0.28; 'import': 0.28; 'second': 0.29; 'print': 0.29; 'str': 0.30; 'seem': 0.31; 'version': 0.32; 'list': 0.32; 'does': 0.32; 'actually': 0.33; 'to:addr:python-list': 0.33; 'difference': 0.34; 'header:User- Agent:1': 0.34; 'creates': 0.34; 'letter.': 0.34; 'reply- to:addr:python.org': 0.34; '(for': 0.35; 'file': 0.36; 'perform': 0.37; 'two': 0.37; 'subject:: ': 0.39; 'skip:- 50': 0.39; 'data': 0.39; 'to:addr:python.org': 0.39; 'skip:o 30': 0.63; 'lower': 0.64; 'header:Reply-To:1': 0.71; 'reply-to:no real name:2**0': 0.71 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AsIGACEtYE5UXebj/2dsb2JhbABCmSyPJ3eBQAEBBThAEQsIEAkUAg8JAwIBAgENOBMIAQGHcLplhl8Ei2xJjACLfw Date: Fri, 02 Sep 2011 02:12:47 +0100 From: MRAB User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:6.0.1) Gecko/20110830 Thunderbird/6.0.1 MIME-Version: 1.0 To: python-list@python.org Subject: Re: List comprehension timing difference. References: <87zkinkcht.fsf@gmail.com> In-Reply-To: <87zkinkcht.fsf@gmail.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: python-list@python.org X-Mailman-Version: 2.1.12 Precedence: list Reply-To: python-list@python.org List-Id: General discussion list for the Python programming language List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Newsgroups: comp.lang.python Message-ID: Lines: 57 NNTP-Posting-Host: 2001:888:2000:d::a6 X-Trace: 1314925972 news.xs4all.nl 2430 [2001:888:2000:d::a6]:44738 X-Complaints-To: abuse@xs4all.nl Xref: x330-a1.tempe.blueboxinc.net comp.lang.python:12598 On 02/09/2011 01:35, Bart Kastermans wrote: > > In the following code I create the graph with vertices > sgb-words.txt (the file of 5 letter words from the > stanford graphbase), and an edge if two words differ > by one letter. The two methods I wrote seem to me to > likely perform the same computations, the list comprehension > is faster though (281 seconds VS 305 seconds on my dell mini). > > Is the right interpretation of this timing difference > that the comprehension is performed in the lower level > C code? > > As this time I have no other conjecture about the cause. > > --------------------------------------------------------- > import time > import copy > > data = map (lambda x: x.strip(), open('sgb-words.txt').readlines()) > > def d (w1, w2): > count = 0 > for idx in range(0,5): > if w1[idx] != w2[idx]: > count += 1 > return count > > print "creating graph" > t0 = time.clock () > graph = [[a,b] for a in data for b in data if d(a,b) ==1 and a< b] > t1 = time.clock () > print "took " + str (t1 - t0) + " seconds." > > t0 = time.clock () > graph2 = [] > for i in range (0, len(data)): > for j in range(0,len(data)): > if d(data[i],data[j]) == 1 and i< j: > graph2.append ([i,j]) > t1 = time.clock () > print "took " + str (t1 - t0) + " seconds." Are they actually equivalent? Does graph == graph2? The first version (list comprehension) creates a list of pairs of values: [a, b] whereas the second version (for loops) creates a list of pairs of indexes: [i, j] The second version has subscripting ("data[i]" and "data[j]"), which will slow it down.