From: Ian Kelly
Date: Mon, 20 Jun 2011 13:59:57 -0600
Subject: Re: Is there any advantage or disadvantage to using sets over list comps to ensure a list of unique entries?
Newsgroups: comp.lang.python

On Mon, Jun 20, 2011 at 1:43 PM, deathweaselx86 wrote:
> Howdy guys, I am new.
>
> I've been converting lists to sets, then back to lists again to get
> unique lists.
> e.g
>
> Python 2.5.2 (r252:60911, Jan 20 2010, 21:48:48)
> [GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>>>> foo = ['1','2','3']
>>>> bar = ['2','5']
>>>> foo.extend(bar)
>>>> foo = list(set(foo))
>>>> foo
> ['1', '3', '2', '5']
>
> I used to use list comps to do this instead.
>>>> foo = ['1','2','3']
>>>> bar = ['2','5']
>>>> foo.extend([a for a in bar if a not in foo])
>>>> foo
> ['1', '2', '3', '5']
>
> A very long time ago, we all used dictionaries, but I'm not interested
> in that ever again. ;-)
> Is there any performance hit to using one of these methods over the
> other for rather large lists?

Yes.  In the second approach, "if a not in foo" is O(n) if foo is a
list, and since you're doing it m times, that's O(n * m).  The first
approach is merely O(n + m).
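
To make the comparison concrete, here is a rough sketch of both approaches side by side, taking n as len(foo) and m as len(bar). The function names are made up for illustration and are not from the original post, and the code assumes Python 3 rather than the 2.5.2 session quoted above. The third function is an extra variant, not something either poster suggested: it keeps a helper set for membership tests so that first-seen order is preserved, which the plain set round trip does not guarantee. The O(n + m) figures are average-case and assume hashable items.

```python
def extend_unique_scan(foo, bar):
    """List-comp approach from the post: each 'a not in foo' scans the list,
    so the total work is O(n * m)."""
    result = list(foo)
    result.extend([a for a in bar if a not in foo])
    return result


def extend_unique_set(foo, bar):
    """Set round trip from the post: O(n + m) on average, but the order of
    the result is arbitrary."""
    return list(set(foo + bar))


def extend_unique_ordered(foo, bar):
    """Hypothetical variant: a helper set makes membership tests O(1) on
    average, and the result list keeps first-seen order."""
    seen = set()
    result = []
    for item in foo + bar:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


foo = ['1', '2', '3']
bar = ['2', '5']
print(extend_unique_scan(foo, bar))         # ['1', '2', '3', '5']
print(sorted(extend_unique_set(foo, bar)))  # sorted, since set order is arbitrary
print(extend_unique_ordered(foo, bar))      # ['1', '2', '3', '5']
```

One difference to keep in mind: the list-comp version only filters bar against foo, so duplicates already inside foo (or repeated within bar) survive, whereas the two set-based versions remove every duplicate.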