Path: csiph.com!usenet.pasdenom.info!dedibox.gegeweb.org!gegeweb.eu!nntpfeed.proxad.net!proxad.net!feeder1-2.proxad.net!usenet-fr.net!nerim.net!novso.com!newsfeed.xs4all.nl!newsfeed6.news.xs4all.nl!xs4all!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail Return-Path: X-Original-To: python-list@python.org Delivered-To: python-list@mail.python.org X-Spam-Status: OK 0.000 X-Spam-Evidence: '*H*': 1.00; '*S*': 0.00; 'algorithm': 0.03; 'operator': 0.03; 'subject:file': 0.07; 'python': 0.09; '"if': 0.09; 'command.': 0.09; 'dict': 0.09; 'item,': 0.09; 'received :mail-vc0-f174.google.com': 0.09; 'cc:addr:python-list': 0.10; 'aug': 0.13; "skip:' 30": 0.15; 'slightly': 0.15; '"in"': 0.16; '0.07': 0.16; '2.6:': 0.16; 'entries,': 0.16; 'iterated': 0.16; 'iterating': 0.16; 'iteration,': 0.16; 'iteration.': 0.16; 'linear,': 0.16; 'no-op': 0.16; 'say.': 0.16; 'seconds,': 0.16; 'seconds.': 0.16; 'wrote:': 0.17; 'thu,': 0.17; 'tests': 0.18; '>>>': 0.18; 'appears': 0.18; 'fairly': 0.21; 'keys': 0.22; 'cc:addr:python.org': 0.25; 'header:In-Reply-To:1': 0.25; 'values': 0.26; 'cc:addr:gmail.com': 0.27; 'plain': 0.27; 'scale': 0.27; 'cc:2**2': 0.27; 'message-id:@mail.gmail.com': 0.27; 'dictionary': 0.29; 'overhead': 0.29; 'received:209.85.220.174': 0.29; "skip:' 20": 0.32; 'received:google.com': 0.34; 'especially': 0.35; 'doing': 0.35; 'pm,': 0.35; 'received:209.85.220': 0.35; 'received:209.85': 0.35; 'cc:no real name:2**1': 0.36; 'uses': 0.37; 'why': 0.37; 'skip:3 10': 0.37; 'received:209': 0.37; 'subject:: ': 0.38; 'header:Received:5': 0.40; 'close': 0.63; 'here': 0.65; 'as:': 0.75; '100': 0.78; 'actually,': 0.84; 'difference.': 0.84; 'different.': 0.84; 'touching': 0.84; 'hundred': 0.95 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type:x-gm-message-state; bh=fZZN+/CPT04rd1Pfe/Qn22hjMNLtXVdbMRPcV+GmwQc=; b=Ul47OJkjsePNrfS82EHybQSJM1YF5GvImfoNjBej2hLA4CqWLOc7tkvD4LGHjbr1Zt 3Wv40vvTeh4KQn3pvYaisnvF6C5X2IL7KvVdttAjhk0qpx1lryhRfkMPnfpvL88zQe0X rtRyWhLrM+ocGRjA8yE0y1H5uItB3yZPc5bHHeN1E1mdE2dht/tD6oZ8xfCY1WA8camD c/OxkoSW4VtmeYGKNdqMf5asR+YcgohFdAieM3reJSBMKEqYZo3cHRiffGFS0hiLV4LN kxV/yhpU786TvwC0QveT2fGqi2M35CzoDj+O6obnk6pQ2XdGZRSZonA26o5HbUSZiSLx scmg== MIME-Version: 1.0 In-Reply-To: <583C055C-9E99-4EAC-8F3A-D578C399826E@gmail.com> References: <930ab3d8-4ab9-446d-9970-ee811eb70a44@googlegroups.com> <50241F14.2060209@tim.thechases.com> <36EA3847-6713-4C12-B47B-9B5E10325F00@gmail.com> <502429C3.5000600@tim.thechases.com> <583C055C-9E99-4EAC-8F3A-D578C399826E@gmail.com> From: Chris Kaynor Date: Thu, 9 Aug 2012 14:49:03 -0700 Subject: Re: save dictionary to a file without brackets. To: Roman Vashkevich Content-Type: text/plain; charset=ISO-8859-1 X-Gm-Message-State: ALoCoQnI6O/pCo8CWbE1Npcnh0IblS5a+nOrqI7B6xQj+0mawMkkUK9WJ8G5yfs4vvpfOD7za7+l Cc: python-list@python.org, giuseppe.amatulli@gmail.com X-BeenThere: python-list@python.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: General discussion list for the Python programming language List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Newsgroups: comp.lang.python Message-ID: Lines: 46 NNTP-Posting-Host: 2001:888:2000:d::a6 X-Trace: 1344548966 news.xs4all.nl 6978 [2001:888:2000:d::a6]:46396 X-Complaints-To: abuse@xs4all.nl Xref: csiph.com comp.lang.python:26820 On Thu, Aug 9, 2012 at 2:34 PM, Roman Vashkevich wrote: > > Actually, they are different. > Put a dict.{iter}items() in an O(k^N) algorithm and make it a hundred thousand entries, and you will feel the difference. > Dict uses hashing to get a value from the dict and this is why it's O(1). > Using "in" as an operator such as: "if key in dict" or "result = key in dict" is O(1) as you say. Iterating on the dictionary requires touching every item, and so is O(n), even though it also using "in" in the command. Here are a few quick timing tests I just ran with Python 2.6: >>> timeit.timeit('for i in d: pass', 'd=dict.fromkeys(range(1))') 0.078683853332734088 >>> timeit.timeit('for i in d: pass', 'd=dict.fromkeys(range(10))') 0.17451784110969015 >>> timeit.timeit('for i in d: pass', 'd=dict.fromkeys(range(100))') 1.1708168159579486 >>> timeit.timeit('for i in d.iteritems(): pass', 'd=dict.fromkeys(range(1))') 0.14186911440299355 >>> timeit.timeit('for i in d.iteritems(): pass', 'd=dict.fromkeys(range(10))') 0.33836512561802579 >>> timeit.timeit('for i in d.iteritems(): pass', 'd=dict.fromkeys(range(100))') 2.2544262854249268 >>> timeit.timeit('for i in d: v=d[i]', 'd=dict.fromkeys(range(1))') 0.10009793211446549 >>> timeit.timeit('for i in d: v=d[i]', 'd=dict.fromkeys(range(10))') 0.38825072496723578 >>> timeit.timeit('for i in d: v=d[i]', 'd=dict.fromkeys(range(100))') 3.3020098061049339 As can be seen here, a 1-item dictionary iterated in 0.07 seconds, 10 items in 0.17 seconds, and 100 items in 1.17 seconds. That is fairly close to linear, especially when considering the overhead of a complete no-op Using iteritems, it appears to actually scale slightly better than linear, though it is slower than just the plain iteration. Doing a plain iteration, then looking up the keys to get the values also appears to be linear, and is even slower than iteritems.