Path: csiph.com!fu-berlin.de!uni-berlin.de!not-for-mail From: Oscar Benjamin Newsgroups: comp.lang.python Subject: Re: Heap Implementation Date: Tue, 2 Feb 2016 14:53:02 +0000 Lines: 62 Message-ID: References: <56ad6800$0$14486$c3e8da3@news.astraweb.com> <56b040c5$0$1590$c3e8da3$5496439d@news.astraweb.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Trace: news.uni-berlin.de 6VPjof5my5nL638Jed1uvwfrG6JmbsD0ZINuH0oF8TOw== Return-Path: X-Original-To: python-list@python.org Delivered-To: python-list@mail.python.org X-Spam-Status: OK 0.031 X-Spam-Evidence: '*H*': 0.94; '*S*': 0.00; 'cpython': 0.05; 'cc:addr :python-list': 0.09; 'components:': 0.09; 'likewise': 0.09; 'snippet': 0.09; 'argument': 0.15; '2016': 0.16; 'caching': 0.16; 'cc:name:python list': 0.16; 'describing': 0.16; 'effect,': 0.16; 'extraneous': 0.16; 'mean,': 0.16; 'measurement': 0.16; 'received:io': 0.16; 'received:psf.io': 0.16; 'sorts': 0.16; 'structure.': 0.16; 'time"': 0.16; 'to:addr:pearwood.info': 0.16; 'to:addr:steve+comp.lang.python': 0.16; "to:name:steven d'aprano": 0.16; 'useless': 0.16; 'wrote:': 0.16; 'cc:2**0': 0.20; 'cc:addr:python.org': 0.20; 'machine': 0.21; 'constant': 0.22; 'header:In-Reply-To:1': 0.24; 'all.': 0.24; 'example': 0.26; 'sense': 0.26; 'message-id:@mail.gmail.com': 0.27; 'actual': 0.28; 'forces': 0.29; 'loop,': 0.29; 'measure': 0.29; 'system,': 0.30; 'code': 0.30; 'especially': 0.32; 'run': 0.33; 'useful': 0.33; "d'aprano": 0.33; 'ram': 0.33; 'steven': 0.33; 'structure': 0.34; 'running': 0.34; 'gives': 0.35; 'received:google.com': 0.35; 'could': 0.35; 'something': 0.35; 'level': 0.35; 'expected': 0.35; 'but': 0.36; 'there': 0.36; 'possible.': 0.36; 'received:209.85': 0.36; 'cases': 0.36; 'depends': 0.36; 'subject:: ': 0.37; 'two': 0.37; 'seem': 0.37; 'things': 0.38; 'doing': 0.38; 'itself': 0.38; 'times.': 0.38; 'received:209': 0.38; 'minimum': 0.38; 'test': 0.39; 'data': 0.39; 'whatever': 0.39; 'rather': 0.39; 'well.': 0.40; 'where': 0.40; 'called': 0.40; 'some': 0.40; 'care': 0.60; 'your': 0.60; 'entire': 0.61; 'side': 0.62; 'real': 0.62; 'programs': 0.62; 'more': 0.63; 'different': 0.63; 'necessarily': 0.63; 'information': 0.63; 'times': 0.63; 'accessed': 0.66; 'else.': 0.66; 'levels': 0.70; 'exclusive': 0.81; '"best': 0.84; 'about,': 0.84; 'alone,': 0.84; 'median': 0.84; 'oscar': 0.84; 'adopt': 0.91; 'assessing': 0.91 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type:content-transfer-encoding; bh=DSpCTuFlu24irDqquoCGxMPFP/JDtwww0dyfi/hAPX4=; b=o/PPeMro6o4KN4+lqXvmCZo12K5mrjJoa1mZFSG5Zq6GUWN80wFhq3nPKW7bn3di95 GRAaz2O+cwgMl/ouiU57AjvTY2c7FxHE6idp8/JA574Nhl3Ag+2/ZX7vazKgUzRtyWbs bszHT76PK+lvR79yg/Tn6Q3UrsNrFPZD1MC7Y5Xpkp1u0DjfhJafOYgBYWH0vXJT/G5y sY1htPSsNLxXawK1c2IxruXSQomOtQTfRtzgNbk2EwiudZHB5nd/nOpy2p2i0JkV938e r69UPjHhUht9Avgi83MnNn7fwKf9p5ZKKjNveZwVDceNXDfV5yxk7w/6+dCHIaq40dAD rPLg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-type:content-transfer-encoding; bh=DSpCTuFlu24irDqquoCGxMPFP/JDtwww0dyfi/hAPX4=; b=H1h6Mtdqsz25W+R9YDzC4c2bMlVC67h7YGqEEtgEoJuW3QrQ3dOwhmrsLjhxSBpiho jrILTj8oMl6eyrIyaj51XesSUqBhI/SZ6797c7oMdRlamf0SHQDZ0V16vamt1CuDg6+M ZvmQjMfVmD3uAxrgggh8p8Ri5ZWKglLcMDpkETQbASdgUeGsNbtUynswQs651W7wMRPi 9BkQdmGx+L34umJXaA7p2sEvWkaYIJSJVgN1RAzCp6MJ39sPjPbq3ASwe8XFIx6X99nb n4l5YlmZj4hMEPjeRpRrlsdyNcaJgvKa5lK7CvyL5CDUDZV4Gl0+1Iz+73gr+5gI2Eb5 GYAQ== X-Gm-Message-State: AG10YOTFIh18Eft7f8QXpa6E5UqPzz8dq2+HSO9iQ1v7oUw171fXrZUoTQmxszzxwL6VvYpy6Iq5bqLNtTFW3g== X-Received: by 10.112.172.233 with SMTP id bf9mr11204110lbc.121.1454424801811; Tue, 02 Feb 2016 06:53:21 -0800 (PST) In-Reply-To: <56b040c5$0$1590$c3e8da3$5496439d@news.astraweb.com> X-BeenThere: python-list@python.org X-Mailman-Version: 2.1.20+ Precedence: list List-Id: General discussion list for the Python programming language List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Xref: csiph.com comp.lang.python:102407 On 2 February 2016 at 05:38, Steven D'Aprano wrote: > > In effect, each measurement you take is made up of two components: > > * the actual time that the code would take if it had exclusive > access to the machine with no other programs running, call it t; > > * and the noise added by the system, call it =CE=94t. > > It is impossible to measure t alone, you always get t+=CE=94t. The aim in= timing > is to reduce the effect of the noise as much as possible. That's a reasonable model in some cases but not all. In particular the "actual time" part is itself not a constant but rather depends on all sorts of things like RAM caching etc. On the CPython side there could be all kinds of optimisations that would kick in at different times to change the "actual time". Likewise there are optimisations at OS and hardware levels that kick in at different times. > The way to do this with timeit is: > > - don't run extraneous code as part of your test (just measure what > you care about, with as little scaffolding around it); > > - run that code snippet as many times as you can bear in a loop, > *but* let timeit manage the loop; don't try doing it by hand; There are many situations where running something in a loop over and over triggers all kinds of optimisations. PyPy's jit warmup is an example but the effect can occur at the hardware level as well. Performance of code run repeatedly in a tight loop can be different to the same code called once or called sporadically in a long-running process. This is especially true of code that operates on a data structure. In real usage the data structure might be accessed rarely but a tight-profiling loop forces the entire structure into the CPU's caches. > - only care about "best of N", where N is as large a number as you > can stand; > > - averages, standard deviation, and other statistics are pretty > much useless here, the only statistic that you care about is > the minimum. This argument makes sense if you adopt the t+=CE=94t model described above but that model it is just a model. The minimum from micro-benchmarking in a tight-loop is not necessarily a more representative statistic than mean, median or whatever else. Standard deviation is actually useful for assessing the variability of time taken which can also be important and gives additional information about performance: I would like to minimise both the expected time taken and also the variability in that simultaneously if possible. You seem to be describing timeit's way of profiling which is reasonable but it's not the only way. -- Oscar