Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #102407

Re: Heap Implementation

Path csiph.com!fu-berlin.de!uni-berlin.de!not-for-mail
From Oscar Benjamin <oscar.j.benjamin@gmail.com>
Newsgroups comp.lang.python
Subject Re: Heap Implementation
Date Tue, 2 Feb 2016 14:53:02 +0000
Lines 62
Message-ID <mailman.16.1454424804.3032.python-list@python.org> (permalink)
References <mailman.148.1454194055.2338.python-list@python.org> <56ad6800$0$14486$c3e8da3@news.astraweb.com> <mailman.6.1454355132.3032.python-list@python.org> <56b040c5$0$1590$c3e8da3$5496439d@news.astraweb.com>
Mime-Version 1.0
Content-Type text/plain; charset=UTF-8
Content-Transfer-Encoding quoted-printable
X-Trace news.uni-berlin.de 6VPjof5my5nL638Jed1uvwfrG6JmbsD0ZINuH0oF8TOw==
Return-Path <oscar.j.benjamin@gmail.com>
X-Original-To python-list@python.org
Delivered-To python-list@mail.python.org
X-Spam-Status OK 0.031
X-Spam-Evidence '*H*': 0.94; '*S*': 0.00; 'cpython': 0.05; 'cc:addr :python-list': 0.09; 'components:': 0.09; 'likewise': 0.09; 'snippet': 0.09; 'argument': 0.15; '2016': 0.16; 'caching': 0.16; 'cc:name:python list': 0.16; 'describing': 0.16; 'effect,': 0.16; 'extraneous': 0.16; 'mean,': 0.16; 'measurement': 0.16; 'received:io': 0.16; 'received:psf.io': 0.16; 'sorts': 0.16; 'structure.': 0.16; 'time"': 0.16; 'to:addr:pearwood.info': 0.16; 'to:addr:steve+comp.lang.python': 0.16; "to:name:steven d'aprano": 0.16; 'useless': 0.16; 'wrote:': 0.16; 'cc:2**0': 0.20; 'cc:addr:python.org': 0.20; 'machine': 0.21; 'constant': 0.22; 'header:In-Reply-To:1': 0.24; 'all.': 0.24; 'example': 0.26; 'sense': 0.26; 'message-id:@mail.gmail.com': 0.27; 'actual': 0.28; 'forces': 0.29; 'loop,': 0.29; 'measure': 0.29; 'system,': 0.30; 'code': 0.30; 'especially': 0.32; 'run': 0.33; 'useful': 0.33; "d'aprano": 0.33; 'ram': 0.33; 'steven': 0.33; 'structure': 0.34; 'running': 0.34; 'gives': 0.35; 'received:google.com': 0.35; 'could': 0.35; 'something': 0.35; 'level': 0.35; 'expected': 0.35; 'but': 0.36; 'there': 0.36; 'possible.': 0.36; 'received:209.85': 0.36; 'cases': 0.36; 'depends': 0.36; 'subject:: ': 0.37; 'two': 0.37; 'seem': 0.37; 'things': 0.38; 'doing': 0.38; 'itself': 0.38; 'times.': 0.38; 'received:209': 0.38; 'minimum': 0.38; 'test': 0.39; 'data': 0.39; 'whatever': 0.39; 'rather': 0.39; 'well.': 0.40; 'where': 0.40; 'called': 0.40; 'some': 0.40; 'care': 0.60; 'your': 0.60; 'entire': 0.61; 'side': 0.62; 'real': 0.62; 'programs': 0.62; 'more': 0.63; 'different': 0.63; 'necessarily': 0.63; 'information': 0.63; 'times': 0.63; 'accessed': 0.66; 'else.': 0.66; 'levels': 0.70; 'exclusive': 0.81; '"best': 0.84; 'about,': 0.84; 'alone,': 0.84; 'median': 0.84; 'oscar': 0.84; 'adopt': 0.91; 'assessing': 0.91
DKIM-Signature v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type:content-transfer-encoding; bh=DSpCTuFlu24irDqquoCGxMPFP/JDtwww0dyfi/hAPX4=; b=o/PPeMro6o4KN4+lqXvmCZo12K5mrjJoa1mZFSG5Zq6GUWN80wFhq3nPKW7bn3di95 GRAaz2O+cwgMl/ouiU57AjvTY2c7FxHE6idp8/JA574Nhl3Ag+2/ZX7vazKgUzRtyWbs bszHT76PK+lvR79yg/Tn6Q3UrsNrFPZD1MC7Y5Xpkp1u0DjfhJafOYgBYWH0vXJT/G5y sY1htPSsNLxXawK1c2IxruXSQomOtQTfRtzgNbk2EwiudZHB5nd/nOpy2p2i0JkV938e r69UPjHhUht9Avgi83MnNn7fwKf9p5ZKKjNveZwVDceNXDfV5yxk7w/6+dCHIaq40dAD rPLg==
X-Google-DKIM-Signature v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-type:content-transfer-encoding; bh=DSpCTuFlu24irDqquoCGxMPFP/JDtwww0dyfi/hAPX4=; b=H1h6Mtdqsz25W+R9YDzC4c2bMlVC67h7YGqEEtgEoJuW3QrQ3dOwhmrsLjhxSBpiho jrILTj8oMl6eyrIyaj51XesSUqBhI/SZ6797c7oMdRlamf0SHQDZ0V16vamt1CuDg6+M ZvmQjMfVmD3uAxrgggh8p8Ri5ZWKglLcMDpkETQbASdgUeGsNbtUynswQs651W7wMRPi 9BkQdmGx+L34umJXaA7p2sEvWkaYIJSJVgN1RAzCp6MJ39sPjPbq3ASwe8XFIx6X99nb n4l5YlmZj4hMEPjeRpRrlsdyNcaJgvKa5lK7CvyL5CDUDZV4Gl0+1Iz+73gr+5gI2Eb5 GYAQ==
X-Gm-Message-State AG10YOTFIh18Eft7f8QXpa6E5UqPzz8dq2+HSO9iQ1v7oUw171fXrZUoTQmxszzxwL6VvYpy6Iq5bqLNtTFW3g==
X-Received by 10.112.172.233 with SMTP id bf9mr11204110lbc.121.1454424801811; Tue, 02 Feb 2016 06:53:21 -0800 (PST)
In-Reply-To <56b040c5$0$1590$c3e8da3$5496439d@news.astraweb.com>
X-BeenThere python-list@python.org
X-Mailman-Version 2.1.20+
Precedence list
List-Id General discussion list for the Python programming language <python-list.python.org>
List-Unsubscribe <https://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe>
List-Archive <http://mail.python.org/pipermail/python-list/>
List-Post <mailto:python-list@python.org>
List-Help <mailto:python-list-request@python.org?subject=help>
List-Subscribe <https://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe>
Xref csiph.com comp.lang.python:102407

Show key headers only | View raw


On 2 February 2016 at 05:38, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
>
> In effect, each measurement you take is made up of two components:
>
> * the actual time that the code would take if it had exclusive
>   access to the machine with no other programs running, call it t;
>
> * and the noise added by the system, call it Δt.
>
> It is impossible to measure t alone, you always get t+Δt. The aim in timing
> is to reduce the effect of the noise as much as possible.

That's a reasonable model in some cases but not all. In particular the
"actual time" part is itself not a constant but rather depends on all
sorts of things like RAM caching etc. On the CPython side there could
be all kinds of optimisations that would kick in at different times to
change the "actual time". Likewise there are optimisations at OS and
hardware levels that kick in at different times.

> The way to do this with timeit is:
>
> - don't run extraneous code as part of your test (just measure what
>   you care about, with as little scaffolding around it);
>
> - run that code snippet as many times as you can bear in a loop,
>   *but* let timeit manage the loop; don't try doing it by hand;

There are many situations where running something in a loop over and
over triggers all kinds of optimisations. PyPy's jit warmup is an
example but the effect can occur at the hardware level as well.
Performance of code run repeatedly in a tight loop can be different to
the same code called once or called sporadically in a long-running
process.

This is especially true of code that operates on a data structure. In
real usage the data structure might be accessed rarely but a
tight-profiling loop forces the entire structure into the CPU's
caches.

> - only care about "best of N", where N is as large a number as you
>   can stand;
>
> - averages, standard deviation, and other statistics are pretty
>   much useless here, the only statistic that you care about is
>   the minimum.

This argument makes sense if you adopt the t+Δt model described above
but that model it is just a model. The minimum from micro-benchmarking
in a tight-loop is not necessarily a more representative statistic
than mean, median or whatever else. Standard deviation is actually
useful for assessing the variability of time taken which can also be
important and gives additional information about performance: I would
like to minimise both the expected time taken and also the variability
in that simultaneously if possible.

You seem to be describing timeit's way of profiling which is
reasonable but it's not the only way.

--
Oscar

Back to comp.lang.python | Previous | NextPrevious in thread | Find similar | Unroll thread


Thread

Heap Implementation "Sven R. Kunze" <srkunze@mail.de> - 2016-01-30 23:47 +0100
  Re: Heap Implementation Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2016-01-31 12:48 +1100
    Re: Heap Implementation "Sven R. Kunze" <srkunze@mail.de> - 2016-02-01 20:32 +0100
      Re: Heap Implementation Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2016-02-02 16:38 +1100
        Re: Heap Implementation Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2016-02-02 14:53 +0000

csiph-web