Path: csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!feeder.erje.net!eu.feeder.erje.net!xlned.com!feeder3.xlned.com!news2.euro.net!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail
MIME-Version: 1.0
In-Reply-To: <CAPTjJmondwpWzLi7dri-QQot=8YhhEpWjSnDUxTxjYD6VO+-Gw@mail.gmail.com>
References: <687dea63-84da-4c45-9366-cb5a10665d1f@googlegroups.com> <CAPTjJmondwpWzLi7dri-QQot=8YhhEpWjSnDUxTxjYD6VO+-Gw@mail.gmail.com>
Date: Sun, 2 Jun 2013 20:16:21 -0400
Subject: Re: PyWart: The problem with "print"
From: Jason Swails <jason.swails@gmail.com>
To: Chris Angelico <rosuav@gmail.com>
Content-Type: multipart/alternative; boundary=089e013c5aeed9121f04de34df50
Cc: python list <python-list@python.org>
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.2568.1370218591.3114.python-list@python.org>
Lines: 213
NNTP-Posting-Host: 2001:888:2000:d::a6
Xref: csiph.com comp.lang.python:46741

--089e013c5aeed9121f04de34df50
Content-Type: text/plain; charset=ISO-8859-1

On Sun, Jun 2, 2013 at 1:20 PM, Chris Angelico <rosuav@gmail.com> wrote:

>
> Hmm. Could be costly. Hey, you know, Python has something for testing that.
>
> >>> timeit.timeit('debugprint("asdf")','def debugprint(*args):\n\tif not
> DEBUG: return\n\tprint(*args)\nDEBUG=False',number=1000000)
> 0.5838018519113444
>
> That's roughly half a second for a million calls to debugprint().
> That's a 580ns cost per call. Rather than fiddle with the language,
> I'd rather just take this cost. Oh, and there's another way, too: If
> you make the DEBUG flag have effect only on startup, you could write
> your module thus:
>

This is a slightly contrived demonstration... The time lost in a function
call is not just the time it takes to execute that function.  If it
consistently increases the frequency of cache misses then the cost is much
greater -- possibly by orders of magnitude if the application is truly
bound by the bandwidth of the memory bus and the CPU pipeline is almost
always saturated.

I'm actually with RR in terms of eliminating the overhead involved with
'dead' function calls, since there are instances when optimizing in Python
is desirable.  I recently adjusted one of my own scripts to eliminate
branching and improve data layout, achieving a more than 1000-fold
improvement in efficiency (~45 minutes to 0.42 s for one example) -- all
in pure Python.  The first approach was unacceptable; the second is fine.
For comparison, if I add a 'deactivated' debugprint call into the inner
loop (executed 243K times in this particular test), the double-loop step
that I optimized takes 0.73 seconds, nearly doubling the duration of the
whole step.  The whole program is too large to post here, but the relevant
code portion is shown below:

         i = 0
         begin = time.time()
         for mol in owner:
            for atm in mol:
               # 'dead' debug call: the string is still formatted every pass
               blankfunc("Print i %d" % i)
               new_atoms[i] = self.atom_list[atm]
               i += 1
         self.atom_list = new_atoms
         print "Done in %f seconds." % (time.time() - begin)

from another module:

DEBUG = False

[snip]

def blankfunc(instring):
   if DEBUG:
      print instring
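
For anyone who wants to try the same kind of measurement in isolation,
here is a minimal, self-contained sketch.  It is not the actual program:
the data (81000 three-atom molecules, chosen only to give 243K inner-loop
iterations) and the names reorder_plain, reorder_with_dead_call and timed
are hypothetical stand-ins, and print is written so that it runs on both
Python 2.7 and Python 3.  Absolute timings will differ by machine.

```python
import time

DEBUG = False

def blankfunc(instring):
    # 'Dead' debug call: returns without printing while DEBUG is False.
    if DEBUG:
        print(instring)

def reorder_plain(atom_list, owner):
    """Copy atoms into molecule order with no debug call in the loop."""
    new_atoms = [None] * len(atom_list)
    i = 0
    for mol in owner:
        for atm in mol:
            new_atoms[i] = atom_list[atm]
            i += 1
    return new_atoms

def reorder_with_dead_call(atom_list, owner):
    """Same loop, plus the deactivated debug call and its string formatting."""
    new_atoms = [None] * len(atom_list)
    i = 0
    for mol in owner:
        for atm in mol:
            blankfunc("Print i %d" % i)
            new_atoms[i] = atom_list[atm]
            i += 1
    return new_atoms

def timed(func, *args):
    """Run func(*args) and return (result, elapsed seconds)."""
    begin = time.time()
    result = func(*args)
    return result, time.time() - begin

# Hypothetical stand-in data: 81000 molecules of 3 atoms each = 243000 atoms.
atom_list = list(range(243000))
owner = [[3 * m, 3 * m + 1, 3 * m + 2] for m in range(81000)]

plain, t_plain = timed(reorder_plain, atom_list, owner)
dead, t_dead = timed(reorder_with_dead_call, atom_list, owner)
print("without dead call: %f seconds" % t_plain)
print("with dead call:    %f seconds" % t_dead)
```

Both functions produce the same list; only the elapsed time should differ.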

Also, you're often not passing a constant literal to the debug print --
you're doing some type of string formatting or substitution if you're
really inspecting the value of a particular variable, and this also takes
time.  In the test I gave the timings for above, I passed a string with the
counter substituted into it to the 'dead' debug function.  Running your
timeit experiment on my machine (Python 2.7, with print(*args) replaced by
sys.stdout.write(*args)) yields different timings:

>>> import sys
>>> import timeit
>>> timeit.timeit('debugprint("asdf")',
...     'def debugprint(*args):\n\tif not DEBUG: return\n'
...     '\tsys.stdout.write(*args)\nDEBUG=False',
...     number=1000000)
0.15644001960754395

which is ~150 ns/function call, versus the ~1300 ns/function call implied
by my loop above (about 0.31 s of extra time spread over 243K calls).
There may be even more extreme examples; this is just one I was able to
cook up quickly.
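
The gap between a constant literal and a formatted argument can be
checked directly with timeit.  This is a rough sketch under a few
assumptions: the call count N and the helper per_call_ns are made up for
illustration, the setup uses sys.stdout.write so it runs on both Python
2.7 and Python 3, and the numbers it prints depend entirely on the
machine and interpreter.

```python
import timeit

# Setup mirrors the 'dead' debug function from this thread.
SETUP = """
import sys
DEBUG = False
def debugprint(*args):
    if not DEBUG:
        return
    sys.stdout.write(*args)
i = 12345
"""

N = 100000  # calls per measurement (an arbitrary choice)

def per_call_ns(stmt):
    """Return the average cost of one execution of stmt, in nanoseconds."""
    total = timeit.timeit(stmt, setup=SETUP, number=N)
    return total / N * 1e9

# Argument is a constant literal: only the call overhead is measured.
literal_ns = per_call_ns('debugprint("asdf")')
# Argument is built with % formatting on every call, then thrown away.
formatted_ns = per_call_ns('debugprint("Print i %d" % i)')

print("constant literal : %.0f ns/call" % literal_ns)
print("formatted string : %.0f ns/call" % formatted_ns)
```

The second figure includes the string formatting that happens before the
call, even though the function then returns immediately.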

This is, I'm sad to say, where my alignment with RR ends.  While I use
prints in debugging all the time, they can also become a 'crutch', just like
reliance on valgrind or gdb.  If you don't believe me, you've never hit a
bug that 'magically' disappears when you add a debugging print statement
;-).

The easiest way to eliminate these 'dead' calls is to simply comment out
the print call, but I would be quite upset if the interpreter tried to
outsmart me and do it automagically as RR seems to be suggesting.  And if
you're actually debugging, then you typically only add a few targeted print
statements -- not too hard to comment out.  If it is, and you're really
that lazy, then by all means add a new, greppable function call and use a
sed command to comment those lines out for you.
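
The same idea as the sed command can be sketched in Python.  This is only
an illustration: comment_out_calls is a hypothetical helper, it assumes
each debug call sits alone on a single line, and it will leave a syntax
error behind if the call was the only statement in a block.

```python
import re

def comment_out_calls(source, func_name="debugprint"):
    """Prefix each line that begins with a call to func_name with '# '."""
    pattern = re.compile(r"^(\s*)(%s\s*\(.*)$" % re.escape(func_name))
    out = []
    for line in source.splitlines():
        # Keep the indentation, comment out the rest of the line.
        out.append(pattern.sub(r"\1# \2", line))
    return "\n".join(out)

sample = (
    "for i in range(3):\n"
    "    debugprint('i = %d' % i)\n"
    "    total += i"
)
print(comment_out_calls(sample))
```

Only lines that start with the greppable name are touched, which is the
reason for giving the debug function a distinctive name in the first place.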

BTW: *you* in the above message refers to a generic person -- none of my
comments were aimed at anybody in particular.

All the best,
Jason

P.S. All that said, I would agree with ChrisA's suggestion that the
overhead is negligible in most cases...

--089e013c5aeed9121f04de34df50--