Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #11948 > unrolled thread

Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1")

Started byAndreas Löscher <andreas.loescher@s2005.tu-chemnitz.de>
First post2011-08-21 19:27 +0200
Last post2011-08-21 21:14 -0700
Articles 20 on this page of 30 — 12 participants

Back to article view | Back to comp.lang.python


Contents

  Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Andreas Löscher <andreas.loescher@s2005.tu-chemnitz.de> - 2011-08-21 19:27 +0200
    Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Laurent <laurent.payot@gmail.com> - 2011-08-21 10:48 -0700
      Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Laurent <laurent.payot@gmail.com> - 2011-08-21 11:03 -0700
    Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Christian Heimes <lists@cheimes.de> - 2011-08-21 20:24 +0200
      Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Roy Smith <roy@panix.com> - 2011-08-21 14:52 -0400
        Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Andreas Löscher <andreas.loescher@s2005.tu-chemnitz.de> - 2011-08-22 01:17 +0200
          Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Chris Angelico <rosuav@gmail.com> - 2011-08-22 00:37 +0100
          Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Terry Reedy <tjreedy@udel.edu> - 2011-08-21 19:38 -0400
            Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Andreas Löscher <andreas.loescher@s2005.tu-chemnitz.de> - 2011-08-22 02:00 +0200
          Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Seebs <usenet-nospam@seebs.net> - 2011-08-22 05:33 +0000
    Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Terry Reedy <tjreedy@udel.edu> - 2011-08-21 15:39 -0400
      Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Laurent <laurent.payot@gmail.com> - 2011-08-21 12:53 -0700
        Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Laurent <laurent.payot@gmail.com> - 2011-08-21 12:55 -0700
        Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Laurent <laurent.payot@gmail.com> - 2011-08-21 12:55 -0700
        Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-08-22 11:12 +1000
          Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") "Richard D. Moores" <rdmoores@gmail.com> - 2011-08-22 02:55 -0700
          Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Emile van Sebille <emile@fenx.com> - 2011-08-22 09:35 -0700
          Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Emile van Sebille <emile@fenx.com> - 2011-08-22 10:22 -0700
      Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Laurent <laurent.payot@gmail.com> - 2011-08-21 13:04 -0700
      Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Laurent <laurent.payot@gmail.com> - 2011-08-21 12:53 -0700
        Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Andreas Löscher <andreas.loescher@s2005.tu-chemnitz.de> - 2011-08-22 01:25 +0200
          Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Chris Angelico <rosuav@gmail.com> - 2011-08-22 00:41 +0100
            Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-08-22 11:16 +1000
          Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Christian Heimes <lists@cheimes.de> - 2011-08-22 04:04 +0200
          Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Terry Reedy <tjreedy@udel.edu> - 2011-08-21 22:11 -0400
        Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Stephen Hansen <me+list/python@ixokai.io> - 2011-08-21 19:08 -0700
          Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-08-22 14:14 +1000
            Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Stephen Hansen <me+list/python@ixokai.io> - 2011-08-21 21:37 -0700
            Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Stephen Hansen <me+list/python@ixokai.io> - 2011-08-21 21:49 -0700
    Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") casevh <casevh@gmail.com> - 2011-08-21 21:14 -0700

Page 1 of 2  [1] 2  Next page →


#11948 — Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1")

FromAndreas Löscher <andreas.loescher@s2005.tu-chemnitz.de>
Date2011-08-21 19:27 +0200
SubjectRe: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1")
Message-ID<1313947658.3424.3.camel@thegeorge>
> What the precise difference (semantics and speed) is between the 
> BINARY_ADD and INPLACE_ADD opcodes, I dunno. Look in the Python source 
> code or maybe someone knows it from memory :-)
> 
> Irmen
> 
from Python/ceval.c:

1316	        case BINARY_ADD:
1317	            w = POP();
1318	            v = TOP();
1319	            if (PyInt_CheckExact(v) && PyInt_CheckExact(w)) {
1320	                /* INLINE: int + int */
1321	                register long a, b, i;
1322	                a = PyInt_AS_LONG(v);
1323	                b = PyInt_AS_LONG(w);
1324	                /* cast to avoid undefined behaviour
1325	                   on overflow */
1326	                i = (long)((unsigned long)a + b);
1327	                if ((i^a) < 0 && (i^b) < 0)
1328	                    goto slow_add;
1329	                x = PyInt_FromLong(i);
1330	            }
1331	            else if (PyString_CheckExact(v) &&
1332	                     PyString_CheckExact(w)) {
1333	                x = string_concatenate(v, w, f, next_instr);
1334	                /* string_concatenate consumed the ref to v */
1335	                goto skip_decref_vx;
1336	            }
1337	            else {
1338	              slow_add:
1339	                x = PyNumber_Add(v, w);
1340	            }
1341	            Py_DECREF(v);
1342	          skip_decref_vx:
1343	            Py_DECREF(w);
1344	            SET_TOP(x);
1345	            if (x != NULL) continue;
1346	            break;

1532	        case INPLACE_ADD:
1533	            w = POP();
1534	            v = TOP();
1535	            if (PyInt_CheckExact(v) && PyInt_CheckExact(w)) {
1536	                /* INLINE: int + int */
1537	                register long a, b, i;
1538	                a = PyInt_AS_LONG(v);
1539	                b = PyInt_AS_LONG(w);
1540	                i = a + b;
1541	                if ((i^a) < 0 && (i^b) < 0)
1542	                    goto slow_iadd;
1543	                x = PyInt_FromLong(i);
1544	            }
1545	            else if (PyString_CheckExact(v) &&
1546	                     PyString_CheckExact(w)) {
1547	                x = string_concatenate(v, w, f, next_instr);
1548	                /* string_concatenate consumed the ref to v */
1549	                goto skip_decref_v;
1550	            }
1551	            else {
1552	              slow_iadd:
1553	                x = PyNumber_InPlaceAdd(v, w);
1554	            }
1555	            Py_DECREF(v);
1556	          skip_decref_v:
1557	            Py_DECREF(w);
1558	            SET_TOP(x);
1559	            if (x != NULL) continue;
1560	            break;

As for using Integers, the first case (line 1319 and 1535) are true and
there is no difference in Code. However, Python uses a huge switch-case
construct to execute it's opcodes and INPLACE_ADD cames after
BINARY_ADD, hence the difference in speed. 

To be clear, this is nothing you should consider when writing fast code.
Complexity wise they both are the same. 


[toc] | [next] | [standalone]


#11949

FromLaurent <laurent.payot@gmail.com>
Date2011-08-21 10:48 -0700
Message-ID<2885fd04-6cde-414c-8be9-67e6e1235c79@glegroupsg2000goo.googlegroups.com>
In reply to#11948
Thanks for these explanations. So 2% speed difference just between "B..." and "I..." entries in a huge alphabetically-ordered switch case? Wow. Maybe there is some material for speed enhancement here... 

[toc] | [prev] | [next] | [standalone]


#11951

FromLaurent <laurent.payot@gmail.com>
Date2011-08-21 11:03 -0700
Message-ID<8530af1c-f723-47d6-afa9-2a8d9d3b17b1@glegroupsg2000goo.googlegroups.com>
In reply to#11949
Well 2% more time after 1 million iterations so you're right I won't consider it.

[toc] | [prev] | [next] | [standalone]


#11952

FromChristian Heimes <lists@cheimes.de>
Date2011-08-21 20:24 +0200
Message-ID<mailman.282.1313951079.27778.python-list@python.org>
In reply to#11948
Am 21.08.2011 19:27, schrieb Andreas Löscher:
> As for using Integers, the first case (line 1319 and 1535) are true and
> there is no difference in Code. However, Python uses a huge switch-case
> construct to execute it's opcodes and INPLACE_ADD cames after
> BINARY_ADD, hence the difference in speed. 

I don't think that's the reason. Modern compiles turn a switch statement
into a jump or branch table rather than a linear search like chained
elif statements. Python 3.2 on Linux should also be compiled with
computed gotos.

Christian

[toc] | [prev] | [next] | [standalone]


#11954

FromRoy Smith <roy@panix.com>
Date2011-08-21 14:52 -0400
Message-ID<roy-D16F14.14525021082011@news.panix.com>
In reply to#11952
In article <mailman.282.1313951079.27778.python-list@python.org>,
 Christian Heimes <lists@cheimes.de> wrote:

> Am 21.08.2011 19:27, schrieb Andreas Löscher:
> > As for using Integers, the first case (line 1319 and 1535) are true and
> > there is no difference in Code. However, Python uses a huge switch-case
> > construct to execute it's opcodes and INPLACE_ADD cames after
> > BINARY_ADD, hence the difference in speed. 
> 
> I don't think that's the reason. Modern compiles turn a switch statement
> into a jump or branch table rather than a linear search like chained
> elif statements.

This is true even for very small values of "modern".  I remember the 
Unix v6 C compiler (circa 1977) was able to do this.

[toc] | [prev] | [next] | [standalone]


#11968

FromAndreas Löscher <andreas.loescher@s2005.tu-chemnitz.de>
Date2011-08-22 01:17 +0200
Message-ID<1313968634.3135.14.camel@thegeorge>
In reply to#11954
Am Sonntag, den 21.08.2011, 14:52 -0400 schrieb Roy Smith:
> In article <mailman.282.1313951079.27778.python-list@python.org>,
>  Christian Heimes <lists@cheimes.de> wrote:
> 
> > Am 21.08.2011 19:27, schrieb Andreas Lscher:
> > > As for using Integers, the first case (line 1319 and 1535) are true and
> > > there is no difference in Code. However, Python uses a huge switch-case
> > > construct to execute it's opcodes and INPLACE_ADD cames after
> > > BINARY_ADD, hence the difference in speed. 
> > 
> > I don't think that's the reason. Modern compiles turn a switch statement
> > into a jump or branch table rather than a linear search like chained
> > elif statements.
> 
> This is true even for very small values of "modern".  I remember the 
> Unix v6 C compiler (circa 1977) was able to do this.

What is the difference in speed between a jump table that is searched
from top to bottom in comparison to an ordinary if-then-elif...? The
difference can only be in the search algorithm regarding the table.
Without optimization (linear search) both are the same. If the compiler
applies some magic the difference can be relevant (linear complexity for
if-then-elif... and O(1) if you would use a dictionary). 

Hence the executed code for integers is the same, there must be a slower
path to the code of BINARY_ADD than to INPLACE_ADD. 

How would such an jump table work to behave the same liek a
switch-case-statement? Beware, that things like

       case PRINT_NEWLINE_TO:
1802	            w = stream = POP();
1803	            /* fall through to PRINT_NEWLINE */
1804	
1805	        case PRINT_NEWLINE:

must be supported.


Bye the way:
	First line of ceval.c since at least Version 2.4
1	
2	/* Execute compiled code */
3	
4	/* XXX TO DO:
5	   XXX speed up searching for keywords by using a dictionary
6	   XXX document it!
7	   */

:-)

[toc] | [prev] | [next] | [standalone]


#11970

FromChris Angelico <rosuav@gmail.com>
Date2011-08-22 00:37 +0100
Message-ID<mailman.290.1313969874.27778.python-list@python.org>
In reply to#11968
2011/8/22 Andreas Löscher <andreas.loescher@s2005.tu-chemnitz.de>:
> How would such an jump table work to behave the same liek a
> switch-case-statement?

If your switch statement uses a simple integer enum with sequential
values, then it can be done quite easily. Take this as an example:

        switch (argc)
        {
            case 0: printf("No args at all, this is weird\n"); break;
            case 1: printf("No args\n"); break;
            case 2: printf("Default for second arg\n");
            case 3: printf("Two args\n"); break;
            default: printf("Too many args\n"); break;
        }

I compiled this using Open Watcom C, looked at the disassembly, and
hereby translate it into pseudocode (I'll email/post the full 80x86
disassembly if you like):

1) Check if argc > 3 (unsigned comparison), if so jump to default case.
2) Left shift argc two places, add a constant offset, fetch a pointer
from there, and jump to it - that's the jump table. One JMP statement.
3) Code follows for each case.

Incidentally, the Open Watcom compiler actually turned several of the
cases into offset-load of the appropriate string pointer, and then a
jump to the single call to printf. The fall-through from 'case 2' to
'case 3' works fine, although it means that 'case 2' has to be
de-optimized from that one simplification.

This type of optimization works best when the case values are
sequential. (If I remove the 'case 0', the compiler decrements argc
and proceeds to continue as above.) Otherwise, the jump table has to
have a lot of copies of the "default" pointer.

Chris Angelico

[toc] | [prev] | [next] | [standalone]


#11971

FromTerry Reedy <tjreedy@udel.edu>
Date2011-08-21 19:38 -0400
Message-ID<mailman.291.1313969950.27778.python-list@python.org>
In reply to#11968
On 8/21/2011 7:17 PM, Andreas Löscher wrote:
> Am Sonntag, den 21.08.2011, 14:52 -0400 schrieb Roy Smith:
>> In article<mailman.282.1313951079.27778.python-list@python.org>,
>>   Christian Heimes<lists@cheimes.de>  wrote:
>>
>>> Am 21.08.2011 19:27, schrieb Andreas Lscher:
>>>> As for using Integers, the first case (line 1319 and 1535) are true and
>>>> there is no difference in Code. However, Python uses a huge switch-case
>>>> construct to execute it's opcodes and INPLACE_ADD cames after
>>>> BINARY_ADD, hence the difference in speed.
>>>
>>> I don't think that's the reason. Modern compiles turn a switch statement
>>> into a jump or branch table rather than a linear search like chained
>>> elif statements.
>>
>> This is true even for very small values of "modern".  I remember the
>> Unix v6 C compiler (circa 1977) was able to do this.
>
> What is the difference in speed between a jump table that is searched
> from top to bottom in comparison to an ordinary if-then-elif...? The
> difference can only be in the search algorithm regarding the table.
> Without optimization (linear search) both are the same. If the compiler
> applies some magic the difference can be relevant (linear complexity for
> if-then-elif... and O(1) if you would use a dictionary).

A jump or branch table is applicable when the case value values are all 
small ints, like bytes or less. For C, the table is simply an array of 
pointers (addressess, with entries for unused byte codes would be a void 
pointer). Hence, O(1) access.
https://secure.wikimedia.org/wikipedia/en/wiki/Jump_table

> Hence the executed code for integers is the same, there must be a slower
> path to the code of BINARY_ADD than to INPLACE_ADD.
>
> How would such an jump table work to behave the same liek a
> switch-case-statement? Beware, that things like
>
>         case PRINT_NEWLINE_TO:
> 1802	            w = stream = POP();
> 1803	            /* fall through to PRINT_NEWLINE */

add jump to address of the code for PRINT_NEWLINE

> 1804	
> 1805	        case PRINT_NEWLINE:
>
> must be supported.

-- 
Terry Jan Reedy

[toc] | [prev] | [next] | [standalone]


#11974

FromAndreas Löscher <andreas.loescher@s2005.tu-chemnitz.de>
Date2011-08-22 02:00 +0200
Message-ID<1313971238.3713.0.camel@thegeorge>
In reply to#11971
Am Sonntag, den 21.08.2011, 19:38 -0400 schrieb Terry Reedy:
> On 8/21/2011 7:17 PM, Andreas Löscher wrote:
> > Am Sonntag, den 21.08.2011, 14:52 -0400 schrieb Roy Smith:
> >> In article<mailman.282.1313951079.27778.python-list@python.org>,
> >>   Christian Heimes<lists@cheimes.de>  wrote:
> >>
> >>> Am 21.08.2011 19:27, schrieb Andreas Lscher:
> >>>> As for using Integers, the first case (line 1319 and 1535) are true and
> >>>> there is no difference in Code. However, Python uses a huge switch-case
> >>>> construct to execute it's opcodes and INPLACE_ADD cames after
> >>>> BINARY_ADD, hence the difference in speed.
> >>>
> >>> I don't think that's the reason. Modern compiles turn a switch statement
> >>> into a jump or branch table rather than a linear search like chained
> >>> elif statements.
> >>
> >> This is true even for very small values of "modern".  I remember the
> >> Unix v6 C compiler (circa 1977) was able to do this.
> >
> > What is the difference in speed between a jump table that is searched
> > from top to bottom in comparison to an ordinary if-then-elif...? The
> > difference can only be in the search algorithm regarding the table.
> > Without optimization (linear search) both are the same. If the compiler
> > applies some magic the difference can be relevant (linear complexity for
> > if-then-elif... and O(1) if you would use a dictionary).
> 
> A jump or branch table is applicable when the case value values are all 
> small ints, like bytes or less. For C, the table is simply an array of 
> pointers (addressess, with entries for unused byte codes would be a void 
> pointer). Hence, O(1) access.
> https://secure.wikimedia.org/wikipedia/en/wiki/Jump_table
> 
> > Hence the executed code for integers is the same, there must be a slower
> > path to the code of BINARY_ADD than to INPLACE_ADD.
> >
> > How would such an jump table work to behave the same liek a
> > switch-case-statement? Beware, that things like
> >
> >         case PRINT_NEWLINE_TO:
> > 1802	            w = stream = POP();
> > 1803	            /* fall through to PRINT_NEWLINE */
> 
> add jump to address of the code for PRINT_NEWLINE
> 
> > 1804	
> > 1805	        case PRINT_NEWLINE:
> >
> > must be supported.
> 

:-) too easy or too late 
thanks

[toc] | [prev] | [next] | [standalone]


#11995

FromSeebs <usenet-nospam@seebs.net>
Date2011-08-22 05:33 +0000
Message-ID<slrnj53q8e.g9s.usenet-nospam@guild.seebs.net>
In reply to#11968
On 2011-08-21, Andreas L?scher <andreas.loescher@s2005.tu-chemnitz.de> wrote:
> Am Sonntag, den 21.08.2011, 14:52 -0400 schrieb Roy Smith:
>> In article <mailman.282.1313951079.27778.python-list@python.org>,
>>  Christian Heimes <lists@cheimes.de> wrote:
>> > I don't think that's the reason. Modern compiles turn a switch statement
>> > into a jump or branch table rather than a linear search like chained
>> > elif statements.

>> This is true even for very small values of "modern".  I remember the 
>> Unix v6 C compiler (circa 1977) was able to do this.

> What is the difference in speed between a jump table that is searched
> from top to bottom

A jump table isn't searched, it's jumped into.  You don't look through the
table for a matching element, you grab the Nth element of the table.

-s
-- 
Copyright 2011, all wrongs reversed.  Peter Seebach / usenet-nospam@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
I am not speaking for my employer, although they do rent some of my opinions.

[toc] | [prev] | [next] | [standalone]


#11957

FromTerry Reedy <tjreedy@udel.edu>
Date2011-08-21 15:39 -0400
Message-ID<mailman.285.1313955635.27778.python-list@python.org>
In reply to#11948
On 8/21/2011 1:27 PM, Andreas Löscher wrote:
>> What the precise difference (semantics and speed) is between the
>> BINARY_ADD and INPLACE_ADD opcodes, I dunno. Look in the Python source
>> code or maybe someone knows it from memory :-)
>>
>> Irmen
>>
> from Python/ceval.c:
>
> 1316	        case BINARY_ADD:
> 1317	            w = POP();
> 1318	            v = TOP();
> 1319	            if (PyInt_CheckExact(v)&&  PyInt_CheckExact(w)) {
> 1320	                /* INLINE: int + int */
> 1321	                register long a, b, i;
> 1322	                a = PyInt_AS_LONG(v);
> 1323	                b = PyInt_AS_LONG(w);
> 1324	                /* cast to avoid undefined behaviour
> 1325	                   on overflow */
> 1326	                i = (long)((unsigned long)a + b);
> 1327	                if ((i^a)<  0&&  (i^b)<  0)
> 1328	                    goto slow_add;
> 1329	                x = PyInt_FromLong(i);
> 1330	            }
> 1331	            else if (PyString_CheckExact(v)&&
> 1332	                     PyString_CheckExact(w)) {
> 1333	                x = string_concatenate(v, w, f, next_instr);
> 1334	                /* string_concatenate consumed the ref to v */
> 1335	                goto skip_decref_vx;
> 1336	            }
> 1337	            else {
> 1338	              slow_add:
> 1339	                x = PyNumber_Add(v, w);
> 1340	            }
> 1341	            Py_DECREF(v);
> 1342	          skip_decref_vx:
> 1343	            Py_DECREF(w);
> 1344	            SET_TOP(x);
> 1345	            if (x != NULL) continue;
> 1346	            break;
>
> 1532	        case INPLACE_ADD:
> 1533	            w = POP();
> 1534	            v = TOP();
> 1535	            if (PyInt_CheckExact(v)&&  PyInt_CheckExact(w)) {
> 1536	                /* INLINE: int + int */
> 1537	                register long a, b, i;
> 1538	                a = PyInt_AS_LONG(v);
> 1539	                b = PyInt_AS_LONG(w);
> 1540	                i = a + b;
> 1541	                if ((i^a)<  0&&  (i^b)<  0)
> 1542	                    goto slow_iadd;
> 1543	                x = PyInt_FromLong(i);
> 1544	            }
> 1545	            else if (PyString_CheckExact(v)&&
> 1546	                     PyString_CheckExact(w)) {
> 1547	                x = string_concatenate(v, w, f, next_instr);
> 1548	                /* string_concatenate consumed the ref to v */
> 1549	                goto skip_decref_v;
> 1550	            }
> 1551	            else {
> 1552	              slow_iadd:
> 1553	                x = PyNumber_InPlaceAdd(v, w);
> 1554	            }
> 1555	            Py_DECREF(v);
> 1556	          skip_decref_v:
> 1557	            Py_DECREF(w);
> 1558	            SET_TOP(x);
> 1559	            if (x != NULL) continue;
> 1560	            break;
>
> As for using Integers, the first case (line 1319 and 1535) are true and
> there is no difference in Code. However, Python uses a huge switch-case
> construct to execute it's opcodes and INPLACE_ADD cames after
> BINARY_ADD, hence the difference in speed.
>
> To be clear, this is nothing you should consider when writing fast code.
> Complexity wise they both are the same.

With 64 bit 3.2.2 on my Win 7 Pentium, the difference was 4% and with 
floats (0.0 and 1.0), 6%

-- 
Terry Jan Reedy

[toc] | [prev] | [next] | [standalone]


#11958

FromLaurent <laurent.payot@gmail.com>
Date2011-08-21 12:53 -0700
Message-ID<mailman.287.1313956392.27778.python-list@python.org>
In reply to#11957
> With 64 bit 3.2.2 on my Win 7 Pentium, the difference was 4% and with 
> floats (0.0 and 1.0), 6%

For floats it is understandable. But for integers, seriously, 4% is a lot. I would never have thought an interpreter would have differences like this in syntax for something as fundamental as adding 1.

[toc] | [prev] | [next] | [standalone]


#11960

FromLaurent <laurent.payot@gmail.com>
Date2011-08-21 12:55 -0700
Message-ID<mailman.288.1313956503.27778.python-list@python.org>
In reply to#11958
Actually 6% between float themselves is just as not-understandable.

[toc] | [prev] | [next] | [standalone]


#11963

FromLaurent <laurent.payot@gmail.com>
Date2011-08-21 12:55 -0700
Message-ID<01896ae1-1a6e-4458-b9a0-1a071f8ffbae@glegroupsg2000goo.googlegroups.com>
In reply to#11958
Actually 6% between float themselves is just as not-understandable.

[toc] | [prev] | [next] | [standalone]


#11979

FromSteven D'Aprano <steve+comp.lang.python@pearwood.info>
Date2011-08-22 11:12 +1000
Message-ID<4e51ad16$0$29975$c3e8da3$5496439d@news.astraweb.com>
In reply to#11958
Laurent wrote:

> 
>> With 64 bit 3.2.2 on my Win 7 Pentium, the difference was 4% and with
>> floats (0.0 and 1.0), 6%
> 
> For floats it is understandable. But for integers, seriously, 4% is a lot.
> I would never have thought an interpreter would have differences like this
> in syntax for something as fundamental as adding 1.

Why? Python integers are rich objects, not native ints. Adding 1 is a
moderately heavyweight operation far more complicated than the same
operation in C.

n=n+1 and n+=1 call different methods and do different things, they are
*not* just different syntax. Your intuition about what should and shouldn't
take the same time should not be trusted.

But really, we're talking about tiny differences in speed. Such trivial
differences are at, or beyond, the limit of what can realistically be
measured on a noisy PC running multiple processes (pretty much all PCs
these days). Here are three runs of each on my computer:


[steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n = n+1'
1000000 loops, best of 3: 0.508 usec per loop
[steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n = n+1'
1000000 loops, best of 3: 0.587 usec per loop
[steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n = n+1'
1000000 loops, best of 3: 0.251 usec per loop


[steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n += 1'
1000000 loops, best of 3: 0.226 usec per loop
[steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n += 1'
1000000 loops, best of 3: 0.494 usec per loop
[steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n += 1'
1000000 loops, best of 3: 0.53 usec per loop

Look at the variation between runs! About 130% variation between the fastest
and slowest for each expression. And there's no reason to think that the
fastest results shown is as fast as it can get. The time is dominated by
noise, not the addition.


For what it's worth, if I try it with a more recent Python:

[steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n = n+1'
1000000 loops, best of 3: 0.221 usec per loop
[steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n = n+1'
1000000 loops, best of 3: 0.202 usec per loop
[steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n = n+1'
1000000 loops, best of 3: 0.244 usec per loop

[steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n += 1'
1000000 loops, best of 3: 0.49 usec per loop
[steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n += 1'
1000000 loops, best of 3: 0.176 usec per loop
[steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n += 1'
1000000 loops, best of 3: 0.49 usec per loop


I simply do not believe that we can justify making *any* claim about the
relative speeds of n=n+1 and n+=1 other than "they are about the same". Any
result you get, faster or slower, will depend more on chance than on any
real or significant difference in the code.


-- 
Steven

[toc] | [prev] | [next] | [standalone]


#12004

From"Richard D. Moores" <rdmoores@gmail.com>
Date2011-08-22 02:55 -0700
Message-ID<mailman.305.1314006934.27778.python-list@python.org>
In reply to#11979
On Sun, Aug 21, 2011 at 18:12, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:

> But really, we're talking about tiny differences in speed. Such trivial
> differences are at, or beyond, the limit of what can realistically be
> measured on a noisy PC running multiple processes (pretty much all PCs
> these days). Here are three runs of each on my computer:
>
>
> [steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n = n+1'
> 1000000 loops, best of 3: 0.508 usec per loop
> [steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n = n+1'
> 1000000 loops, best of 3: 0.587 usec per loop
> [steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n = n+1'
> 1000000 loops, best of 3: 0.251 usec per loop
>
>
> [steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n += 1'
> 1000000 loops, best of 3: 0.226 usec per loop
> [steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n += 1'
> 1000000 loops, best of 3: 0.494 usec per loop
> [steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n += 1'
> 1000000 loops, best of 3: 0.53 usec per loop
>
> Look at the variation between runs! About 130% variation between the fastest
> and slowest for each expression. And there's no reason to think that the
> fastest results shown is as fast as it can get. The time is dominated by
> noise, not the addition.
>
>
> For what it's worth, if I try it with a more recent Python:
>
> [steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n = n+1'
> 1000000 loops, best of 3: 0.221 usec per loop
> [steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n = n+1'
> 1000000 loops, best of 3: 0.202 usec per loop
> [steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n = n+1'
> 1000000 loops, best of 3: 0.244 usec per loop
>
> [steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n += 1'
> 1000000 loops, best of 3: 0.49 usec per loop
> [steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n += 1'
> 1000000 loops, best of 3: 0.176 usec per loop
> [steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n += 1'
> 1000000 loops, best of 3: 0.49 usec per loop
>
>
> I simply do not believe that we can justify making *any* claim about the
> relative speeds of n=n+1 and n+=1 other than "they are about the same". Any
> result you get, faster or slower, will depend more on chance than on any
> real or significant difference in the code.

I couldn't resist giving it a try. Using Python 3.2.1 on a 64-bit
Windows 7 machine with a 2.60 gigahertz AMD Athlon II X4 620
processor, I did 18 tests, alternating between n=n+1 and n+=1 (so 9
each).

The fastest for n+=1 was
C:\Windows\System32> python -m timeit  -r 3 -s "n=0" "n += 1"
10000000 loops, best of 3: 0.0879 usec per loop

The slowest for n+=1 was
C:\Windows\System32> python -m timeit  -r 3 -s "n=0" "n += 1"
10000000 loops, best of 3: 0.0902 usec per loop

The fastest for n = n + 1 was
C:\Windows\System32> python -m timeit  -r 3 -s "n=0" "n=n+1"
10000000 loops, best of 3: 0.0831 usec per loop

The slowest for n = n + 1 was
C:\Windows\System32> python -m timeit  -r 3 -s "n=0" "n=n+1"
10000000 loops, best of 3: 0.0842 usec per loop

Coincidence?

All the data are pasted at http://pastebin.com/jArPSe56

Dick Moores

[toc] | [prev] | [next] | [standalone]


#12039

FromEmile van Sebille <emile@fenx.com>
Date2011-08-22 09:35 -0700
Message-ID<mailman.321.1314030982.27778.python-list@python.org>
In reply to#11979
On 8/22/2011 2:55 AM Richard D. Moores said...
> I couldn't resist giving it a try. Using Python 3.2.1 on a 64-bit
> Windows 7 machine with a 2.60 gigahertz AMD Athlon II X4 620
> processor, I did 18 tests, alternating between n=n+1 and n+=1 (so 9
> each).
>
> The fastest for n+=1 was
> C:\Windows\System32>  python -m timeit  -r 3 -s "n=0" "n += 1"
> 10000000 loops, best of 3: 0.0879 usec per loop
>
> The slowest for n+=1 was
> C:\Windows\System32>  python -m timeit  -r 3 -s "n=0" "n += 1"
> 10000000 loops, best of 3: 0.0902 usec per loop
>
> The fastest for n = n + 1 was
> C:\Windows\System32>  python -m timeit  -r 3 -s "n=0" "n=n+1"
> 10000000 loops, best of 3: 0.0831 usec per loop
>
> The slowest for n = n + 1 was
> C:\Windows\System32>  python -m timeit  -r 3 -s "n=0" "n=n+1"
> 10000000 loops, best of 3: 0.0842 usec per loop
>
> Coincidence?
>

Naaa.. I just ran it twice -- once per ... _this_ is coincidence.  :)

Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

C:\Documents and Settings\Emile>python -m timeit  -r 3 -s "n=0" "n=n+1"
10000000 loops, best of 3: 0.108 usec per loop

C:\Documents and Settings\Emile>python -m timeit  -r 3 -s "n=0" "n += 1"
10000000 loops, best of 3: 0.108 usec per loop

C:\Documents and Settings\Emile>

[toc] | [prev] | [next] | [standalone]


#12040

FromEmile van Sebille <emile@fenx.com>
Date2011-08-22 10:22 -0700
Message-ID<mailman.322.1314033759.27778.python-list@python.org>
In reply to#11979
On 8/22/2011 9:35 AM Emile van Sebille said...
> On 8/22/2011 2:55 AM Richard D. Moores said...
>> Coincidence?
>>
>
> Naaa.. I just ran it twice -- once per ... _this_ is coincidence. :)
>
> Microsoft Windows XP [Version 5.1.2600]
> (C) Copyright 1985-2001 Microsoft Corp.
>
> C:\Documents and Settings\Emile>python -m timeit -r 3 -s "n=0" "n=n+1"
> 10000000 loops, best of 3: 0.108 usec per loop
>
> C:\Documents and Settings\Emile>python -m timeit -r 3 -s "n=0" "n += 1"
> 10000000 loops, best of 3: 0.108 usec per loop

Actually, it's more likely I hit a minimum resolution issue -- I ran it 
twenty more times and never got a different result.

Emile

[toc] | [prev] | [next] | [standalone]


#11961

FromLaurent <laurent.payot@gmail.com>
Date2011-08-21 13:04 -0700
Message-ID<3fa8a0a4-9321-4442-9133-ad44f7513486@glegroupsg2000goo.googlegroups.com>
In reply to#11957
I did the test several times with floats on my machine and the difference is not as big as for integers:


==> "i = i + 1.0" is 0.732% faster than "i += 1.0".

It seems normal as the float addition is supposed to be slower than integer addition, thus the syntaxic difference is comparatively less important. 

[toc] | [prev] | [next] | [standalone]


#11962

FromLaurent <laurent.payot@gmail.com>
Date2011-08-21 12:53 -0700
Message-ID<747a8223-0a9b-4691-8886-0a04433e54dc@glegroupsg2000goo.googlegroups.com>
In reply to#11957
> With 64 bit 3.2.2 on my Win 7 Pentium, the difference was 4% and with 
> floats (0.0 and 1.0), 6%

For floats it is understandable. But for integers, seriously, 4% is a lot. I would never have thought an interpreter would have differences like this in syntax for something as fundamental as adding 1.

[toc] | [prev] | [next] | [standalone]


Page 1 of 2  [1] 2  Next page →

Back to top | Article view | comp.lang.python


csiph-web