Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #11948 > unrolled thread
| Started by | Andreas Löscher <andreas.loescher@s2005.tu-chemnitz.de> |
|---|---|
| First post | 2011-08-21 19:27 +0200 |
| Last post | 2011-08-21 21:14 -0700 |
| Articles | 20 on this page of 30 — 12 participants |
Back to article view | Back to comp.lang.python
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Andreas Löscher <andreas.loescher@s2005.tu-chemnitz.de> - 2011-08-21 19:27 +0200
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Laurent <laurent.payot@gmail.com> - 2011-08-21 10:48 -0700
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Laurent <laurent.payot@gmail.com> - 2011-08-21 11:03 -0700
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Christian Heimes <lists@cheimes.de> - 2011-08-21 20:24 +0200
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Roy Smith <roy@panix.com> - 2011-08-21 14:52 -0400
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Andreas Löscher <andreas.loescher@s2005.tu-chemnitz.de> - 2011-08-22 01:17 +0200
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Chris Angelico <rosuav@gmail.com> - 2011-08-22 00:37 +0100
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Terry Reedy <tjreedy@udel.edu> - 2011-08-21 19:38 -0400
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Andreas Löscher <andreas.loescher@s2005.tu-chemnitz.de> - 2011-08-22 02:00 +0200
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Seebs <usenet-nospam@seebs.net> - 2011-08-22 05:33 +0000
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Terry Reedy <tjreedy@udel.edu> - 2011-08-21 15:39 -0400
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Laurent <laurent.payot@gmail.com> - 2011-08-21 12:53 -0700
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Laurent <laurent.payot@gmail.com> - 2011-08-21 12:55 -0700
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Laurent <laurent.payot@gmail.com> - 2011-08-21 12:55 -0700
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-08-22 11:12 +1000
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") "Richard D. Moores" <rdmoores@gmail.com> - 2011-08-22 02:55 -0700
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Emile van Sebille <emile@fenx.com> - 2011-08-22 09:35 -0700
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Emile van Sebille <emile@fenx.com> - 2011-08-22 10:22 -0700
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Laurent <laurent.payot@gmail.com> - 2011-08-21 13:04 -0700
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Laurent <laurent.payot@gmail.com> - 2011-08-21 12:53 -0700
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Andreas Löscher <andreas.loescher@s2005.tu-chemnitz.de> - 2011-08-22 01:25 +0200
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Chris Angelico <rosuav@gmail.com> - 2011-08-22 00:41 +0100
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-08-22 11:16 +1000
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Christian Heimes <lists@cheimes.de> - 2011-08-22 04:04 +0200
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Terry Reedy <tjreedy@udel.edu> - 2011-08-21 22:11 -0400
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Stephen Hansen <me+list/python@ixokai.io> - 2011-08-21 19:08 -0700
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-08-22 14:14 +1000
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Stephen Hansen <me+list/python@ixokai.io> - 2011-08-21 21:37 -0700
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") Stephen Hansen <me+list/python@ixokai.io> - 2011-08-21 21:49 -0700
Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") casevh <casevh@gmail.com> - 2011-08-21 21:14 -0700
Page 1 of 2 [1] 2 Next page →
| From | Andreas Löscher <andreas.loescher@s2005.tu-chemnitz.de> |
|---|---|
| Date | 2011-08-21 19:27 +0200 |
| Subject | Re: relative speed of incremention syntaxes (or "i=i+1" VS "i+=1") |
| Message-ID | <1313947658.3424.3.camel@thegeorge> |
> What the precise difference (semantics and speed) is between the
> BINARY_ADD and INPLACE_ADD opcodes, I dunno. Look in the Python source
> code or maybe someone knows it from memory :-)
>
> Irmen
>
from Python/ceval.c:
1316 case BINARY_ADD:
1317 w = POP();
1318 v = TOP();
1319 if (PyInt_CheckExact(v) && PyInt_CheckExact(w)) {
1320 /* INLINE: int + int */
1321 register long a, b, i;
1322 a = PyInt_AS_LONG(v);
1323 b = PyInt_AS_LONG(w);
1324 /* cast to avoid undefined behaviour
1325 on overflow */
1326 i = (long)((unsigned long)a + b);
1327 if ((i^a) < 0 && (i^b) < 0)
1328 goto slow_add;
1329 x = PyInt_FromLong(i);
1330 }
1331 else if (PyString_CheckExact(v) &&
1332 PyString_CheckExact(w)) {
1333 x = string_concatenate(v, w, f, next_instr);
1334 /* string_concatenate consumed the ref to v */
1335 goto skip_decref_vx;
1336 }
1337 else {
1338 slow_add:
1339 x = PyNumber_Add(v, w);
1340 }
1341 Py_DECREF(v);
1342 skip_decref_vx:
1343 Py_DECREF(w);
1344 SET_TOP(x);
1345 if (x != NULL) continue;
1346 break;
1532 case INPLACE_ADD:
1533 w = POP();
1534 v = TOP();
1535 if (PyInt_CheckExact(v) && PyInt_CheckExact(w)) {
1536 /* INLINE: int + int */
1537 register long a, b, i;
1538 a = PyInt_AS_LONG(v);
1539 b = PyInt_AS_LONG(w);
1540 i = a + b;
1541 if ((i^a) < 0 && (i^b) < 0)
1542 goto slow_iadd;
1543 x = PyInt_FromLong(i);
1544 }
1545 else if (PyString_CheckExact(v) &&
1546 PyString_CheckExact(w)) {
1547 x = string_concatenate(v, w, f, next_instr);
1548 /* string_concatenate consumed the ref to v */
1549 goto skip_decref_v;
1550 }
1551 else {
1552 slow_iadd:
1553 x = PyNumber_InPlaceAdd(v, w);
1554 }
1555 Py_DECREF(v);
1556 skip_decref_v:
1557 Py_DECREF(w);
1558 SET_TOP(x);
1559 if (x != NULL) continue;
1560 break;
As for using Integers, the first case (line 1319 and 1535) are true and
there is no difference in Code. However, Python uses a huge switch-case
construct to execute it's opcodes and INPLACE_ADD cames after
BINARY_ADD, hence the difference in speed.
To be clear, this is nothing you should consider when writing fast code.
Complexity wise they both are the same.
[toc] | [next] | [standalone]
| From | Laurent <laurent.payot@gmail.com> |
|---|---|
| Date | 2011-08-21 10:48 -0700 |
| Message-ID | <2885fd04-6cde-414c-8be9-67e6e1235c79@glegroupsg2000goo.googlegroups.com> |
| In reply to | #11948 |
Thanks for these explanations. So 2% speed difference just between "B..." and "I..." entries in a huge alphabetically-ordered switch case? Wow. Maybe there is some material for speed enhancement here...
[toc] | [prev] | [next] | [standalone]
| From | Laurent <laurent.payot@gmail.com> |
|---|---|
| Date | 2011-08-21 11:03 -0700 |
| Message-ID | <8530af1c-f723-47d6-afa9-2a8d9d3b17b1@glegroupsg2000goo.googlegroups.com> |
| In reply to | #11949 |
Well 2% more time after 1 million iterations so you're right I won't consider it.
[toc] | [prev] | [next] | [standalone]
| From | Christian Heimes <lists@cheimes.de> |
|---|---|
| Date | 2011-08-21 20:24 +0200 |
| Message-ID | <mailman.282.1313951079.27778.python-list@python.org> |
| In reply to | #11948 |
Am 21.08.2011 19:27, schrieb Andreas Löscher: > As for using Integers, the first case (line 1319 and 1535) are true and > there is no difference in Code. However, Python uses a huge switch-case > construct to execute it's opcodes and INPLACE_ADD cames after > BINARY_ADD, hence the difference in speed. I don't think that's the reason. Modern compiles turn a switch statement into a jump or branch table rather than a linear search like chained elif statements. Python 3.2 on Linux should also be compiled with computed gotos. Christian
[toc] | [prev] | [next] | [standalone]
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2011-08-21 14:52 -0400 |
| Message-ID | <roy-D16F14.14525021082011@news.panix.com> |
| In reply to | #11952 |
In article <mailman.282.1313951079.27778.python-list@python.org>, Christian Heimes <lists@cheimes.de> wrote: > Am 21.08.2011 19:27, schrieb Andreas Löscher: > > As for using Integers, the first case (line 1319 and 1535) are true and > > there is no difference in Code. However, Python uses a huge switch-case > > construct to execute it's opcodes and INPLACE_ADD cames after > > BINARY_ADD, hence the difference in speed. > > I don't think that's the reason. Modern compiles turn a switch statement > into a jump or branch table rather than a linear search like chained > elif statements. This is true even for very small values of "modern". I remember the Unix v6 C compiler (circa 1977) was able to do this.
[toc] | [prev] | [next] | [standalone]
| From | Andreas Löscher <andreas.loescher@s2005.tu-chemnitz.de> |
|---|---|
| Date | 2011-08-22 01:17 +0200 |
| Message-ID | <1313968634.3135.14.camel@thegeorge> |
| In reply to | #11954 |
Am Sonntag, den 21.08.2011, 14:52 -0400 schrieb Roy Smith:
> In article <mailman.282.1313951079.27778.python-list@python.org>,
> Christian Heimes <lists@cheimes.de> wrote:
>
> > Am 21.08.2011 19:27, schrieb Andreas Lscher:
> > > As for using Integers, the first case (line 1319 and 1535) are true and
> > > there is no difference in Code. However, Python uses a huge switch-case
> > > construct to execute it's opcodes and INPLACE_ADD cames after
> > > BINARY_ADD, hence the difference in speed.
> >
> > I don't think that's the reason. Modern compiles turn a switch statement
> > into a jump or branch table rather than a linear search like chained
> > elif statements.
>
> This is true even for very small values of "modern". I remember the
> Unix v6 C compiler (circa 1977) was able to do this.
What is the difference in speed between a jump table that is searched
from top to bottom in comparison to an ordinary if-then-elif...? The
difference can only be in the search algorithm regarding the table.
Without optimization (linear search) both are the same. If the compiler
applies some magic the difference can be relevant (linear complexity for
if-then-elif... and O(1) if you would use a dictionary).
Hence the executed code for integers is the same, there must be a slower
path to the code of BINARY_ADD than to INPLACE_ADD.
How would such an jump table work to behave the same liek a
switch-case-statement? Beware, that things like
case PRINT_NEWLINE_TO:
1802 w = stream = POP();
1803 /* fall through to PRINT_NEWLINE */
1804
1805 case PRINT_NEWLINE:
must be supported.
Bye the way:
First line of ceval.c since at least Version 2.4
1
2 /* Execute compiled code */
3
4 /* XXX TO DO:
5 XXX speed up searching for keywords by using a dictionary
6 XXX document it!
7 */
:-)
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2011-08-22 00:37 +0100 |
| Message-ID | <mailman.290.1313969874.27778.python-list@python.org> |
| In reply to | #11968 |
2011/8/22 Andreas Löscher <andreas.loescher@s2005.tu-chemnitz.de>:
> How would such an jump table work to behave the same liek a
> switch-case-statement?
If your switch statement uses a simple integer enum with sequential
values, then it can be done quite easily. Take this as an example:
switch (argc)
{
case 0: printf("No args at all, this is weird\n"); break;
case 1: printf("No args\n"); break;
case 2: printf("Default for second arg\n");
case 3: printf("Two args\n"); break;
default: printf("Too many args\n"); break;
}
I compiled this using Open Watcom C, looked at the disassembly, and
hereby translate it into pseudocode (I'll email/post the full 80x86
disassembly if you like):
1) Check if argc > 3 (unsigned comparison), if so jump to default case.
2) Left shift argc two places, add a constant offset, fetch a pointer
from there, and jump to it - that's the jump table. One JMP statement.
3) Code follows for each case.
Incidentally, the Open Watcom compiler actually turned several of the
cases into offset-load of the appropriate string pointer, and then a
jump to the single call to printf. The fall-through from 'case 2' to
'case 3' works fine, although it means that 'case 2' has to be
de-optimized from that one simplification.
This type of optimization works best when the case values are
sequential. (If I remove the 'case 0', the compiler decrements argc
and proceeds to continue as above.) Otherwise, the jump table has to
have a lot of copies of the "default" pointer.
Chris Angelico
[toc] | [prev] | [next] | [standalone]
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2011-08-21 19:38 -0400 |
| Message-ID | <mailman.291.1313969950.27778.python-list@python.org> |
| In reply to | #11968 |
On 8/21/2011 7:17 PM, Andreas Löscher wrote: > Am Sonntag, den 21.08.2011, 14:52 -0400 schrieb Roy Smith: >> In article<mailman.282.1313951079.27778.python-list@python.org>, >> Christian Heimes<lists@cheimes.de> wrote: >> >>> Am 21.08.2011 19:27, schrieb Andreas Lscher: >>>> As for using Integers, the first case (line 1319 and 1535) are true and >>>> there is no difference in Code. However, Python uses a huge switch-case >>>> construct to execute it's opcodes and INPLACE_ADD cames after >>>> BINARY_ADD, hence the difference in speed. >>> >>> I don't think that's the reason. Modern compiles turn a switch statement >>> into a jump or branch table rather than a linear search like chained >>> elif statements. >> >> This is true even for very small values of "modern". I remember the >> Unix v6 C compiler (circa 1977) was able to do this. > > What is the difference in speed between a jump table that is searched > from top to bottom in comparison to an ordinary if-then-elif...? The > difference can only be in the search algorithm regarding the table. > Without optimization (linear search) both are the same. If the compiler > applies some magic the difference can be relevant (linear complexity for > if-then-elif... and O(1) if you would use a dictionary). A jump or branch table is applicable when the case value values are all small ints, like bytes or less. For C, the table is simply an array of pointers (addressess, with entries for unused byte codes would be a void pointer). Hence, O(1) access. https://secure.wikimedia.org/wikipedia/en/wiki/Jump_table > Hence the executed code for integers is the same, there must be a slower > path to the code of BINARY_ADD than to INPLACE_ADD. > > How would such an jump table work to behave the same liek a > switch-case-statement? Beware, that things like > > case PRINT_NEWLINE_TO: > 1802 w = stream = POP(); > 1803 /* fall through to PRINT_NEWLINE */ add jump to address of the code for PRINT_NEWLINE > 1804 > 1805 case PRINT_NEWLINE: > > must be supported. -- Terry Jan Reedy
[toc] | [prev] | [next] | [standalone]
| From | Andreas Löscher <andreas.loescher@s2005.tu-chemnitz.de> |
|---|---|
| Date | 2011-08-22 02:00 +0200 |
| Message-ID | <1313971238.3713.0.camel@thegeorge> |
| In reply to | #11971 |
Am Sonntag, den 21.08.2011, 19:38 -0400 schrieb Terry Reedy: > On 8/21/2011 7:17 PM, Andreas Löscher wrote: > > Am Sonntag, den 21.08.2011, 14:52 -0400 schrieb Roy Smith: > >> In article<mailman.282.1313951079.27778.python-list@python.org>, > >> Christian Heimes<lists@cheimes.de> wrote: > >> > >>> Am 21.08.2011 19:27, schrieb Andreas Lscher: > >>>> As for using Integers, the first case (line 1319 and 1535) are true and > >>>> there is no difference in Code. However, Python uses a huge switch-case > >>>> construct to execute it's opcodes and INPLACE_ADD cames after > >>>> BINARY_ADD, hence the difference in speed. > >>> > >>> I don't think that's the reason. Modern compiles turn a switch statement > >>> into a jump or branch table rather than a linear search like chained > >>> elif statements. > >> > >> This is true even for very small values of "modern". I remember the > >> Unix v6 C compiler (circa 1977) was able to do this. > > > > What is the difference in speed between a jump table that is searched > > from top to bottom in comparison to an ordinary if-then-elif...? The > > difference can only be in the search algorithm regarding the table. > > Without optimization (linear search) both are the same. If the compiler > > applies some magic the difference can be relevant (linear complexity for > > if-then-elif... and O(1) if you would use a dictionary). > > A jump or branch table is applicable when the case value values are all > small ints, like bytes or less. For C, the table is simply an array of > pointers (addressess, with entries for unused byte codes would be a void > pointer). Hence, O(1) access. > https://secure.wikimedia.org/wikipedia/en/wiki/Jump_table > > > Hence the executed code for integers is the same, there must be a slower > > path to the code of BINARY_ADD than to INPLACE_ADD. > > > > How would such an jump table work to behave the same liek a > > switch-case-statement? Beware, that things like > > > > case PRINT_NEWLINE_TO: > > 1802 w = stream = POP(); > > 1803 /* fall through to PRINT_NEWLINE */ > > add jump to address of the code for PRINT_NEWLINE > > > 1804 > > 1805 case PRINT_NEWLINE: > > > > must be supported. > :-) too easy or too late thanks
[toc] | [prev] | [next] | [standalone]
| From | Seebs <usenet-nospam@seebs.net> |
|---|---|
| Date | 2011-08-22 05:33 +0000 |
| Message-ID | <slrnj53q8e.g9s.usenet-nospam@guild.seebs.net> |
| In reply to | #11968 |
On 2011-08-21, Andreas L?scher <andreas.loescher@s2005.tu-chemnitz.de> wrote: > Am Sonntag, den 21.08.2011, 14:52 -0400 schrieb Roy Smith: >> In article <mailman.282.1313951079.27778.python-list@python.org>, >> Christian Heimes <lists@cheimes.de> wrote: >> > I don't think that's the reason. Modern compiles turn a switch statement >> > into a jump or branch table rather than a linear search like chained >> > elif statements. >> This is true even for very small values of "modern". I remember the >> Unix v6 C compiler (circa 1977) was able to do this. > What is the difference in speed between a jump table that is searched > from top to bottom A jump table isn't searched, it's jumped into. You don't look through the table for a matching element, you grab the Nth element of the table. -s -- Copyright 2011, all wrongs reversed. Peter Seebach / usenet-nospam@seebs.net http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated! I am not speaking for my employer, although they do rent some of my opinions.
[toc] | [prev] | [next] | [standalone]
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2011-08-21 15:39 -0400 |
| Message-ID | <mailman.285.1313955635.27778.python-list@python.org> |
| In reply to | #11948 |
On 8/21/2011 1:27 PM, Andreas Löscher wrote:
>> What the precise difference (semantics and speed) is between the
>> BINARY_ADD and INPLACE_ADD opcodes, I dunno. Look in the Python source
>> code or maybe someone knows it from memory :-)
>>
>> Irmen
>>
> from Python/ceval.c:
>
> 1316 case BINARY_ADD:
> 1317 w = POP();
> 1318 v = TOP();
> 1319 if (PyInt_CheckExact(v)&& PyInt_CheckExact(w)) {
> 1320 /* INLINE: int + int */
> 1321 register long a, b, i;
> 1322 a = PyInt_AS_LONG(v);
> 1323 b = PyInt_AS_LONG(w);
> 1324 /* cast to avoid undefined behaviour
> 1325 on overflow */
> 1326 i = (long)((unsigned long)a + b);
> 1327 if ((i^a)< 0&& (i^b)< 0)
> 1328 goto slow_add;
> 1329 x = PyInt_FromLong(i);
> 1330 }
> 1331 else if (PyString_CheckExact(v)&&
> 1332 PyString_CheckExact(w)) {
> 1333 x = string_concatenate(v, w, f, next_instr);
> 1334 /* string_concatenate consumed the ref to v */
> 1335 goto skip_decref_vx;
> 1336 }
> 1337 else {
> 1338 slow_add:
> 1339 x = PyNumber_Add(v, w);
> 1340 }
> 1341 Py_DECREF(v);
> 1342 skip_decref_vx:
> 1343 Py_DECREF(w);
> 1344 SET_TOP(x);
> 1345 if (x != NULL) continue;
> 1346 break;
>
> 1532 case INPLACE_ADD:
> 1533 w = POP();
> 1534 v = TOP();
> 1535 if (PyInt_CheckExact(v)&& PyInt_CheckExact(w)) {
> 1536 /* INLINE: int + int */
> 1537 register long a, b, i;
> 1538 a = PyInt_AS_LONG(v);
> 1539 b = PyInt_AS_LONG(w);
> 1540 i = a + b;
> 1541 if ((i^a)< 0&& (i^b)< 0)
> 1542 goto slow_iadd;
> 1543 x = PyInt_FromLong(i);
> 1544 }
> 1545 else if (PyString_CheckExact(v)&&
> 1546 PyString_CheckExact(w)) {
> 1547 x = string_concatenate(v, w, f, next_instr);
> 1548 /* string_concatenate consumed the ref to v */
> 1549 goto skip_decref_v;
> 1550 }
> 1551 else {
> 1552 slow_iadd:
> 1553 x = PyNumber_InPlaceAdd(v, w);
> 1554 }
> 1555 Py_DECREF(v);
> 1556 skip_decref_v:
> 1557 Py_DECREF(w);
> 1558 SET_TOP(x);
> 1559 if (x != NULL) continue;
> 1560 break;
>
> As for using Integers, the first case (line 1319 and 1535) are true and
> there is no difference in Code. However, Python uses a huge switch-case
> construct to execute it's opcodes and INPLACE_ADD cames after
> BINARY_ADD, hence the difference in speed.
>
> To be clear, this is nothing you should consider when writing fast code.
> Complexity wise they both are the same.
With 64 bit 3.2.2 on my Win 7 Pentium, the difference was 4% and with
floats (0.0 and 1.0), 6%
--
Terry Jan Reedy
[toc] | [prev] | [next] | [standalone]
| From | Laurent <laurent.payot@gmail.com> |
|---|---|
| Date | 2011-08-21 12:53 -0700 |
| Message-ID | <mailman.287.1313956392.27778.python-list@python.org> |
| In reply to | #11957 |
> With 64 bit 3.2.2 on my Win 7 Pentium, the difference was 4% and with > floats (0.0 and 1.0), 6% For floats it is understandable. But for integers, seriously, 4% is a lot. I would never have thought an interpreter would have differences like this in syntax for something as fundamental as adding 1.
[toc] | [prev] | [next] | [standalone]
| From | Laurent <laurent.payot@gmail.com> |
|---|---|
| Date | 2011-08-21 12:55 -0700 |
| Message-ID | <mailman.288.1313956503.27778.python-list@python.org> |
| In reply to | #11958 |
Actually 6% between float themselves is just as not-understandable.
[toc] | [prev] | [next] | [standalone]
| From | Laurent <laurent.payot@gmail.com> |
|---|---|
| Date | 2011-08-21 12:55 -0700 |
| Message-ID | <01896ae1-1a6e-4458-b9a0-1a071f8ffbae@glegroupsg2000goo.googlegroups.com> |
| In reply to | #11958 |
Actually 6% between float themselves is just as not-understandable.
[toc] | [prev] | [next] | [standalone]
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2011-08-22 11:12 +1000 |
| Message-ID | <4e51ad16$0$29975$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #11958 |
Laurent wrote: > >> With 64 bit 3.2.2 on my Win 7 Pentium, the difference was 4% and with >> floats (0.0 and 1.0), 6% > > For floats it is understandable. But for integers, seriously, 4% is a lot. > I would never have thought an interpreter would have differences like this > in syntax for something as fundamental as adding 1. Why? Python integers are rich objects, not native ints. Adding 1 is a moderately heavyweight operation far more complicated than the same operation in C. n=n+1 and n+=1 call different methods and do different things, they are *not* just different syntax. Your intuition about what should and shouldn't take the same time should not be trusted. But really, we're talking about tiny differences in speed. Such trivial differences are at, or beyond, the limit of what can realistically be measured on a noisy PC running multiple processes (pretty much all PCs these days). Here are three runs of each on my computer: [steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n = n+1' 1000000 loops, best of 3: 0.508 usec per loop [steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n = n+1' 1000000 loops, best of 3: 0.587 usec per loop [steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n = n+1' 1000000 loops, best of 3: 0.251 usec per loop [steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n += 1' 1000000 loops, best of 3: 0.226 usec per loop [steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n += 1' 1000000 loops, best of 3: 0.494 usec per loop [steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n += 1' 1000000 loops, best of 3: 0.53 usec per loop Look at the variation between runs! About 130% variation between the fastest and slowest for each expression. And there's no reason to think that the fastest results shown is as fast as it can get. The time is dominated by noise, not the addition. For what it's worth, if I try it with a more recent Python: [steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n = n+1' 1000000 loops, best of 3: 0.221 usec per loop [steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n = n+1' 1000000 loops, best of 3: 0.202 usec per loop [steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n = n+1' 1000000 loops, best of 3: 0.244 usec per loop [steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n += 1' 1000000 loops, best of 3: 0.49 usec per loop [steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n += 1' 1000000 loops, best of 3: 0.176 usec per loop [steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n += 1' 1000000 loops, best of 3: 0.49 usec per loop I simply do not believe that we can justify making *any* claim about the relative speeds of n=n+1 and n+=1 other than "they are about the same". Any result you get, faster or slower, will depend more on chance than on any real or significant difference in the code. -- Steven
[toc] | [prev] | [next] | [standalone]
| From | "Richard D. Moores" <rdmoores@gmail.com> |
|---|---|
| Date | 2011-08-22 02:55 -0700 |
| Message-ID | <mailman.305.1314006934.27778.python-list@python.org> |
| In reply to | #11979 |
On Sun, Aug 21, 2011 at 18:12, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote: > But really, we're talking about tiny differences in speed. Such trivial > differences are at, or beyond, the limit of what can realistically be > measured on a noisy PC running multiple processes (pretty much all PCs > these days). Here are three runs of each on my computer: > > > [steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n = n+1' > 1000000 loops, best of 3: 0.508 usec per loop > [steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n = n+1' > 1000000 loops, best of 3: 0.587 usec per loop > [steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n = n+1' > 1000000 loops, best of 3: 0.251 usec per loop > > > [steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n += 1' > 1000000 loops, best of 3: 0.226 usec per loop > [steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n += 1' > 1000000 loops, best of 3: 0.494 usec per loop > [steve@sylar python]$ python2.5 -m timeit -s 'n=0' 'n += 1' > 1000000 loops, best of 3: 0.53 usec per loop > > Look at the variation between runs! About 130% variation between the fastest > and slowest for each expression. And there's no reason to think that the > fastest results shown is as fast as it can get. The time is dominated by > noise, not the addition. > > > For what it's worth, if I try it with a more recent Python: > > [steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n = n+1' > 1000000 loops, best of 3: 0.221 usec per loop > [steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n = n+1' > 1000000 loops, best of 3: 0.202 usec per loop > [steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n = n+1' > 1000000 loops, best of 3: 0.244 usec per loop > > [steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n += 1' > 1000000 loops, best of 3: 0.49 usec per loop > [steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n += 1' > 1000000 loops, best of 3: 0.176 usec per loop > [steve@sylar python]$ python3.2 -m timeit -s 'n=0' 'n += 1' > 1000000 loops, best of 3: 0.49 usec per loop > > > I simply do not believe that we can justify making *any* claim about the > relative speeds of n=n+1 and n+=1 other than "they are about the same". Any > result you get, faster or slower, will depend more on chance than on any > real or significant difference in the code. I couldn't resist giving it a try. Using Python 3.2.1 on a 64-bit Windows 7 machine with a 2.60 gigahertz AMD Athlon II X4 620 processor, I did 18 tests, alternating between n=n+1 and n+=1 (so 9 each). The fastest for n+=1 was C:\Windows\System32> python -m timeit -r 3 -s "n=0" "n += 1" 10000000 loops, best of 3: 0.0879 usec per loop The slowest for n+=1 was C:\Windows\System32> python -m timeit -r 3 -s "n=0" "n += 1" 10000000 loops, best of 3: 0.0902 usec per loop The fastest for n = n + 1 was C:\Windows\System32> python -m timeit -r 3 -s "n=0" "n=n+1" 10000000 loops, best of 3: 0.0831 usec per loop The slowest for n = n + 1 was C:\Windows\System32> python -m timeit -r 3 -s "n=0" "n=n+1" 10000000 loops, best of 3: 0.0842 usec per loop Coincidence? All the data are pasted at http://pastebin.com/jArPSe56 Dick Moores
[toc] | [prev] | [next] | [standalone]
| From | Emile van Sebille <emile@fenx.com> |
|---|---|
| Date | 2011-08-22 09:35 -0700 |
| Message-ID | <mailman.321.1314030982.27778.python-list@python.org> |
| In reply to | #11979 |
On 8/22/2011 2:55 AM Richard D. Moores said... > I couldn't resist giving it a try. Using Python 3.2.1 on a 64-bit > Windows 7 machine with a 2.60 gigahertz AMD Athlon II X4 620 > processor, I did 18 tests, alternating between n=n+1 and n+=1 (so 9 > each). > > The fastest for n+=1 was > C:\Windows\System32> python -m timeit -r 3 -s "n=0" "n += 1" > 10000000 loops, best of 3: 0.0879 usec per loop > > The slowest for n+=1 was > C:\Windows\System32> python -m timeit -r 3 -s "n=0" "n += 1" > 10000000 loops, best of 3: 0.0902 usec per loop > > The fastest for n = n + 1 was > C:\Windows\System32> python -m timeit -r 3 -s "n=0" "n=n+1" > 10000000 loops, best of 3: 0.0831 usec per loop > > The slowest for n = n + 1 was > C:\Windows\System32> python -m timeit -r 3 -s "n=0" "n=n+1" > 10000000 loops, best of 3: 0.0842 usec per loop > > Coincidence? > Naaa.. I just ran it twice -- once per ... _this_ is coincidence. :) Microsoft Windows XP [Version 5.1.2600] (C) Copyright 1985-2001 Microsoft Corp. C:\Documents and Settings\Emile>python -m timeit -r 3 -s "n=0" "n=n+1" 10000000 loops, best of 3: 0.108 usec per loop C:\Documents and Settings\Emile>python -m timeit -r 3 -s "n=0" "n += 1" 10000000 loops, best of 3: 0.108 usec per loop C:\Documents and Settings\Emile>
[toc] | [prev] | [next] | [standalone]
| From | Emile van Sebille <emile@fenx.com> |
|---|---|
| Date | 2011-08-22 10:22 -0700 |
| Message-ID | <mailman.322.1314033759.27778.python-list@python.org> |
| In reply to | #11979 |
On 8/22/2011 9:35 AM Emile van Sebille said... > On 8/22/2011 2:55 AM Richard D. Moores said... >> Coincidence? >> > > Naaa.. I just ran it twice -- once per ... _this_ is coincidence. :) > > Microsoft Windows XP [Version 5.1.2600] > (C) Copyright 1985-2001 Microsoft Corp. > > C:\Documents and Settings\Emile>python -m timeit -r 3 -s "n=0" "n=n+1" > 10000000 loops, best of 3: 0.108 usec per loop > > C:\Documents and Settings\Emile>python -m timeit -r 3 -s "n=0" "n += 1" > 10000000 loops, best of 3: 0.108 usec per loop Actually, it's more likely I hit a minimum resolution issue -- I ran it twenty more times and never got a different result. Emile
[toc] | [prev] | [next] | [standalone]
| From | Laurent <laurent.payot@gmail.com> |
|---|---|
| Date | 2011-08-21 13:04 -0700 |
| Message-ID | <3fa8a0a4-9321-4442-9133-ad44f7513486@glegroupsg2000goo.googlegroups.com> |
| In reply to | #11957 |
I did the test several times with floats on my machine and the difference is not as big as for integers: ==> "i = i + 1.0" is 0.732% faster than "i += 1.0". It seems normal as the float addition is supposed to be slower than integer addition, thus the syntaxic difference is comparatively less important.
[toc] | [prev] | [next] | [standalone]
| From | Laurent <laurent.payot@gmail.com> |
|---|---|
| Date | 2011-08-21 12:53 -0700 |
| Message-ID | <747a8223-0a9b-4691-8886-0a04433e54dc@glegroupsg2000goo.googlegroups.com> |
| In reply to | #11957 |
> With 64 bit 3.2.2 on my Win 7 Pentium, the difference was 4% and with > floats (0.0 and 1.0), 6% For floats it is understandable. But for integers, seriously, 4% is a lot. I would never have thought an interpreter would have differences like this in syntax for something as fundamental as adding 1.
[toc] | [prev] | [next] | [standalone]
Page 1 of 2 [1] 2 Next page →
Back to top | Article view | comp.lang.python
csiph-web