From: Chris Angelico
Newsgroups: comp.lang.python
Subject: Performance of int/long in Python 3
Date: Tue, 26 Mar 2013 08:51:07 +1100

The Python 3 merge of int and long has effectively penalized small-number arithmetic by removing an optimization. As we've seen from PEP 393 strings (jmf aside), there can be huge benefits from having a single type with multiple representations internally. Is there value in making the int type have a machine-word optimization in the same way?

The cost is clear.
Compare these methods for calculating the sum of all numbers up to 65535, which stays under 2^31:

def range_sum(n):
    return sum(range(n+1))

def forloop(n):
    tot = 0
    for i in range(n+1):
        tot += i
    return tot

def forloop_offset(n):
    # Same loop, but the running total starts far above 2^31
    tot = 1000000000000000
    for i in range(n+1):
        tot += i
    return tot - 1000000000000000

import timeit
import sys
print(sys.version)
print("inline: %d" % sum(range(65536)))
print(timeit.timeit("sum(range(65536))", number=1000))
for func in ['range_sum', 'forloop', 'forloop_offset']:
    # Show each function's result, then time 1000 calls of it
    print("%s: %r" % (func, (globals()[func](65535))))
    print(timeit.timeit(func+"(65535)", "from __main__ import "+func, number=1000))

Windows XP:

C:\>python26\python inttime.py
2.6.5 (r265:79096, Mar 19 2010, 21:48:26) [MSC v.1500 32 bit (Intel)]
inline: 2147450880
2.36770455463
range_sum: 2147450880
2.61778550067
forloop: 2147450880
7.91409131608
forloop_offset: 2147450880L
23.3116954809

C:\>python33\python inttime.py
3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600 32 bit (Intel)]
inline: 2147450880
5.25038713020789
range_sum: 2147450880
5.412975112758745
forloop: 2147450880
17.875799577879313
forloop_offset: 2147450880
19.31672544974291

Debian Wheezy:

rosuav@sikorsky:~$ python inttime.py
2.7.3 (default, Jan 2 2013, 13:56:14) [GCC 4.7.2]
inline: 2147450880
1.92763710022
range_sum: 2147450880
1.93409109116
forloop: 2147450880
5.14633893967
forloop_offset: 2147450880
5.13459300995

rosuav@sikorsky:~$ python3 inttime.py
3.2.3 (default, Feb 20 2013, 14:44:27) [GCC 4.7.2]
inline: 2147450880
2.884124994277954
range_sum: 2147450880
2.6586129665374756
forloop: 2147450880
7.660192012786865
forloop_offset: 2147450880
8.11817193031311

On 2.6, there's a massive penalty for switching to longs (23.3s against 7.9s). The 2.7 run shows no such gap, but its forloop_offset result prints without the L suffix, so on that build the offset fits in a plain int and the loop never touches long. On 3.2/3.3, the two for-loop versions are nearly identical in time.

(Side point: I'm often seeing that 3.2 on Linux is marginally faster calling my range_sum function than doing the same thing inline. I do not understand this. If anyone can explain what's going on there, I'm all ears!)
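Single timeit runs like the ones above can differ by a few percent from noise alone, and whether forloop_offset ever leaves the machine-word range depends on the build. The following is only a rough sketch of how both could be checked: it repeats each timing and keeps the smallest, and it reports whether the offset total fits in the native word. The names OFFSET, best_time and fits_native_word are made up for illustration and are not from the original script. It assumes Python 3, where sys.maxsize stands in for the native signed word; on Python 2 the int/long boundary is sys.maxint instead.

```python
import sys
import timeit

OFFSET = 1000000000000000  # same offset as forloop_offset in the post


def forloop(n):
    tot = 0
    for i in range(n + 1):
        tot += i
    return tot


def forloop_offset(n):
    tot = OFFSET
    for i in range(n + 1):
        tot += i
    return tot - OFFSET


def best_time(func, n=65535, number=20, repeat=5):
    """Smallest total time over `repeat` runs of `number` calls each."""
    timer = timeit.Timer(lambda: func(n))
    return min(timer.repeat(repeat=repeat, number=number))


def fits_native_word(value):
    # Assumption: sys.maxsize approximates the native signed word.
    # On Python 2 the relevant limit for int is sys.maxint.
    return -sys.maxsize - 1 <= value <= sys.maxsize


if __name__ == "__main__":
    print(sys.version)
    largest = OFFSET + forloop(65535)  # biggest value the offset loop reaches
    print("offset total fits native word: %r" % fits_native_word(largest))
    for func in (forloop, forloop_offset):
        print("%s: %.4f" % (func.__name__, best_time(func)))
```

Taking the minimum of several repeats tends to filter out scheduling noise, which may help decide whether a marginal difference such as inline versus range_sum is real.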
Python 3's int is faster than Python 2's long, but slower than Python 2's int. So the question really is, would a two-form representation be beneficial, and if so, is it worth the coding trouble?

ChrisA