Path: csiph.com!usenet.pasdenom.info!goblin1!goblin.stu.neva.ru!news2.euro.net!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail Return-Path: X-Original-To: python-list@python.org Delivered-To: python-list@mail.python.org X-Spam-Status: OK 0.000 X-Spam-Evidence: '*H*': 1.00; '*S*': 0.00; 'python.': 0.02; 'python,': 0.02; 'algorithm': 0.03; '64-bit': 0.07; 'python': 0.09; '32-bit': 0.09; 'longs': 0.09; 'subject:Python3': 0.09; 'subject:set': 0.09; 'def': 0.10; 'anyway': 0.11; 'steve': 0.13; '10x': 0.16; '12)': 0.16; 'integers.': 0.16; 'oct': 0.16; 'ported': 0.16; 'subject:bit': 0.16; 'wrote:': 0.17; 'integer': 0.17; 'thu,': 0.17; 'version.': 0.17; 'url:edu': 0.18; '>>>': 0.18; 'bit': 0.21; 'import': 0.21; 'operations.': 0.22; 'os,': 0.22; 'testing': 0.24; 'pass': 0.25; 'header:In-Reply-To:1': 0.25; 'am,': 0.27; 'operations,': 0.27; 'message-id:@mail.gmail.com': 0.27; "doesn't": 0.28; 'correct': 0.28; 'cpu': 0.29; 'loop,': 0.29; 'probably': 0.29; "i'm": 0.29; 'maybe': 0.29; 'received:209.85.215.46': 0.30; 'expect': 0.31; 'getting': 0.33; "aren't": 0.33; 'int': 0.33; 'to:addr:python-list': 0.33; "can't": 0.34; 'received:google.com': 0.34; 'christian': 0.34; 'faster': 0.35; 'received:209.85': 0.35; 'there': 0.35; 'but': 0.36; 'expensive': 0.36; 'level.': 0.36; 'too': 0.36; 'does': 0.37; 'rather': 0.37; 'received:209': 0.37; 'far': 0.37; 'subject:: ': 0.38; 'to:addr:python.org': 0.39; 'header:Received:5': 0.40; 'times': 0.63; 'number:': 0.65; 'saw': 0.75; 'to:name:python': 0.84; 'url:graphics': 0.84; 'url:stanford': 0.84; 'approach.': 0.91 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :content-type; bh=ilJRRMbGv0SCifd5YiWvAO4bjFZUiv2fXtVxevduKIk=; b=gky8SXGg40tMwo1hyI0Ox+t2FfNUWAGf5zzjqbCXmcQWgQqdt4wuNJrSFsdV67vGW2 E39oeXEJHVhvGV9hN6EqHOvcoRCGBbV2l6Uxj307I9Ii/b/OiwVgRlEzJrQS3R7TLF7k KEXB/MvivqTzKWvWwQNBRVDINsbBdGhJyGNUJSczbbJsdZli7LhTgGxEriC5Ew1E7n4A Y/Lrx9t2McajWUac37PESCAwoyS8u/rlwYN2AnUN/EGxhYIqWTIzsNCxqRJ84odW0uKs egTo6cg0a5Db1+HTx5pxTHdw7KJeMVw9mBR00Ul9aqkVjB+/ppY0z8hnWJCPz2RIObX/ zf0Q== MIME-Version: 1.0 In-Reply-To: References: From: Ian Kelly Date: Thu, 25 Oct 2012 12:19:28 -0600 Subject: Re: bit count or bit set && Python3 To: Python Content-Type: text/plain; charset=ISO-8859-1 X-BeenThere: python-list@python.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: General discussion list for the Python programming language List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Newsgroups: comp.lang.python Message-ID: Lines: 50 NNTP-Posting-Host: 2001:888:2000:d::a6 X-Trace: 1351189200 news.xs4all.nl 6978 [2001:888:2000:d::a6]:41710 X-Complaints-To: abuse@xs4all.nl Xref: csiph.com comp.lang.python:32153 On Thu, Oct 25, 2012 at 11:25 AM, Christian Gollwitzer wrote: > There is a very geeky algorithm with only a few integer operations. > > Checkout > http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSet64 > > for a C version. Maybe the same thing is equally fast when ported to Python. It does seem to be about five times faster than the loop, but still only half as fast as the string-counting approach. I'm using a 64-bit CPU but only a 32-bit OS, but it probably doesn't make much difference anyway for Python, which is going to do the computations with Python longs rather than with 64-bit C integers. Also, this approach is a bit limited in that it only works for 32-bit ints, so you can't pass it an aribtrary precision int and expect correct results. By the way, I expect the reason that Steve saw a 600x difference and I'm only getting a 10x difference is because he was testing 10000-bit ints with expensive "long" operations, whereas I'm using 32-bit ints that aren't too far removed from the C level. >>> from timeit import Timer >>> t = Timer("bin(number).count('1')", setup="number=2**31-1") >>> min(t.repeat(number=1000000, repeat=7)) 0.6388884620810749 >>> def count_set_bits(number): ... count = 0 ... while number: ... count += 1 ... number &= number - 1 ... return count ... >>> t = Timer("count_set_bits(number)", setup="from __main__ import count_set_bi ts; number = 2 ** 31-1") >>> min(t.repeat(number=100000, repeat=7)) 0.5898140685478239 >>> def count_set_bits_geeky(number): ... c = ((number & 0xfff) * 0x1001001001001 & 0x84210842108421) % 0x1f ... c += (((number & 0xfff000) >> 12) * 0x1001001001001 & 0x84210842108421) % 0x1f ... c += ((number >> 24) * 0x1001001001001 & 0x84210842108421) % 0x1f ... return c ... >>> t = Timer("count_set_bits_geeky(number)", setup="from __main__ import count_ set_bits_geeky; number = 2 ** 31-1") >>> min(t.repeat(number=100000, repeat=7)) 0.1247119396459766