Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #32135 > unrolled thread

Re: bit count or bit set && Python3

Started byChris Angelico <rosuav@gmail.com>
First post2012-10-26 02:31 +1100
Last post2012-10-25 22:51 +0100
Articles 15 — 8 participants

Back to article view | Back to comp.lang.python

This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


Contents

  Re: bit count or bit set && Python3 Chris Angelico <rosuav@gmail.com> - 2012-10-26 02:31 +1100
    Re: bit count or bit set && Python3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-25 15:57 +0000
      Re: bit count or bit set && Python3 rusi <rustompmody@gmail.com> - 2012-10-25 09:17 -0700
        Re: bit count or bit set && Python3 Chris Angelico <rosuav@gmail.com> - 2012-10-26 03:29 +1100
          Re: bit count or bit set && Python3 rusi <rustompmody@gmail.com> - 2012-10-25 09:37 -0700
        Re: bit count or bit set && Python3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-25 17:44 +0100
        Re: bit count or bit set && Python3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-25 17:16 +0000
      Re: bit count or bit set && Python3 Serhiy Storchaka <storchaka@gmail.com> - 2012-10-25 22:07 +0300
      Re: bit count or bit set && Python3 Neil Cerutti <neilc@norwich.edu> - 2012-10-25 20:00 +0000
        Re: bit count or bit set && Python3 Neil Cerutti <neilc@norwich.edu> - 2012-10-25 20:04 +0000
        Re: bit count or bit set && Python3 Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-25 14:20 -0600
          Re: bit count or bit set && Python3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-25 23:48 +0000
          Re: bit count or bit set && Python3 Neil Cerutti <neilc@norwich.edu> - 2012-10-26 12:56 +0000
      Re: bit count or bit set && Python3 Charles Hixson <charleshixsn@earthlink.net> - 2012-10-25 09:08 -0700
      Re: bit count or bit set && Python3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-10-25 22:51 +0100

#32135 — Re: bit count or bit set && Python3

FromChris Angelico <rosuav@gmail.com>
Date2012-10-26 02:31 +1100
SubjectRe: bit count or bit set && Python3
Message-ID<mailman.2849.1351179123.27098.python-list@python.org>
On Fri, Oct 26, 2012 at 2:25 AM, Christian Heimes <christian@python.org> wrote:
> Simple, easy, faster than a Python loop but not very elegant:
>
>    bin(number).count("1")

Unlikely to be fast.

What you may want is some sort of hybrid loop/lookup approach. Do you
know what your highest bit number is going to be? For instance, are
all your integers 32-bit? You could use something like this:

c = bitcount[n&255] + bitcount[n>>8&255] + bitcount[n>>16&255] + bitcount[n>>24]

where bitcount is a list of 256 values, giving the counts for each
value from 0 to 255.

Profile and test. :)

ChrisA

[toc] | [next] | [standalone]


#32137

FromSteven D'Aprano <steve+comp.lang.python@pearwood.info>
Date2012-10-25 15:57 +0000
Message-ID<50896152$0$29978$c3e8da3$5496439d@news.astraweb.com>
In reply to#32135
On Fri, 26 Oct 2012 02:31:53 +1100, Chris Angelico wrote:

> On Fri, Oct 26, 2012 at 2:25 AM, Christian Heimes <christian@python.org>
> wrote:
>> Simple, easy, faster than a Python loop but not very elegant:
>>
>>    bin(number).count("1")
> 
> Unlikely to be fast.

Oh I don't know about that. Here's some timing results using Python 2.7:

py> from timeit import Timer
py> t = Timer('bin(number).count("1")', setup='number=2**10001-1')
py> min(t.repeat(number=10000, repeat=7))
0.6819710731506348

Compare to MRAB's suggestion:

def count_set_bits(number):
     count = 0
     while number:
         count += 1
         number &= number - 1
     return count

py> t = Timer('count_set_bits(number)', 
...     setup='from __main__ import count_set_bits; number=2**10001-1')
py> min(t.repeat(number=100, repeat=7))
4.141788959503174


That makes the "inelegant" solution using bin() and count() about 600 
times faster than the mathematically clever solution using bitwise 
operations.

On the other hand, I'm guessing that PyPy would speed up MRAB's version 
significantly.



-- 
Steven

[toc] | [prev] | [next] | [standalone]


#32140

Fromrusi <rustompmody@gmail.com>
Date2012-10-25 09:17 -0700
Message-ID<41338e78-a3d9-4ab6-ab53-79ded90411c6@vy11g2000pbb.googlegroups.com>
In reply to#32137
On Oct 25, 8:57 pm, Steven D'Aprano <steve
+comp.lang.pyt...@pearwood.info> wrote:
> On Fri, 26 Oct 2012 02:31:53 +1100, Chris Angelico wrote:
> > On Fri, Oct 26, 2012 at 2:25 AM, Christian Heimes <christ...@python.org>
> > wrote:
> >> Simple, easy, faster than a Python loop but not very elegant:
>
> >>    bin(number).count("1")
>
> > Unlikely to be fast.
>
> Oh I don't know about that. Here's some timing results using Python 2.7:
>
> py> from timeit import Timer
> py> t = Timer('bin(number).count("1")', setup='number=2**10001-1')
> py> min(t.repeat(number=10000, repeat=7))
> 0.6819710731506348
>
> Compare to MRAB's suggestion:
>
> def count_set_bits(number):
>      count = 0
>      while number:
>          count += 1
>          number &= number - 1
>      return count
>
> py> t = Timer('count_set_bits(number)',
> ...     setup='from __main__ import count_set_bits; number=2**10001-1')
> py> min(t.repeat(number=100, repeat=7))
> 4.141788959503174
>
> That makes the "inelegant" solution using bin() and count() about 600
> times faster than the mathematically clever solution using bitwise
> operations.

You meant 600% I think?

[toc] | [prev] | [next] | [standalone]


#32143

FromChris Angelico <rosuav@gmail.com>
Date2012-10-26 03:29 +1100
Message-ID<mailman.2852.1351182588.27098.python-list@python.org>
In reply to#32140
On Fri, Oct 26, 2012 at 3:17 AM, rusi <rustompmody@gmail.com> wrote:
> On Oct 25, 8:57 pm, Steven D'Aprano <steve
> +comp.lang.pyt...@pearwood.info> wrote:
>> py> min(t.repeat(number=10000, repeat=7))
>> 0.6819710731506348
>> py> min(t.repeat(number=100, repeat=7))
>> 4.141788959503174
>>
>> That makes the "inelegant" solution using bin() and count() about 600
>> times faster than the mathematically clever solution using bitwise
>> operations.
>
> You meant 600% I think?

It took six times longer to do one hundredth the iterations.

ChrisA

[toc] | [prev] | [next] | [standalone]


#32144

Fromrusi <rustompmody@gmail.com>
Date2012-10-25 09:37 -0700
Message-ID<976a7429-eea9-4309-84bb-389bfd54d710@sh1g2000pbc.googlegroups.com>
In reply to#32143
On Oct 25, 9:30 pm, Chris Angelico <ros...@gmail.com> wrote:
> On Fri, Oct 26, 2012 at 3:17 AM, rusi <rustompm...@gmail.com> wrote:
> > On Oct 25, 8:57 pm, Steven D'Aprano <steve
> > +comp.lang.pyt...@pearwood.info> wrote:
> >> py> min(t.repeat(number=10000, repeat=7))
> >> 0.6819710731506348
> >> py> min(t.repeat(number=100, repeat=7))
> >> 4.141788959503174
>
> >> That makes the "inelegant" solution using bin() and count() about 600
> >> times faster than the mathematically clever solution using bitwise
> >> operations.
>
> > You meant 600% I think?
>
> It took six times longer to do one hundredth the iterations.
>
> ChrisA

Oh! Missed the number=

[toc] | [prev] | [next] | [standalone]


#32146

FromMark Lawrence <breamoreboy@yahoo.co.uk>
Date2012-10-25 17:44 +0100
Message-ID<mailman.2855.1351183324.27098.python-list@python.org>
In reply to#32140
On 25/10/2012 17:29, Chris Angelico wrote:
> On Fri, Oct 26, 2012 at 3:17 AM, rusi <rustompmody@gmail.com> wrote:
>> On Oct 25, 8:57 pm, Steven D'Aprano <steve
>> +comp.lang.pyt...@pearwood.info> wrote:
>>> py> min(t.repeat(number=10000, repeat=7))
>>> 0.6819710731506348
>>> py> min(t.repeat(number=100, repeat=7))
>>> 4.141788959503174
>>>
>>> That makes the "inelegant" solution using bin() and count() about 600
>>> times faster than the mathematically clever solution using bitwise
>>> operations.
>>
>> You meant 600% I think?
>
> It took six times longer to do one hundredth the iterations.
>
> ChrisA
>

Oh no, not another PEP 393 foul up :)

-- 
Cheers.

Mark Lawrence.

[toc] | [prev] | [next] | [standalone]


#32150

FromSteven D'Aprano <steve+comp.lang.python@pearwood.info>
Date2012-10-25 17:16 +0000
Message-ID<508973f8$0$29978$c3e8da3$5496439d@news.astraweb.com>
In reply to#32140
On Thu, 25 Oct 2012 09:17:40 -0700, rusi wrote:

> On Oct 25, 8:57 pm, Steven D'Aprano <steve
> +comp.lang.pyt...@pearwood.info> wrote:
>> On Fri, 26 Oct 2012 02:31:53 +1100, Chris Angelico wrote:
>> > On Fri, Oct 26, 2012 at 2:25 AM, Christian Heimes
>> > <christ...@python.org> wrote:
>> >> Simple, easy, faster than a Python loop but not very elegant:
>>
>> >>    bin(number).count("1")
>>
>> > Unlikely to be fast.
>>
>> Oh I don't know about that. Here's some timing results using Python
>> 2.7:
>>
>> py> from timeit import Timer
>> py> t = Timer('bin(number).count("1")', setup='number=2**10001-1') 
>> py> min(t.repeat(number=10000, repeat=7)) 
>> 0.6819710731506348
>>
>> Compare to MRAB's suggestion:
>>
>> def count_set_bits(number):
>>      count = 0
>>      while number:
>>          count += 1
>>          number &= number - 1
>>      return count
>>
>> py> t = Timer('count_set_bits(number)', 
>> ...     setup='from __main__ import count_set_bits;
>> ...     number=2**10001-1') 
>> py> min(t.repeat(number=100, repeat=7)) 
>> 4.141788959503174
>>
>> That makes the "inelegant" solution using bin() and count() about 600
>> times faster than the mathematically clever solution using bitwise
>> operations.
> 
> You meant 600% I think?

No, I mean a factor of 600 times faster. Look at the number of iterations 
used in each test: 10000 versus 100.

Using bin() and count() takes 0.6819710731506348/10000 = 6.8e-5 seconds 
on my computer; using MRAB's neat trick takes 4.141788959503174/100 = 
0.041 seconds. 0.041/6.8e-5 is slightly over 600.


-- 
Steven

[toc] | [prev] | [next] | [standalone]


#32155

FromSerhiy Storchaka <storchaka@gmail.com>
Date2012-10-25 22:07 +0300
Message-ID<mailman.2861.1351192087.27098.python-list@python.org>
In reply to#32137
Chris Angelico's suggestion:

 >>> bitcount = bytes(bin(i).count("1") for i in range(256))
 >>> t = Timer('sum(number.to_bytes((number.bit_length() + 7) // 8,'
... '"little").translate(bitcount))',
... setup='from __main__ import bitcount; number=2**10001-1')
 >>> min(t.repeat(number=10000, repeat=7))

[toc] | [prev] | [next] | [standalone]


#32156

FromNeil Cerutti <neilc@norwich.edu>
Date2012-10-25 20:00 +0000
Message-ID<aetk2pFsd2rU3@mid.individual.net>
In reply to#32137
On 2012-10-25, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
> On Fri, 26 Oct 2012 02:31:53 +1100, Chris Angelico wrote:
>> On Fri, Oct 26, 2012 at 2:25 AM, Christian Heimes
>> <christian@python.org>
>> wrote:
>>> Simple, easy, faster than a Python loop but not very elegant:
>>>
>>>    bin(number).count("1")
>> 
>> Unlikely to be fast.
>
> Oh I don't know about that.

Yes indeed! Python string operations are fast enough and its
arithmetic slow enough that I no longer assume I can beat a neat
lexicographical solution. Try defeating the following with
arithmetic:

def is_palindrom(n):
   s = str(n)
   return s = s[::-1]

> Here's some timing results using Python 2.7:

Excellent work.

You can of course drop to C for arithmetic and likely triumph
over Python strings. That's never been applicable for me, though.

-- 
Neil Cerutti

[toc] | [prev] | [next] | [standalone]


#32157

FromNeil Cerutti <neilc@norwich.edu>
Date2012-10-25 20:04 +0000
Message-ID<aetka5Fsd2rU4@mid.individual.net>
In reply to#32156
On 2012-10-25, Neil Cerutti <neilc@norwich.edu> wrote:
> Try defeating the following with arithmetic:
>
> def is_palindrom(n):
>    s = str(n)
>    return s = s[::-1]

Sorry for the typos. It should've been:

def is_palindrome(n):
   s = str(n)
   return s == s[::-1]

-- 
Neil Cerutti

[toc] | [prev] | [next] | [standalone]


#32160

FromIan Kelly <ian.g.kelly@gmail.com>
Date2012-10-25 14:20 -0600
Message-ID<mailman.2864.1351196432.27098.python-list@python.org>
In reply to#32156
On Thu, Oct 25, 2012 at 2:00 PM, Neil Cerutti <neilc@norwich.edu> wrote:
> Yes indeed! Python string operations are fast enough and its
> arithmetic slow enough that I no longer assume I can beat a neat
> lexicographical solution. Try defeating the following with
> arithmetic:
>
> def is_palindrom(n):
>    s = str(n)
>    return s = s[::-1]

Problems like these are fundamentally string problems, not math
problems.  The question being asked isn't about some essential
property of the number,  but about its digital representation.
Certainly they can be reasoned about mathematically, but the fact
remains that the math being done is about the properties of strings.

[toc] | [prev] | [next] | [standalone]


#32173

FromSteven D'Aprano <steve+comp.lang.python@pearwood.info>
Date2012-10-25 23:48 +0000
Message-ID<5089cfd2$0$29978$c3e8da3$5496439d@news.astraweb.com>
In reply to#32160
On Thu, 25 Oct 2012 14:20:00 -0600, Ian Kelly wrote:

> On Thu, Oct 25, 2012 at 2:00 PM, Neil Cerutti <neilc@norwich.edu> wrote:
>> Yes indeed! Python string operations are fast enough and its arithmetic
>> slow enough that I no longer assume I can beat a neat lexicographical
>> solution. Try defeating the following with arithmetic:
>>
>> def is_palindrom(n):
>>    s = str(n)
>>    return s = s[::-1]
> 
> Problems like these are fundamentally string problems, not math
> problems.  The question being asked isn't about some essential property
> of the number,  but about its digital representation.

Speaking as somebody who sometimes pretends to be a mathematician, I 
think you are wrong there. The property of being a palindrome may have 
been invented in reference to strings, but it is also a mathematical 
property of a number. Properties of the digits of numbers are properties 
of the relationship between the number and some sequence of divisors (the 
powers of some base), which makes them numeric properties.

It's interesting to consider properties of numbers which hold for *every* 
base. For example, I understand that the digits of pi are uniformly 
distributed regardless of which base you use.

Certainly mathematicians frequently ask questions about the digits of 
numbers, generally in base ten but also in other bases. Since the digits 
of a number are based on the numeric properties of the number, they 
certainly are essential to the number (modulo some base).

For example, apart from two itself[1], every prime number ends in a 1 bit 
in base two. In decimal, every factorial greater than 4! ends with a 
zero. A number is divisible by 9 if the sum of the (decimal) digits 
reduces to 9. A number is divisible by 5 if the last digit is 0 or 5. 
These are not accidents, they depend on the numeric properties.


> Certainly they can
> be reasoned about mathematically, but the fact remains that the math
> being done is about the properties of strings.

Strings of digits, which are numerically equal to the remainders when you 
divide the number by successively larger powers of some base.:

123 = 1*10**2 + 2*10**1 + 3*10**0

Hence, mathematical.

Of course, none of this challenges the fact that, at least in Python, 
reasoning about digits is often best done on the string representation 
rather than the number itself.



[1] Which makes 2 the oddest prime of all.


-- 
Steven

[toc] | [prev] | [next] | [standalone]


#32215

FromNeil Cerutti <neilc@norwich.edu>
Date2012-10-26 12:56 +0000
Message-ID<aevfjqFaa5tU3@mid.individual.net>
In reply to#32160
On 2012-10-25, Ian Kelly <ian.g.kelly@gmail.com> wrote:
> On Thu, Oct 25, 2012 at 2:00 PM, Neil Cerutti
> <neilc@norwich.edu> wrote:
>> Yes indeed! Python string operations are fast enough and its
>> arithmetic slow enough that I no longer assume I can beat a
>> neat lexicographical solution. Try defeating the following
>> with arithmetic:
>>
>> def is_palindrom(n):
>>    s = str(n)
>>    return s = s[::-1]
>
> Problems like these are fundamentally string problems, not math
> problems.  The question being asked isn't about some essential
> property of the number,  but about its digital representation.
> Certainly they can be reasoned about mathematically, but the
> fact remains that the math being done is about the properties
> of strings.

The "unexpected" part, to me, is that an optimal arithmetic based
solution conceptually is more efficient. You need to compute just
half the digits of the number and then perform a contant compare
operation.

-- 
Neil Cerutti

[toc] | [prev] | [next] | [standalone]


#32166

FromCharles Hixson <charleshixsn@earthlink.net>
Date2012-10-25 09:08 -0700
Message-ID<mailman.2867.1351199601.27098.python-list@python.org>
In reply to#32137
On 10/25/2012 08:57 AM, Steven D'Aprano wrote:
> On Fri, 26 Oct 2012 02:31:53 +1100, Chris Angelico wrote:
>
>> On Fri, Oct 26, 2012 at 2:25 AM, Christian Heimes<christian@python.org>
>> wrote:
>>> Simple, easy, faster than a Python loop but not very elegant:
>>>
>>>     bin(number).count("1")
>> Unlikely to be fast.
> Oh I don't know about that. Here's some timing results using Python 2.7:
>
> py>  from timeit import Timer
> py>  t = Timer('bin(number).count("1")', setup='number=2**10001-1')
> py>  min(t.repeat(number=10000, repeat=7))
> 0.6819710731506348
>
> Compare to MRAB's suggestion:
>
> def count_set_bits(number):
>       count = 0
>       while number:
>           count += 1
>           number&= number - 1
>       return count
>
> py>  t = Timer('count_set_bits(number)',
> ...     setup='from __main__ import count_set_bits; number=2**10001-1')
> py>  min(t.repeat(number=100, repeat=7))
> 4.141788959503174
>
>
> That makes the "inelegant" solution using bin() and count() about 600
> times faster than the mathematically clever solution using bitwise
> operations.
>
> On the other hand, I'm guessing that PyPy would speed up MRAB's version
> significantly.
>
>
>
Really nice and good to know.  I had guessed the other way.   (As you 
point out this is compiler dependent, and I'll be using Python3, 
but...conversion from an int to a bit string must be a *lot* faster than 
I had thought.)

-- 
Charles Hixson

[toc] | [prev] | [next] | [standalone]


#32171

FromMark Lawrence <breamoreboy@yahoo.co.uk>
Date2012-10-25 22:51 +0100
Message-ID<mailman.2871.1351201709.27098.python-list@python.org>
In reply to#32137
On 25/10/2012 17:08, Charles Hixson wrote:
> On 10/25/2012 08:57 AM, Steven D'Aprano wrote:
>> On Fri, 26 Oct 2012 02:31:53 +1100, Chris Angelico wrote:
>>
>>> On Fri, Oct 26, 2012 at 2:25 AM, Christian Heimes<christian@python.org>
>>> wrote:
>>>> Simple, easy, faster than a Python loop but not very elegant:
>>>>
>>>>     bin(number).count("1")
>>> Unlikely to be fast.
>> Oh I don't know about that. Here's some timing results using Python 2.7:
>>
>> py>  from timeit import Timer
>> py>  t = Timer('bin(number).count("1")', setup='number=2**10001-1')
>> py>  min(t.repeat(number=10000, repeat=7))
>> 0.6819710731506348
>>
>> Compare to MRAB's suggestion:
>>
>> def count_set_bits(number):
>>       count = 0
>>       while number:
>>           count += 1
>>           number&= number - 1
>>       return count
>>
>> py>  t = Timer('count_set_bits(number)',
>> ...     setup='from __main__ import count_set_bits; number=2**10001-1')
>> py>  min(t.repeat(number=100, repeat=7))
>> 4.141788959503174
>>
>>
>> That makes the "inelegant" solution using bin() and count() about 600
>> times faster than the mathematically clever solution using bitwise
>> operations.
>>
>> On the other hand, I'm guessing that PyPy would speed up MRAB's version
>> significantly.
>>
>>
>>
> Really nice and good to know.  I had guessed the other way.   (As you
> point out this is compiler dependent, and I'll be using Python3,
> but...conversion from an int to a bit string must be a *lot* faster than
> I had thought.)
>

The simple rule for Python performance is never guess anything as you'll 
invariably be wrong, time it and/or profile it, then change your code if 
and only if you have to.

-- 
Cheers.

Mark Lawrence.

[toc] | [prev] | [standalone]


Back to top | Article view | comp.lang.python


csiph-web