Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #89565 > unrolled thread

Lucky numbers in Python

Started byCecil Westerhof <Cecil@decebal.nl>
First post2015-04-29 20:24 +0200
Last post2015-04-30 15:28 -0400
Articles 12 — 5 participants

Back to article view | Back to comp.lang.python


Contents

  Lucky numbers in Python Cecil Westerhof <Cecil@decebal.nl> - 2015-04-29 20:24 +0200
    Re: Lucky numbers in Python Ian Kelly <ian.g.kelly@gmail.com> - 2015-04-29 13:57 -0600
      Re: Lucky numbers in Python Cecil Westerhof <Cecil@decebal.nl> - 2015-04-29 23:45 +0200
        Re: Lucky numbers in Python Ian Kelly <ian.g.kelly@gmail.com> - 2015-04-29 16:38 -0600
          Re: Lucky numbers in Python Cecil Westerhof <Cecil@decebal.nl> - 2015-04-30 02:01 +0200
            Re: Lucky numbers in Python Ian Kelly <ian.g.kelly@gmail.com> - 2015-04-29 20:55 -0600
              Re: Lucky numbers in Python Cecil Westerhof <Cecil@decebal.nl> - 2015-04-30 08:34 +0200
        Re: Lucky numbers in Python Chris Kaynor <ckaynor@zindagigames.com> - 2015-04-29 15:56 -0700
      Re: Lucky numbers in Python Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2015-04-30 10:11 +1000
        Re: Lucky numbers in Python Ian Kelly <ian.g.kelly@gmail.com> - 2015-04-29 21:08 -0600
    Re: Lucky numbers in Python Cecil Westerhof <Cecil@decebal.nl> - 2015-04-30 20:55 +0200
      Re: Lucky numbers in Python Dave Angel <davea@davea.name> - 2015-04-30 15:28 -0400

#89565 — Lucky numbers in Python

FromCecil Westerhof <Cecil@decebal.nl>
Date2015-04-29 20:24 +0200
SubjectLucky numbers in Python
Message-ID<87lhhabxod.fsf@Equus.decebal.nl>
I wrote a function lucky_numbers:
    def lucky_numbers(n):
        if n < 3:
            return [1]
        sieve = range(1, n + 1, 2)
        sieve_index = 1
        while True:
            skip_count  = sieve[sieve_index]
            sieve_len   = len(sieve)
            if sieve_len < skip_count:
                break
            for del_index in range((sieve_len // skip_count) * skip_count - 1, 
                                  skip_count - 2, -skip_count):
                del sieve[del_index]
            sieve_index += 1
        return sieve

I was wondering if there is a way to do this:
            for del_index in range((sieve_len // skip_count) * skip_count - 1, 
                                  skip_count - 2, -skip_count):
                del sieve[del_index]
in a more efficient way.

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof

[toc] | [next] | [standalone]


#89572

FromIan Kelly <ian.g.kelly@gmail.com>
Date2015-04-29 13:57 -0600
Message-ID<mailman.95.1430337506.3680.python-list@python.org>
In reply to#89565
On Wed, Apr 29, 2015 at 12:24 PM, Cecil Westerhof <Cecil@decebal.nl> wrote:
> I was wondering if there is a way to do this:
>             for del_index in range((sieve_len // skip_count) * skip_count - 1,
>                                   skip_count - 2, -skip_count):
>                 del sieve[del_index]
> in a more efficient way.

You can delete using slices.

del sieve[(sieve_len // skip_count) * skip_count - 1 : skip_count - 2
: -skip_count]

Now you no longer need to do the iteration in reverse, which makes the
slicing simpler:

del sieve[skip_count - 1 : (sieve_len // skip_count) * skip_count : skip_count]

And although it's not clear to me what this is supposed to be doing,
you probably no longer need the middle term if the intention is to
continue deleting all the way to the end of the list (if it is then I
think you have a bug in the existing implementation, since the last
item in the list can never be deleted).

del sieve[skip_count - 1 :: skip_count]

[toc] | [prev] | [next] | [standalone]


#89582

FromCecil Westerhof <Cecil@decebal.nl>
Date2015-04-29 23:45 +0200
Message-ID<87d22mbod7.fsf@Equus.decebal.nl>
In reply to#89572
Op Wednesday 29 Apr 2015 21:57 CEST schreef Ian Kelly:

> On Wed, Apr 29, 2015 at 12:24 PM, Cecil Westerhof <Cecil@decebal.nl> wrote:
>> I was wondering if there is a way to do this:
>> for del_index in range((sieve_len // skip_count) * skip_count - 1,
>> skip_count - 2, -skip_count):
>> del sieve[del_index]
>> in a more efficient way.
>
> You can delete using slices.
>
> del sieve[(sieve_len // skip_count) * skip_count - 1 : skip_count -
> 2 : -skip_count]
>
> Now you no longer need to do the iteration in reverse, which makes
> the slicing simpler:
>
> del sieve[skip_count - 1 : (sieve_len // skip_count) * skip_count :
> skip_count]

I expected that it could be done more efficiently, but did not expect
such a big difference: more as hundred times. The old situation took
20 seconds for 1000000. The new takes 0.17.

The code is know (I added a missing check):
    def lucky_numbers(n):
        if n < 3:
            return [1]
        sieve = range(1, n + 1, 2)
        sieve_index = 1
        while True:
            sieve_len   = len(sieve)
            if (sieve_index + 1) > sieve_len:
                break
            skip_count  = sieve[sieve_index]
            if sieve_len < skip_count:
                break
            del sieve[skip_count - 1 : (sieve_len // skip_count) * skip_count : skip_count]
            sieve_index += 1
        return sieve


> And although it's not clear to me what this is supposed to be doing,
> you probably no longer need the middle term if the intention is to
> continue deleting all the way to the end of the list (if it is then
> I think you have a bug in the existing implementation, since the
> last item in the list can never be deleted).

What do you mean by this? Executing:
    lucky_numbers(5)
gives:
    [1, 3]

So the last element (5) is deleted.

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof

[toc] | [prev] | [next] | [standalone]


#89584

FromIan Kelly <ian.g.kelly@gmail.com>
Date2015-04-29 16:38 -0600
Message-ID<mailman.103.1430347166.3680.python-list@python.org>
In reply to#89582
On Wed, Apr 29, 2015 at 3:45 PM, Cecil Westerhof <Cecil@decebal.nl> wrote:
> Op Wednesday 29 Apr 2015 21:57 CEST schreef Ian Kelly:
>> And although it's not clear to me what this is supposed to be doing,
>> you probably no longer need the middle term if the intention is to
>> continue deleting all the way to the end of the list (if it is then
>> I think you have a bug in the existing implementation, since the
>> last item in the list can never be deleted).
>
> What do you mean by this? Executing:
>     lucky_numbers(5)
> gives:
>     [1, 3]
>
> So the last element (5) is deleted.

Off by one error on my part. This is why negative skip values on
ranges and slices are not recommended: they're confusing. :-)

In that case you can definitely omit the middle term of the slice,
which will be both more concise and clearer in intent, though probably
not significantly faster.

[toc] | [prev] | [next] | [standalone]


#89590

FromCecil Westerhof <Cecil@decebal.nl>
Date2015-04-30 02:01 +0200
Message-ID<871tj2bi2p.fsf@Equus.decebal.nl>
In reply to#89584
Op Thursday 30 Apr 2015 00:38 CEST schreef Ian Kelly:

> On Wed, Apr 29, 2015 at 3:45 PM, Cecil Westerhof <Cecil@decebal.nl> wrote:
>> Op Wednesday 29 Apr 2015 21:57 CEST schreef Ian Kelly:
>>> And although it's not clear to me what this is supposed to be
>>> doing, you probably no longer need the middle term if the
>>> intention is to continue deleting all the way to the end of the
>>> list (if it is then I think you have a bug in the existing
>>> implementation, since the last item in the list can never be
>>> deleted).
>>
>> What do you mean by this? Executing:
>> lucky_numbers(5)
>> gives:
>> [1, 3]
>>
>> So the last element (5) is deleted.
>
> Off by one error on my part. This is why negative skip values on
> ranges and slices are not recommended: they're confusing. :-)
>
> In that case you can definitely omit the middle term of the slice,
> which will be both more concise and clearer in intent, though
> probably not significantly faster.

It is certainly nit faster. It is even significantly slower. With the
middle term lucky_numbers(int(1e6)) takes 0.13 seconds. Without it
takes 14.3 seconds. Hundred times as long.

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof

[toc] | [prev] | [next] | [standalone]


#89595

FromIan Kelly <ian.g.kelly@gmail.com>
Date2015-04-29 20:55 -0600
Message-ID<mailman.110.1430362572.3680.python-list@python.org>
In reply to#89590
On Wed, Apr 29, 2015 at 6:01 PM, Cecil Westerhof <Cecil@decebal.nl> wrote:
> Op Thursday 30 Apr 2015 00:38 CEST schreef Ian Kelly:
>> In that case you can definitely omit the middle term of the slice,
>> which will be both more concise and clearer in intent, though
>> probably not significantly faster.
>
> It is certainly nit faster. It is even significantly slower. With the
> middle term lucky_numbers(int(1e6)) takes 0.13 seconds. Without it
> takes 14.3 seconds. Hundred times as long.

That would be rather surprising, since it's the same operation being
performed, so I did my own timing and came up with 0.25 seconds (best
of 3) with the middle term and 0.22 seconds without.

I suspect that you tested it as "del sieve[skip_count - 1 :
skip_count]" (which would delete only one item) rather than "del
sieve[skip_count - 1 :: skip_count]".

[toc] | [prev] | [next] | [standalone]


#89607

FromCecil Westerhof <Cecil@decebal.nl>
Date2015-04-30 08:34 +0200
Message-ID<87r3r29lal.fsf@Equus.decebal.nl>
In reply to#89595
Op Thursday 30 Apr 2015 04:55 CEST schreef Ian Kelly:

> On Wed, Apr 29, 2015 at 6:01 PM, Cecil Westerhof <Cecil@decebal.nl> wrote:
>> Op Thursday 30 Apr 2015 00:38 CEST schreef Ian Kelly:
>>> In that case you can definitely omit the middle term of the slice,
>>> which will be both more concise and clearer in intent, though
>>> probably not significantly faster.
>>
>> It is certainly nit faster. It is even significantly slower. With
>> the middle term lucky_numbers(int(1e6)) takes 0.13 seconds. Without
>> it takes 14.3 seconds. Hundred times as long.
>
> That would be rather surprising, since it's the same operation being
> performed, so I did my own timing and came up with 0.25 seconds
> (best of 3) with the middle term and 0.22 seconds without.
>
> I suspect that you tested it as "del sieve[skip_count - 1 :
> skip_count]" (which would delete only one item) rather than "del
> sieve[skip_count - 1 :: skip_count]".

Yeah, that is how I interpreted omitting the middle term. But it
seemed to give the right results.

With your amendment it runs slightly faster. With 1E7 it is 8.0 and
7.7. So almost 4% faster.

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof

[toc] | [prev] | [next] | [standalone]


#89585

FromChris Kaynor <ckaynor@zindagigames.com>
Date2015-04-29 15:56 -0700
Message-ID<mailman.104.1430348212.3680.python-list@python.org>
In reply to#89582

[Multipart message — attachments visible in raw view] — view raw

On Wed, Apr 29, 2015 at 2:45 PM, Cecil Westerhof <Cecil@decebal.nl> wrote:

> >> I was wondering if there is a way to do this:
> >> for del_index in range((sieve_len // skip_count) * skip_count - 1,
> >> skip_count - 2, -skip_count):
> >> del sieve[del_index]
> >> in a more efficient way.
> >
> > You can delete using slices.
> >
> > del sieve[(sieve_len // skip_count) * skip_count - 1 : skip_count -
> > 2 : -skip_count]
> >
> > Now you no longer need to do the iteration in reverse, which makes
> > the slicing simpler:
> >
> > del sieve[skip_count - 1 : (sieve_len // skip_count) * skip_count :
> > skip_count]
>
> I expected that it could be done more efficiently, but did not expect
> such a big difference: more as hundred times. The old situation took
> 20 seconds for 1000000. The new takes 0.17.


Its not too surprising, as deleting the non-end element of a list is a O(n)
operation - it must copy all elements in the list into a new list each
time. This means that your algorithm is roughly O(n*n*log(n)) performance -
n for each list delete, which is wrapped in a for loop of n iterations,
which is wrapped in a while loop which will run log(n) times (I think that
while loop will run log(n) times, but have not actually tested the math).

Deleting a slice should take n time as well, however it is now done only
once rather than once per item to be removed, which should reduce the
overall algorithm to O(n*log(n)) time - aka, a HUGE difference with any
moderate to large input.

Chris

[toc] | [prev] | [next] | [standalone]


#89588

FromSteven D'Aprano <steve+comp.lang.python@pearwood.info>
Date2015-04-30 10:11 +1000
Message-ID<5541734a$0$12995$c3e8da3$5496439d@news.astraweb.com>
In reply to#89572
On Thu, 30 Apr 2015 05:57 am, Ian Kelly wrote:

> On Wed, Apr 29, 2015 at 12:24 PM, Cecil Westerhof <Cecil@decebal.nl>
> wrote:
>> I was wondering if there is a way to do this:
>>             for del_index in range((sieve_len // skip_count) * skip_count
>>             - 1,
>>                                   skip_count - 2, -skip_count):
>>                 del sieve[del_index]
>> in a more efficient way.
> 
> You can delete using slices.
> 
> del sieve[(sieve_len // skip_count) * skip_count - 1 : skip_count - 2
> : -skip_count]
> 
> Now you no longer need to do the iteration in reverse, which makes the
> slicing simpler:
> 
> del sieve[skip_count - 1 : (sieve_len // skip_count) * skip_count :
> skip_count]

True, but *probably* at the expense of speed. When you delete items from a
list, the remaining items have to be moved, which takes time, especially
for very large lists.

Most of the time, rather than deleting items, it is faster to set them to a
placeholder (for example None) and then copy the ones which aren't None in
a separate loop:


def erat(n):
    """Return a list of primes up to and including n.

    This is a fixed-size version of the Sieve of Eratosthenes, using an
    adaptation of the traditional algorithm.

    >>> erat(30)
    [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

    """
    if n < 2:
        return []
    # Generate a fixed array of integers.
    arr = list(range(n+1))  # A list is faster than an array.
    # Cross out 0 and 1 since they aren't prime.
    arr[0] = arr[1] = None
    i = 2
    while i*i <= n:
        # Cross out all the multiples of i starting from i**2.
        for p in range(i*i, n+1, i):
            arr[p] = None
        # Advance to the next number not crossed off.
        i += 1
        while i <= n and arr[i] is None:
            i += 1
    # Copy the items which aren't None.
    return list(filter(None, arr))


The above is written for Python 3. In Python 2, you can safely drop the two
calls to list() since both range and filter already return lists.

Another alternative is to use slice assignment of the slices that you don't
delete, rather than deleting directly:


del alist[5:55]

is similar to:

alist[:] = alist[:4] + alist[55:]  # Note the [:] on the left.


Which is faster will probably depend on the specifics of the list.

But of course for small lists, none of this matters and you should just use
whatever is easiest and most natural to read.




-- 
Steven

[toc] | [prev] | [next] | [standalone]


#89596

FromIan Kelly <ian.g.kelly@gmail.com>
Date2015-04-29 21:08 -0600
Message-ID<mailman.111.1430363360.3680.python-list@python.org>
In reply to#89588
On Wed, Apr 29, 2015 at 6:11 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> On Thu, 30 Apr 2015 05:57 am, Ian Kelly wrote:
>
>> On Wed, Apr 29, 2015 at 12:24 PM, Cecil Westerhof <Cecil@decebal.nl>
>> wrote:
>>> I was wondering if there is a way to do this:
>>>             for del_index in range((sieve_len // skip_count) * skip_count
>>>             - 1,
>>>                                   skip_count - 2, -skip_count):
>>>                 del sieve[del_index]
>>> in a more efficient way.
>>
>> You can delete using slices.
>>
>> del sieve[(sieve_len // skip_count) * skip_count - 1 : skip_count - 2
>> : -skip_count]
>>
>> Now you no longer need to do the iteration in reverse, which makes the
>> slicing simpler:
>>
>> del sieve[skip_count - 1 : (sieve_len // skip_count) * skip_count :
>> skip_count]
>
> True, but *probably* at the expense of speed. When you delete items from a
> list, the remaining items have to be moved, which takes time, especially
> for very large lists.
>
> Most of the time, rather than deleting items, it is faster to set them to a
> placeholder (for example None) and then copy the ones which aren't None in
> a separate loop:

You're correct, but I think this would be difficult to apply to the
OP's algorithm since the list indexing depends on the items from
previous iterations having been removed.

[toc] | [prev] | [next] | [standalone]


#89660

FromCecil Westerhof <Cecil@decebal.nl>
Date2015-04-30 20:55 +0200
Message-ID<87fv7htpip.fsf@Equus.decebal.nl>
In reply to#89565
Because I want the code to work with Python 3 also, the code is now:
    def lucky_numbers(n):
        """
        Lucky numbers from 1 up-to n
        http://en.wikipedia.org/wiki/Lucky_number
        """

        if n < 3:
            return [1]
        sieve = list(range(1, n + 1, 2))
        sieve_index = 1
        while True:
            sieve_len   = len(sieve)
            if (sieve_index + 1) > sieve_len:
                break
            skip_count  = sieve[sieve_index]
            if sieve_len < skip_count:
                break
            del sieve[skip_count - 1 : : skip_count]
            sieve_index += 1
        return sieve

It looks like the list in:
        sieve = list(range(1, n + 1, 2))

does not have much influence in Python 2. So I was thinking of leaving
the code like it is. Or is it better to check and do the list only
with Python 3?

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof

[toc] | [prev] | [next] | [standalone]


#89662

FromDave Angel <davea@davea.name>
Date2015-04-30 15:28 -0400
Message-ID<mailman.142.1430422160.3680.python-list@python.org>
In reply to#89660
On 04/30/2015 02:55 PM, Cecil Westerhof wrote:
> Because I want the code to work with Python 3 also, the code is now:
>      def lucky_numbers(n):
>          """
>          Lucky numbers from 1 up-to n
>          http://en.wikipedia.org/wiki/Lucky_number
>          """
>
>          if n < 3:
>              return [1]
>          sieve = list(range(1, n + 1, 2))
>          sieve_index = 1
>          while True:
>              sieve_len   = len(sieve)
>              if (sieve_index + 1) > sieve_len:
>                  break
>              skip_count  = sieve[sieve_index]
>              if sieve_len < skip_count:
>                  break
>              del sieve[skip_count - 1 : : skip_count]
>              sieve_index += 1
>          return sieve
>
> It looks like the list in:
>          sieve = list(range(1, n + 1, 2))
>
> does not have much influence in Python 2. So I was thinking of leaving
> the code like it is. Or is it better to check and do the list only
> with Python 3?
>

I'd do something like this at top of the module:

try:
     range = xrange
except NameError as ex:
     pass

then use range as it is defined in Python3.

if that makes you nervous, then define irange = xrange, and if it gets a 
NameError exception, irange = range


-- 
DaveA

[toc] | [prev] | [standalone]


Back to top | Article view | comp.lang.python


csiph-web