Path: csiph.com!newsfeed.hal-mli.net!feeder3.hal-mli.net!newsfeed.hal-mli.net!feeder1.hal-mli.net!newsfeed.xs4all.nl!newsfeed3.news.xs4all.nl!xs4all!news2.euro.net!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail Return-Path: X-Original-To: python-list@python.org Delivered-To: python-list@mail.python.org X-Spam-Status: OK 0.007 X-Spam-Evidence: '*H*': 0.99; '*S*': 0.00; 'algorithm': 0.04; 'explicitly': 0.05; 'indexing': 0.07; 'smallest': 0.07; 'wednesday,': 0.07; 'collapsed': 0.09; 'if,': 0.09; 'item,': 0.09; 'subject:number': 0.09; 'python': 0.11; 'def': 0.12; 'question.': 0.14; 'ah,': 0.16; 'dictionary,': 0.16; 'from:addr:rosuav': 0.16; 'from:name:chris angelico': 0.16; 'heap': 0.16; 'heapq': 0.16; 'inserting': 0.16; 'iterating': 0.16; 'lambda': 0.16; 'simple.': 0.16; 'subject:Prime': 0.16; 'subject:generator': 0.16; 'true:': 0.16; 'wrote:': 0.18; 'bit': 0.19; 'trying': 0.19; 'basically': 0.19; 'thu,': 0.19; 'appears': 0.22; 'example': 0.22; 'import': 0.22; 'instance,': 0.24; "haven't": 0.24; 'first,': 0.26; 'header :In-Reply-To:1': 0.27; 'function': 0.29; 'chris': 0.29; 'on,': 0.29; 'am,': 0.29; 'involving': 0.30; 'originally': 0.30; 'message-id:@mail.gmail.com': 0.30; '(which': 0.31; 'code': 0.31; 'getting': 0.31; 'that.': 0.31; '(since': 0.31; 'factor': 0.31; 'tuples': 0.31; 'allows': 0.31; 'figure': 0.32; 'thanks!': 0.32; 'run': 0.32; 'quite': 0.32; 'worked': 0.33; 'something': 0.35; 'point.': 0.35; 'but': 0.35; 'received:google.com': 0.35; 'there': 0.35; 'version': 0.36; 'really': 0.36; 'module.': 0.36; 'yield': 0.36; 'next': 0.36; "didn't": 0.36; 'should': 0.36; 'example,': 0.37; 'list.': 0.37; 'clear': 0.37; 'performance': 0.37; 'starting': 0.37; 'minimum': 0.38; 'star': 0.38; 'to:addr:python- list': 0.38; 'list,': 0.38; 'does': 0.39; 'sure': 0.39; 'to:addr:python.org': 0.39; 'changed': 0.39; 'skip:u 10': 0.60; 'catch': 0.60; 'course.': 0.60; 'removing': 0.60; 'lower': 0.61; 'new': 0.61; 'first': 0.61; 'july': 0.63; 'skip:n 10': 0.64; 'more': 0.64; 'total': 0.65; 'series': 0.66; 'bottom': 0.67; 'bulk': 0.74; 'jul': 0.74; 'prime': 0.74; 'dict.': 0.84; 'experiment': 0.84; 'moves': 0.84; 'ridiculously': 0.84; 'yielded': 0.84; '1:47': 0.91; 'severe': 0.91; '2013': 0.98 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type:content-transfer-encoding; bh=NYaryTZjcpQsfn/J7Rg46tFpcpDEQpVD124wU9G7okE=; b=EOwABSImrncnXKXi12/D1A6Itk1NYWmlQSSCQZqpjBN//17MJGBpbaYkqS6LXQ0YTO nDdj145JzNTlvIXzy8wxsiVTgCEaMpjhBbym5kgQinLc3kKbru+niINn9bN5pFqkOSRY hjYtbp+vr3c3KKgVCzZy9/UR5esf4k5P2an7Tny2/TICBK86Jmq98VZP1hLYQhZ1Rxc8 57qTIi669OAUvMYu3yIDPtSElvy9NjN3qdfWu20sZcflw/de++adp20V/sWjNrTE/CTO /WVy/5QE+l5jVp4jktAoI/nMRnHkjDSdktXdLQ5nKn0MWXLjje9YGWOXoEd2izu4YJO5 vI0Q== MIME-Version: 1.0 X-Received: by 10.52.93.106 with SMTP id ct10mr16186845vdb.83.1373472906479; Wed, 10 Jul 2013 09:15:06 -0700 (PDT) In-Reply-To: References: <15167633-b6e7-46cc-a043-8dfe8aaad11e@googlegroups.com> Date: Thu, 11 Jul 2013 02:15:06 +1000 Subject: Re: Prime number generator From: Chris Angelico To: python-list@python.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-BeenThere: python-list@python.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: General discussion list for the Python programming language List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Newsgroups: comp.lang.python Message-ID: Lines: 84 NNTP-Posting-Host: 2001:888:2000:d::a6 X-Trace: 1373472909 news.xs4all.nl 15977 [2001:888:2000:d::a6]:48715 X-Complaints-To: abuse@xs4all.nl Xref: csiph.com comp.lang.python:50376 On Thu, Jul 11, 2013 at 1:47 AM, bas wrote: > On Wednesday, July 10, 2013 5:12:19 PM UTC+2, Chris Angelico wrote: >> Well, that does answer the question. Unfortunately the use of lambda >> there has a severe performance cost [ ...] > If you care about speed, you might want to check the heapq module. Removi= ng the smallest item and inserting a new item in a heap both cost O(log(N))= time, while finding the minimum in a dictionary requires iterating over th= e whole dictionary, which cost O(N) time. Ehh, speed isn't the ultimate. I was just trying to avoid something that worked out ridiculously slow (a Python function call IS quite slow). I haven't profiled the code to find out where the bulk of the time is spent, but switching in the lambda-based version doubled total run time, so I didn't like it :) > (untested) > #before loop > from heapq import * > primes =3D [(2,2)] #heap of tuples (multiple, prime). start with 1 item, = so no need for heapify > > #during loop > smallest, prm =3D heappop(primes) > heappush(primes, (smallest+prm, prm)) > > #when new prime found > heappush(primes, (i+i, i)) Ahh, that's the bit I should have thought of! Of course. My original thought experiment had involved basically a really long list, like the classic Sieve but getting longer as time moves on, with composites replaced by None and primes with their next-markers, which I then collapsed to a dict. Always I was thinking in terms of indexing with the prime to get its next composite. Here's the code involving heapq: # -- start -- def primes(): """Generate an infinite series of prime numbers.""" from heapq import heappush,heappop i=3D2 yield 2 prime=3D[(2,2)] # Heap while True: smallest, prm =3D heappop(prime) heappush(prime, (smallest+prm, prm)) while i> > Still trying to figure out your algorithm ... >> It's pretty simple. [...] > I understand what you are trying, but it is not immediately clear to me t= hat it works correctly if for example a smallest factor appears twice in th= e list. I don't have time for it now, but it for sure can be simplified. Th= e while loop, for example, can be replaced by an if, since it will never ex= ecute more than once (since if i is prime > 2, i+1 will be divisible by 2) Ah, good point. Again, I originally was starting from 1, so the while loop was necessary to catch 2 and 3 in the first run. When I changed it to start at 2 and explicitly yield it first, I didn't notice to change that. It works correctly with the smallest multiple appearing twice because it won't yield primes until the smallest value is higher than the current next-prime. So when it's just yielded 11, for instance, both the 2 and 3 slots hold 12; advancing one of those does nothing, advancing the other allows the bottom loop to notice that 13 is now lower than the minimum value (which will then be 14). ChrisA