Path: csiph.com!news.swapon.de!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Tim Rentsch
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Tue, 18 Jun 2024 17:34:39 -0700
Organization: A noiseless patient Spider
Lines: 35
Message-ID: <867celixcw.fsf@linuxsc.com>
References: <86r0duwqgg.fsf@linuxsc.com> <86o78mpnlf.fsf@linuxsc.com> <86ttibod7n.fsf@linuxsc.com> <86h6eaoi2r.fsf@linuxsc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Date: Wed, 19 Jun 2024 02:34:40 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="fa54ab77d267afe6c3d864a61670dc2b"; logging-data="1657039"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19F8B52WiI5MqKDGitXL79viizMNcrReqA="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:SZdLfXDx9g+zR3+An6vVdyVYHS8= sha1:W/uhfLk0zN4J/P6uhSvvHc8t0iQ=
Xref: csiph.com comp.lang.c++:119474
Vir Campestris writes:
> OK, I'm getting some weird results.
>
> Firstly, whatever I do my original algorithm storing odds only is
> faster than using mod30. Quite a bit faster. I think this is because
> mod30 requires it to access each byte in the map individually, rather
> than eight at a time.
>
> How much is when I got weird.
>
> Going through the large primes I've got a cache size.
>
> I go through each of the primes, adding a modulo and a word index to
> get the bit that needs to be set next. When the modulo overflows the
> word index gets incremented too.
>
> I do that for a block of memory, then repeat for the next prime, and
> so on until I've done all the primes. I then go onto the next block of
> memory - which is tuned to fit in the cache.
>
> Rather than do some arithmetic I though I'll just run through a loop
> with different sizes of that memory block, and pick the best
> one. That's a bit under a megabyte.
>
> I then remove the loop, instead having a constexpr for the size.
>
> The code now runs a third slower.
>
> I can have the loop go around exactly once, and it is still way faster
> than the hard coded constant!
Would you mind if I asked you to post the code so I can
take a look at it? Or if you would rather email it to
me that also works.