Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.c++ > #118058 > unrolled thread
| Started by | Bonita Montero <Bonita.Montero@gmail.com> |
|---|---|
| First post | 2023-12-10 10:46 +0100 |
| Last post | 2024-02-14 15:57 +0100 |
| Articles | 20 on this page of 213 — 14 participants |
Back to article view | Back to comp.lang.c++
Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-10 10:46 +0100
Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2023-12-10 21:48 +0000
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-11 04:15 +0100
Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2023-12-11 17:12 +0000
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-11 18:19 +0100
Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2023-12-13 15:16 +0000
Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2023-12-13 15:25 +0000
Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2023-12-14 15:06 +0000
Re: Sieve of Erastosthenes optimized to the max red floyd <no.spam.here@its.invalid> - 2023-12-14 08:20 -0800
Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-12-23 10:30 -0800
Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2023-12-23 21:20 +0000
Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-12-24 00:36 -0800
Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2023-12-29 18:03 +0000
Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-01-13 21:31 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-20 13:44 +0100
Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2023-12-21 15:30 +0000
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-21 17:07 +0100
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-21 17:13 +0100
Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-12-23 10:21 -0800
Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2023-12-23 21:21 +0000
Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-12-24 10:49 -0800
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-21 14:23 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-22 04:28 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-21 20:02 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-22 17:55 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-23 12:52 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-24 11:03 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-24 13:24 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-26 06:00 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-25 21:39 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-26 10:27 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-26 12:24 -0800
Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2023-12-26 23:35 +0000
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-26 15:37 -0800
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-26 21:59 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-27 10:23 +0100
Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2023-12-27 20:49 +0000
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-28 12:00 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-28 15:38 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-29 04:17 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-28 20:58 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-29 10:58 +0100
Re: Sieve of Erastosthenes optimized to the max David Brown <david.brown@hesbynett.no> - 2023-12-29 13:56 +0100
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-29 17:04 +0100
Re: Sieve of Erastosthenes optimized to the max David Brown <david.brown@hesbynett.no> - 2023-12-30 19:27 +0100
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-31 11:22 +0100
Re: Sieve of Erastosthenes optimized to the max scott@slp53.sl.home (Scott Lurndal) - 2023-12-31 18:49 +0000
Re: Sieve of Erastosthenes optimized to the max David Brown <david.brown@hesbynett.no> - 2024-01-01 12:46 +0100
Re: Sieve of Erastosthenes optimized to the max scott@slp53.sl.home (Scott Lurndal) - 2023-12-29 16:01 +0000
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-29 17:06 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-29 13:45 -0800
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-29 14:09 -0800
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-29 14:12 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-30 05:42 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-29 20:45 -0800
Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2023-12-30 04:56 +0000
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-30 06:09 +0100
Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2023-12-30 05:51 +0000
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-30 10:15 +0100
Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2023-12-30 20:35 +0000
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-31 06:54 +0100
Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2023-12-31 07:01 +0000
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-31 11:20 +0100
Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2023-12-31 17:30 +0000
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-01 06:21 +0100
Re: Sieve of Erastosthenes optimized to the max scott@slp53.sl.home (Scott Lurndal) - 2023-12-31 18:44 +0000
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-01 06:22 +0100
Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2024-01-01 08:28 +0000
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-01 14:11 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-01-01 15:36 -0800
Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2023-12-30 04:51 +0000
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-30 12:00 -0800
Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2023-12-29 17:29 +0000
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-30 05:45 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-30 11:58 -0800
Re: Sieve of Erastosthenes optimized to the max red floyd <no.spam.here@its.invalid> - 2023-12-30 14:58 -0800
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-31 11:49 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-31 06:51 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-31 11:36 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-01 07:28 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-31 22:53 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-01 14:11 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-01-01 15:34 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-02 11:55 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-01-02 10:38 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-03 06:48 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-01-03 13:32 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-04 04:37 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-01-05 19:21 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-06 08:18 +0100
Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2024-01-06 08:31 +0000
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-06 10:30 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-01-06 13:15 -0800
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-01-06 13:19 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-07 10:14 +0100
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-07 10:10 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-01-07 12:46 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-08 06:48 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-01-08 12:18 -0800
Re: Sieve of Erastosthenes optimized to the max red floyd <no.spam.here@its.invalid> - 2024-01-08 17:14 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-09 07:19 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-01-09 23:33 -0800
Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2024-01-09 02:02 +0000
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-09 15:12 +0100
OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Vir Campestris <vir.campestris@invalid.invalid> - 2024-01-29 21:31 +0000
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-02-16 08:06 -0800
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-02-16 18:30 +0100
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-02-23 05:51 -0800
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-02-24 10:45 +0100
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-02-25 00:48 -0800
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-02-25 15:51 +0100
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-03-11 10:10 -0700
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-03-12 10:15 +0100
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) wij <wyniijj5@gmail.com> - 2024-03-14 12:44 +0800
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-03-14 07:25 +0100
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) wij <wyniijj5@gmail.com> - 2024-03-14 17:20 +0800
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) wij <wyniijj5@gmail.com> - 2024-03-14 17:35 +0800
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) wij <wyniijj5@gmail.com> - 2024-03-14 17:41 +0800
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-03-14 19:20 +0100
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) wij <wyniijj5@gmail.com> - 2024-03-15 16:30 +0800
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-03-15 11:21 +0100
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) wij <wyniijj5@gmail.com> - 2024-03-15 19:07 +0800
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-03-15 12:56 +0100
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-03-14 19:20 +0100
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-04-20 08:35 -0700
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-04-20 18:34 +0200
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-04-20 18:35 +0200
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-04-24 12:28 -0700
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-04-25 06:19 +0200
Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-04-25 14:14 -0700
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-31 11:39 -0800
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-29 13:52 -0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-27 06:06 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-03-22 19:34 -0700
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-03-23 17:54 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-03-23 14:04 -0700
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-03-24 07:30 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-03-24 12:52 -0700
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-03-24 21:00 +0100
Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-03-24 13:05 -0700
Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-05-16 17:28 +0100
Re: Sieve of Erastosthenes optimized to the max Ben Bacarisse <ben@bsb.me.uk> - 2024-05-16 21:40 +0100
Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-05-21 19:06 -0700
Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-05-30 12:32 +0100
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-05-30 14:17 +0200
Re: Sieve of Erastosthenes optimized to the max Paavo Helde <eesnimi@osa.pri.ee> - 2024-05-30 19:55 +0300
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-05-31 10:17 +0200
Re: Sieve of Erastosthenes optimized to the max Paavo Helde <eesnimi@osa.pri.ee> - 2024-05-31 20:52 +0300
Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-05-30 22:17 -0700
Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-06-01 21:07 +0100
Re: Sieve of Erastosthenes optimized to the max Richard Damon <richard@damon-family.org> - 2024-06-01 20:43 -0400
Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-02 03:23 -0700
Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-02 19:50 -0700
Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-06-18 20:56 +0100
Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-18 17:34 -0700
Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-06-30 21:47 +0100
Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-07-01 23:20 -0700
Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-07-02 21:24 +0100
Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-07-03 11:25 +0100
Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-07-15 06:15 -0700
Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-07-20 07:41 -0700
OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-07-25 12:46 +0100
Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-10 07:07 -0700
Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-08-12 15:32 +0100
Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-16 07:48 -0700
Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-08-15 17:52 +0100
Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-16 08:40 -0700
Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-16 19:35 +0200
Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-16 19:55 +0200
Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-08-19 21:23 +0100
Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-20 17:21 +0200
Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-20 17:24 +0200
Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-20 17:43 +0200
Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-08-20 17:55 +0100
Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-20 18:59 +0200
Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-26 12:08 -0700
Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-27 06:09 +0200
Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-09-01 21:23 +0100
Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-09-01 20:40 -0700
Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-09-02 07:08 +0200
Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-09-03 17:45 +0100
Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-09-28 03:46 -0700
Re: OT: Re: Sieve of Erastosthenes optimized to the max Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-09-28 13:49 -0700
Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-10-02 11:44 +0100
Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-10-02 13:10 +0200
Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-10-07 08:41 -0700
Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-10-20 12:44 +0100
Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-11-04 03:56 -0800
Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-08-19 21:34 +0100
Re: OT: Re: Sieve of Erastosthenes optimized to the max red floyd <no.spam.here@its.invalid> - 2024-08-19 21:08 -0700
Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-08-20 21:14 +0100
Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-26 09:35 -0700
Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-26 08:31 -0700
Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-20 19:20 +0200
Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-20 19:36 +0200
Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-20 19:39 +0200
Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-20 20:13 +0200
Re: OT: Re: Sieve of Erastosthenes optimized to the max scott@slp53.sl.home (Scott Lurndal) - 2024-08-20 20:50 +0000
Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-08-22 17:30 +0100
Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-22 18:38 +0200
Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-08-22 21:47 +0100
Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-08-22 21:56 +0100
Re: OT: Re: Sieve of Erastosthenes optimized to the max red floyd <no.spam.here@its.invalid> - 2024-08-22 17:00 -0700
Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-26 10:59 -0700
Re: OT: Re: Sieve of Erastosthenes optimized to the max Andrey Tarasevich <andreytarasevich@hotmail.com> - 2024-09-28 15:21 -0700
Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-26 12:47 -0700
Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-18 19:52 -0700
Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-10 17:24 -0700
Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-11 00:00 -0700
Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-07-23 07:34 -0700
Re: Sieve of Erastosthenes optimized to the max wij <wyniijj5@gmail.com> - 2024-02-14 00:15 +0800
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-02-13 19:08 +0100
Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-02-14 15:57 +0100
Page 8 of 11 — ← Prev page 1 … 6 7 [8] 9 10 11 Next page →
| From | Vir Campestris <vir.campestris@invalid.invalid> |
|---|---|
| Date | 2024-05-16 17:28 +0100 |
| Message-ID | <v25c87$1ld9m$1@dont-email.me> |
| In reply to | #118545 |
I've been playing with this. Has it really been this long? I ought to have more time than this... I put together some code, and have been playing with it far too much. Bonita, I don't know whether to curse you or kiss you. (Though in fact I suspect you may be a little young for me. Possibly too young for my kids...) I haven't written any code just for fun in years. Maybe decades! I missed Tim's Mod 30 trick, and that might well help. But I think I'm bored with this. It would save a lot of memory, but the extra computation might make it slower. Maybe I'll try one day ;) My code isn't as fast as Bonita's code. Even allowing that I didn't put any threads in there. But... When I compared the output there were differences. Specifically your program claimed that these numbers 66049,67591,69133,69647,71189,72217,72731,75301,78899,79927,80441,81469,85067,86609,89179, 89693,90721,92263,94319,95861,97403,98431,99973,102029,103057,105113,107683,108197,110767, 111281,112823,113851,115393,117449,118477,118991,120019,123103,125159,126187,128243,129271, 130813,133897,134411,139037,140579,143149,144691,146233,146747,148289,150859,152401,153943, 154457,155999,157541,158569,159083,162167,164737,165251,166279,167821,169363,169877,172961, 173989,175531,177587,180157,182213,184783,186839,188381,189923,190951,193007,194549,195577, 197633,198661,202259,204829,207913,208427,210997,211511,212539,213053,215623,219221,220249, 220763,221791,225389,226417,226931,227959,233099,234127,236183,238753,240809,241837,243379, 244921,248519,249547,251089,252631,254687,256229,259313,260341,261883,262397,264967,265481, 267023,269593,270107,272677,273191,274733,279359,280387,280901,281929,283471,285013,287069, 288611,290153,295807,296321,298891,300947,303517,305059,306601,308657,311741,312769,314311, 315853 are all prime. They aren't. The first one is 257 squared, and the others all have factors too. I'd also suggest that you comment demonstration code. It's intended to show off fancy techniques, and they aren't obvious without explanation. I learned that lesson when I was a student, when I couldn't understand something I'd written myself! Andy
[toc] | [prev] | [next] | [standalone]
| From | Ben Bacarisse <ben@bsb.me.uk> |
|---|---|
| Date | 2024-05-16 21:40 +0100 |
| Message-ID | <87ttixij8d.fsf@bsb.me.uk> |
| In reply to | #119138 |
Vir Campestris <vir.campestris@invalid.invalid> writes: ... > When I compared the output there were differences. Specifically your > program claimed that these numbers > > 66049,67591,69133,69647,71189,72217,72731,75301,78899,79927,80441,81469,85067,86609,89179, > 89693,90721,92263,94319,95861,97403,98431,99973,102029,103057,105113,107683,108197,110767, > 111281,112823,113851,115393,117449,118477,118991,120019,123103,125159,126187,128243,129271, > 130813,133897,134411,139037,140579,143149,144691,146233,146747,148289,150859,152401,153943, > 154457,155999,157541,158569,159083,162167,164737,165251,166279,167821,169363,169877,172961, > 173989,175531,177587,180157,182213,184783,186839,188381,189923,190951,193007,194549,195577, > 197633,198661,202259,204829,207913,208427,210997,211511,212539,213053,215623,219221,220249, > 220763,221791,225389,226417,226931,227959,233099,234127,236183,238753,240809,241837,243379, > 244921,248519,249547,251089,252631,254687,256229,259313,260341,261883,262397,264967,265481, > 267023,269593,270107,272677,273191,274733,279359,280387,280901,281929,283471,285013,287069, > 288611,290153,295807,296321,298891,300947,303517,305059,306601,308657,311741,312769,314311, > 315853 > > are all prime. They aren't. The first one is 257 squared, and the others > all have factors too. In fact they are /all/ 257 times some prime. That must be a big clue as to where the bug is... -- Ben.
[toc] | [prev] | [next] | [standalone]
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2024-05-21 19:06 -0700 |
| Message-ID | <86r0duwqgg.fsf@linuxsc.com> |
| In reply to | #119138 |
Vir Campestris <vir.campestris@invalid.invalid> writes: > I've been playing with this. Has it really been this long? I > ought to have more time than this... [...] > > I missed Tim's Mod 30 trick, and that might well help. But I > think I'm bored with this. It would save a lot of memory, but > the extra computation might make it slower. Two comments. Using the mod 30 encoding can be speed competitive with simpler encodings. The second comment is, it isn't just that less memory is used, but that more primes can be found. Bonita brags about how fast some code is, but it's an apples and oranges comparison - that code doesn't do the job because it runs into a memory limit way sooner than using a mod 30 encoding, and trying to run past that limit would make the code R E A L L Y S L O W. > Maybe I'll try one day ;) I hope you do. If you have some difficulty seeing how to avoid the seeming speed penalties, feel free to ask, I am happy to help.
[toc] | [prev] | [next] | [standalone]
| From | Vir Campestris <vir.campestris@invalid.invalid> |
|---|---|
| Date | 2024-05-30 12:32 +0100 |
| Message-ID | <v39o3l$1lvju$1@dont-email.me> |
| In reply to | #119267 |
On 22/05/2024 03:06, Tim Rentsch wrote: > Vir Campestris <vir.campestris@invalid.invalid> writes: > >> I've been playing with this. Has it really been this long? I >> ought to have more time than this... [...] >> >> I missed Tim's Mod 30 trick, and that might well help. But I >> think I'm bored with this. It would save a lot of memory, but >> the extra computation might make it slower. > > Two comments. Using the mod 30 encoding can be speed competitive > with simpler encodings. The second comment is, it isn't just > that less memory is used, but that more primes can be found. > Bonita brags about how fast some code is, but it's an apples > and oranges comparison - that code doesn't do the job because > it runs into a memory limit way sooner than using a mod 30 > encoding, and trying to run past that limit would make the > code R E A L L Y S L O W. > >> Maybe I'll try one day ;) > > I hope you do. If you have some difficulty seeing how to > avoid the seeming speed penalties, feel free to ask, I am > happy to help. I think I've come to the end of what my code will do. I'm not going to post it up (unless you really want it), but I'll put in some comments. Firstly the C++ bit: std::vector<bool> is much too slow to be any use. I rolled my own. I played with std::vector<uint8_t>, uint32_t and uint64_t. The last was fastest. But... if you allocate a std::vector it insists on initialising all the elements. That takes a while. I then played with reserve and emplace_back - and that was slow too. So I've rolled my own store. Allocating a zero filled buffer of enormous size with calloc is surprisingly fast. I have a feeling (which I can't prove) that under the hood Linux is allocating one zero filled page to me, and setting up the page tables for the rest of the buffer as unwriteable. It then copies the zero buffer whenever a write fault happens. Of course I can when needed use malloc, not calloc - sometimes I don't care what is in the buffer. That's the end of the on-topic part. I don't store even numbers. That's an optimisation, and is of course equivalent to Tim's Mod 30 trick, but with mod 2 instead. And it doesn't require an divide which may be slow. When I mask out just 3 I get a pattern like this: 2492492492492490 There is a single nybble at the end of that which is zero, but the rest is a 3-nybble repeat. You see something similar with 5, 4210842108421080 except this time the repeat is 5 nybbles. It doesn't matter what your word size is - nybble, byte, uint64_t - the repeat is always the prime. It also incidentally doesn't matter if you do store evens, the repeat is still the size of the prime. Now 3 is the most expensive prime to write into the output mask. I have to write every third bit, which is a lot of operations. 5 is nearly as bad, and as the prime gets bigger the cost falls. But... I don't have to write all those bits. I can just copy the repeating part of the mask. With uint64_t the mask for 3 is 2492492492492490,[9249249249249249,4924924924924924,2492492492492492] I don't copy that first word, only the three repeats. for 5 it's 4210842108421080,[8421084210842108,0842108421084210,1084210842108421,2108421084210842,4210842108421084] I then realised that running an iterator over just 3 words was fairly expensive too - not as expensive as every bit, but it still hurts. However I can build a mask from the ones for 3 and 5, which has a repeat of 15 (3*5) 6692cd259a4b3490, 96692cd259a4b349,496692cd259a4b34,3496692cd259a4b3,b3496692cd259a4b, 4b3496692cd259a4,a4b3496692cd259a,9a4b3496692cd259,59a4b3496692cd25, 259a4b3496692cd2,d259a4b3496692cd,cd259a4b3496692c,2cd259a4b3496692, 92cd259a4b349669,692cd259a4b34966,6692cd259a4b3496 and copy that. In fact I can do that with any number of single prime masks I want. It gets big quite quickly - for 3 to 23 the repeat length is 111,546,435. I found that it was best to: - Make a mask for the first 5 primes, 3 5 7 11 13. - Make a mask from all of these. That has a repeat of 15015 words. - Make masks for the next few primes, up to 31. - Combine the mask from the 15015 and the others up to 31. That's about 100E9 in size. (It's better to do that in two stages, otherwise we reset the index on the small ones really often.) - Work out the biggest prime we have stored accurately. That will be just under 37 squared, 37 being the next one we haven't masked in - Collect all the primes between 31 and 37^2 - Go though the buffer in chunks of about 16k words, marking the multiples of all these primes. - We can now work out all the rest of the primes we need. Collect them, then go through the buffer again. - It's not necessary to do this a third time. Note that although this has been quite a fun thing to play with it's had almost no C++ content. In fact I found that to get best performance I had to drop down to C type code. I kept a class to manage my buffers though, it's much easier to drop an object that manages a malloc-ed buffer than to remember exactly when I need to call free. Correctness trumps speed. Now Mod 30 - by storing the odd numbers only I'm halving the storage, and I don't need to do an expensive divide. It's only shifts and ANDs. If I use mod 30 (30 = 2*3*5) conveniently there are 8 candidate numbers I need to store a mask for. Sadly my computer likes 64 bit operations better. But the required storage is now only 27% of the original mask if all numbers are put in it. I can carry on with this, but the storage doesn't have any more nice sizes. The number of bits in each page are: Prime, MOD value, number of bits per page, saving 2, 2 1, 50% 3, 6, 2, 33% 5, 30, 8, 27% 7, 210, 48, 23% 11, 2310, 480, 21% 13, 30030, 5760, 19% 17, 510510, 92160, 18% 19, 9699690, 1658880, 17% 23, 223092870, 36495360, 16% That's probably best pasted into a spreadsheet as CSV to make the columns line up. Maybe I'll play with it. But the rain stopped, and I must get on with the weeding! Andy
[toc] | [prev] | [next] | [standalone]
| From | Bonita Montero <Bonita.Montero@gmail.com> |
|---|---|
| Date | 2024-05-30 14:17 +0200 |
| Message-ID | <v39qp8$1mbqj$1@raubtier-asyl.eternal-september.org> |
| In reply to | #119427 |
Am 30.05.2024 um 13:32 schrieb Vir Campestris:
> Firstly the C++ bit:
> std::vector<bool> is much too slow to be any use. I rolled my own.
If you need a bit-vector<> use a vector<> of size_t's since size_t
is usually the native register width.
> But... if you allocate a std::vector it insists on initialising all
> the elements. That takes a while. I then played with reserve and
> emplace_back - and that was slow too. So I've rolled my own store.
This is my ndi helper class which handles non default-initialized
data without construction and destruction:
#pragma once
#include <new>
#include <utility>
template<typename T>
union ndi
{
ndi();
template<typename ... Args>
ndi( Args &&... args );
~ndi();
ndi &operator =( T const & );
template<typename ... Args>
T &emplace( Args &&... args );
void destruct();
operator T &();
operator T const &() const;
T *operator &();
T const *operator &() const;
T *operator ->();
T const *operator ->() const;
private:
T m_value;
};
template<typename T>
inline ndi<T>::ndi()
{
}
template<typename T>
inline ndi<T>::~ndi()
{
}
template<typename T>
template<typename ... Args>
inline ndi<T>::ndi( Args &&... args ) :
m_value( std::forward<Args>( args ) ... )
{
}
template<typename T>
inline ndi<T> &ndi<T>::operator =( T const &other )
{
m_value = other;
return *this;
}
template<typename T>
template<typename ... Args>
inline T &ndi<T>::emplace( Args &&... args )
{
new( (void *)& m_value ) T( std::forward<Args>( args ) ... );
return this->m_value;
}
template<typename T>
inline void ndi<T>::destruct()
{
m_value.~T();
}
template<typename T>
inline ndi<T>::operator T &()
{
return m_value;
}
template<typename T>
inline ndi<T>::operator T const &() const
{
return m_value;
}
template<typename T>
inline T *ndi<T>::operator &()
{
return &m_value;
}
template<typename T>
inline T const *ndi<T>::operator &() const
{
return &m_value;
}
template<typename T>
inline T *ndi<T>::operator ->()
{
return &m_value;
}
template<typename T>
inline T const *ndi<T>::operator ->() const
{
return &m_value;
}
[toc] | [prev] | [next] | [standalone]
| From | Paavo Helde <eesnimi@osa.pri.ee> |
|---|---|
| Date | 2024-05-30 19:55 +0300 |
| Message-ID | <v3ab2f$1pasg$1@dont-email.me> |
| In reply to | #119427 |
> > Allocating a zero filled buffer of enormous size with calloc is > surprisingly fast. I have a feeling (which I can't prove) that under the > hood Linux is allocating one zero filled page to me, and setting up the > page tables for the rest of the buffer as unwriteable. It then copies > the zero buffer whenever a write fault happens. This might well be so (a standard COW page mapped initially to a special zeroed page), but from what google tells me there are also special kernel threads whose task is to zero out unused memory pages so they can be given to new processes. This is needed for security, but would also speed up calloc().
[toc] | [prev] | [next] | [standalone]
| From | Bonita Montero <Bonita.Montero@gmail.com> |
|---|---|
| Date | 2024-05-31 10:17 +0200 |
| Message-ID | <v3c12q$258qe$1@raubtier-asyl.eternal-september.org> |
| In reply to | #119435 |
Am 30.05.2024 um 18:55 schrieb Paavo Helde: > This might well be so (a standard COW page mapped initially to a special > zeroed page), but from what google tells me there are also special > kernel threads whose task is to zero out unused memory pages so they can > be given to new processes. ... Any reference on that ?
[toc] | [prev] | [next] | [standalone]
| From | Paavo Helde <eesnimi@osa.pri.ee> |
|---|---|
| Date | 2024-05-31 20:52 +0300 |
| Message-ID | <v3d2op$2b0og$1@dont-email.me> |
| In reply to | #119444 |
On 31.05.2024 11:17, Bonita Montero wrote: > Am 30.05.2024 um 18:55 schrieb Paavo Helde: > >> This might well be so (a standard COW page mapped initially to a >> special zeroed page), but from what google tells me there are also >> special kernel threads whose task is to zero out unused memory pages >> so they can be given to new processes. ... > > Any reference on that ? > "https://answers.microsoft.com/en-us/windows/forum/all/physical-and-virtual-memory-in-windows-10/e36fb5bc-9ac8-49af-951c-e7d39b979938" "The memory manager maintains a thread that wakes up periodically to initialize pages on the Free page list so that it can move them to the Zero page list. The Zero page list contains pages that have been initialized to zero, ready for use when the memory manager needs a new page."
[toc] | [prev] | [next] | [standalone]
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2024-05-30 22:17 -0700 |
| Message-ID | <86o78mpnlf.fsf@linuxsc.com> |
| In reply to | #119427 |
Vir Campestris <vir.campestris@invalid.invalid> writes:
> On 22/05/2024 03:06, Tim Rentsch wrote:
>
>> Vir Campestris <vir.campestris@invalid.invalid> writes:
>>
>>> I've been playing with this. Has it really been this long? I
>>> ought to have more time than this... [...]
>>>
>>> I missed Tim's Mod 30 trick, and that might well help. But I
>>> think I'm bored with this. It would save a lot of memory, but
>>> the extra computation might make it slower.
>>
>> Two comments. Using the mod 30 encoding can be speed competitive
>> with simpler encodings. The second comment is, it isn't just
>> that less memory is used, but that more primes can be found.
>> Bonita brags about how fast some code is, but it's an apples
>> and oranges comparison - that code doesn't do the job because
>> it runs into a memory limit way sooner than using a mod 30
>> encoding, and trying to run past that limit would make the
>> code R E A L L Y S L O W.
>>
>>> Maybe I'll try one day ;)
>>
>> I hope you do. If you have some difficulty seeing how to
>> avoid the seeming speed penalties, feel free to ask, I am
>> happy to help.
>
> I think I've come to the end of what my code will do. [...]
A few miscellaneous responses...
> Firstly the C++ bit:
>
> std::vector<bool> is much too slow to be any use.
> [various alternatives]
I simply used a static array, more accurately a union of
two arrays (code here is approximate only):
#define LARGEST_N 2400000000000 // primes less than 2.4e12
union {
unsigned char bytes[ LARGEST_N / 30 ];
unsigned long long ulls[ LARGEST_N / 30 / 8 ];
} all;
> That's the end of the on-topic part.
>
> I don't store even numbers. That's an optimisation, and is of course
> equivalent to Tim's Mod 30 trick, but with mod 2 instead. And it
> doesn't require an divide which may be slow.
Like I keep saying, keeping the bits in a mod 30 representation
doesn't need any divides (at least not in the main loop, where
multiples are being computed).
> When I mask out just 3 I get a pattern like this:
>
> 2492492492492490
>
> There is a single nybble at the end of that which is zero, but the
> rest is a 3-nybble repeat. [similarly for 3 and 5, etc...]
In a mod 30 representation, the multiples of 7 repeat after 7
bytes. If we replicate those 7 bytes 8 times, we get 7 unsigned
long longs. Those 7 ULLs can be combined with logic operations into
the all.ulls[] array. Similarly for 11, 13, etc, for as high as is
reasonable (I think I topped out at 29).
> However I can build a mask from the ones for 3 and 5, which has a
> repeat of 15 (3*5)
> 6692cd259a4b3490,
> 96692cd259a4b349,496692cd259a4b34,3496692cd259a4b3,b3496692cd259a4b,
> 4b3496692cd259a4,a4b3496692cd259a,9a4b3496692cd259,59a4b3496692cd25,
> 259a4b3496692cd2,d259a4b3496692cd,cd259a4b3496692c,2cd259a4b3496692,
> 92cd259a4b349669,692cd259a4b34966,6692cd259a4b3496
>
> and copy that.
>
> In fact I can do that with any number of single prime masks I want.
> [...]
I thought about doing something along these lines but decided it
was too much work. So I just did the blocks of 7*8 bytes, 11*8
bytes, 13*8 bytes, etc, up to some fairly small prime, then
switched to an alternate method where individual bytes are
addressed and modified.
> Note that although this has been quite a fun thing to play with it's
> had almost no C++ content. In fact I found that to get best
> performance I had to drop down to C type code.
Right. I used straight C, and I don't see any part of the
program that would benefit from having C++.
> I kept a class to
> manage my buffers though, it's much easier to drop an object that
> manages a malloc-ed buffer than to remember exactly when I need to
> call free. Correctness trumps speed.
I used static allocation for the big bitmap, and small local
arrays (less than 50 * 8 bytes) for the small temporaries.
No malloc()s or free()s to worry about. Initialized to zero
without any code needed.
> Now Mod 30 - by storing the odd numbers only I'm halving the storage,
> and I don't need to do an expensive divide. It's only shifts and ANDs.
>
> If I use mod 30 (30 = 2*3*5) conveniently there are 8 candidate
> numbers I need to store a mask for.
By special casing each of the possibilities (there are 8*8 == 64 of
them), only multiplies and adds are needed, and no shifts -- masks
are evaluated to compile-time constants.
> Sadly my computer likes 64 bit
> operations better. But the required storage is now only 27% of the
> original mask if all numbers are put in it.
Once the primes get above a fairly small number, the density of
places to zap is low enough so that working in bytes is okay.
In particular once we are looking at primes above 240 there is
less than one bit per 64 bit ULL, which makes using bytes
rather than ULLs less painful
> I can carry on with this, but the storage doesn't have any more nice
> sizes. [...]
Yeah. It's a happy coincidence that mod 30 needs 8 bits
per block-of-30, and unfortunately no other nice size in
sight for larger prime-set mods.
> Maybe I'll play with it. But the rain stopped, and I must get on
> with the weeding!
Now you have something to look forward to the next time
it rains. :)
[toc] | [prev] | [next] | [standalone]
| From | Vir Campestris <vir.campestris@invalid.invalid> |
|---|---|
| Date | 2024-06-01 21:07 +0100 |
| Message-ID | <v3fv2u$2ursd$1@dont-email.me> |
| In reply to | #119442 |
On 31/05/2024 06:17, Tim Rentsch wrote:
> Vir Campestris <vir.campestris@invalid.invalid> writes:
>
>> On 22/05/2024 03:06, Tim Rentsch wrote:
>>
>>> Vir Campestris <vir.campestris@invalid.invalid> writes:
>>>
>>>> I've been playing with this. Has it really been this long? I
>>>> ought to have more time than this... [...]
>>>>
>>>> I missed Tim's Mod 30 trick, and that might well help. But I
>>>> think I'm bored with this. It would save a lot of memory, but
>>>> the extra computation might make it slower.
>>>
>>> Two comments. Using the mod 30 encoding can be speed competitive
>>> with simpler encodings. The second comment is, it isn't just
>>> that less memory is used, but that more primes can be found.
>>> Bonita brags about how fast some code is, but it's an apples
>>> and oranges comparison - that code doesn't do the job because
>>> it runs into a memory limit way sooner than using a mod 30
>>> encoding, and trying to run past that limit would make the
>>> code R E A L L Y S L O W.
>>>
>>>> Maybe I'll try one day ;)
>>>
>>> I hope you do. If you have some difficulty seeing how to
>>> avoid the seeming speed penalties, feel free to ask, I am
>>> happy to help.
>>
>> I think I've come to the end of what my code will do. [...]
>
> A few miscellaneous responses...
>
>> Firstly the C++ bit:
>>
>> std::vector<bool> is much too slow to be any use.
>> [various alternatives]
>
> I simply used a static array, more accurately a union of
> two arrays (code here is approximate only):
>
> #define LARGEST_N 2400000000000 // primes less than 2.4e12
>
> union {
> unsigned char bytes[ LARGEST_N / 30 ];
> unsigned long long ulls[ LARGEST_N / 30 / 8 ];
> } all;
>
>> That's the end of the on-topic part.
>>
>> I don't store even numbers. That's an optimisation, and is of course
>> equivalent to Tim's Mod 30 trick, but with mod 2 instead. And it
>> doesn't require an divide which may be slow.
>
> Like I keep saying, keeping the bits in a mod 30 representation
> doesn't need any divides (at least not in the main loop, where
> multiples are being computed).
>
>> When I mask out just 3 I get a pattern like this:
>>
>> 2492492492492490
>>
>> There is a single nybble at the end of that which is zero, but the
>> rest is a 3-nybble repeat. [similarly for 3 and 5, etc...]
>
> In a mod 30 representation, the multiples of 7 repeat after 7
> bytes. If we replicate those 7 bytes 8 times, we get 7 unsigned
> long longs. Those 7 ULLs can be combined with logic operations into
> the all.ulls[] array. Similarly for 11, 13, etc, for as high as is
> reasonable (I think I topped out at 29).
>
>> However I can build a mask from the ones for 3 and 5, which has a
>> repeat of 15 (3*5)
>> 6692cd259a4b3490,
>> 96692cd259a4b349,496692cd259a4b34,3496692cd259a4b3,b3496692cd259a4b,
>> 4b3496692cd259a4,a4b3496692cd259a,9a4b3496692cd259,59a4b3496692cd25,
>> 259a4b3496692cd2,d259a4b3496692cd,cd259a4b3496692c,2cd259a4b3496692,
>> 92cd259a4b349669,692cd259a4b34966,6692cd259a4b3496
>>
>> and copy that.
>>
>> In fact I can do that with any number of single prime masks I want.
>> [...]
>
> I thought about doing something along these lines but decided it
> was too much work. So I just did the blocks of 7*8 bytes, 11*8
> bytes, 13*8 bytes, etc, up to some fairly small prime, then
> switched to an alternate method where individual bytes are
> addressed and modified.
>
>> Note that although this has been quite a fun thing to play with it's
>> had almost no C++ content. In fact I found that to get best
>> performance I had to drop down to C type code.
>
> Right. I used straight C, and I don't see any part of the
> program that would benefit from having C++.
>
>> I kept a class to
>> manage my buffers though, it's much easier to drop an object that
>> manages a malloc-ed buffer than to remember exactly when I need to
>> call free. Correctness trumps speed.
>
> I used static allocation for the big bitmap, and small local
> arrays (less than 50 * 8 bytes) for the small temporaries.
> No malloc()s or free()s to worry about. Initialized to zero
> without any code needed.
>
>> Now Mod 30 - by storing the odd numbers only I'm halving the storage,
>> and I don't need to do an expensive divide. It's only shifts and ANDs.
>>
>> If I use mod 30 (30 = 2*3*5) conveniently there are 8 candidate
>> numbers I need to store a mask for.
>
> By special casing each of the possibilities (there are 8*8 == 64 of
> them), only multiplies and adds are needed, and no shifts -- masks
> are evaluated to compile-time constants.
>
>> Sadly my computer likes 64 bit
>> operations better. But the required storage is now only 27% of the
>> original mask if all numbers are put in it.
>
> Once the primes get above a fairly small number, the density of
> places to zap is low enough so that working in bytes is okay.
> In particular once we are looking at primes above 240 there is
> less than one bit per 64 bit ULL, which makes using bytes
> rather than ULLs less painful
>
One C++ bit I did use was to template all my classes with the storetype,
so I could run with 8, 32 or 64 bits. 64 was fastest. I didn't bother
with 16. The store access is of course behind the hood a cache line.
>> I can carry on with this, but the storage doesn't have any more nice
>> sizes. [...]
>
> Yeah. It's a happy coincidence that mod 30 needs 8 bits
> per block-of-30, and unfortunately no other nice size in
> sight for larger prime-set mods.
>
>> Maybe I'll play with it. But the rain stopped, and I must get on
>> with the weeding!
>
> Now you have something to look forward to the next time
> it rains. :)
I'm missing something.
When setting the bits for a multiple of a reasonably large prime what I
do is:
(1) Shift right 1 to drop the last bit.
(2) The remaining top bits tell me which word needs to be accessed. I
shift the multiple right to get the index of the word.
(3) The remaining bottom bits tell me which bit in the word needs to be
set, so I shift 1 left by those bottom bits to get a mask.
My operation (2) is a divide, by a power of 2.
Take 997 as an example:
multiple mod30 div30
997 7 33
1994 14 66
2991 21 99
3988 28 132
4985 5 166
5982 12 199
6979 19 232
7976 26 265
8973 3 299
9970 10 332
10967 17 365
11964 24 398
12961 1 432
13958 8 465
14955 15 498
15952 22 531
16949 29 564
17946 6 598
18943 13 631
19940 20 664
20937 27 697
21934 4 731
22931 11 764
23928 18 797
24925 25 830
25922 2 864
26919 9 897
27916 16 930
28913 23 963
29910 0 997
30907 7 1030
I don't see how you get that last column without a divide by 30.
Andy
[toc] | [prev] | [next] | [standalone]
| From | Richard Damon <richard@damon-family.org> |
|---|---|
| Date | 2024-06-01 20:43 -0400 |
| Message-ID | <v3gf85$2n53o$11@i2pn2.org> |
| In reply to | #119452 |
On 6/1/24 4:07 PM, Vir Campestris wrote:
> On 31/05/2024 06:17, Tim Rentsch wrote:
>> Vir Campestris <vir.campestris@invalid.invalid> writes:
>>
>>> On 22/05/2024 03:06, Tim Rentsch wrote:
>>>
>>>> Vir Campestris <vir.campestris@invalid.invalid> writes:
>>>>
>>>>> I've been playing with this. Has it really been this long? I
>>>>> ought to have more time than this... [...]
>>>>>
>>>>> I missed Tim's Mod 30 trick, and that might well help. But I
>>>>> think I'm bored with this. It would save a lot of memory, but
>>>>> the extra computation might make it slower.
>>>>
>>>> Two comments. Using the mod 30 encoding can be speed competitive
>>>> with simpler encodings. The second comment is, it isn't just
>>>> that less memory is used, but that more primes can be found.
>>>> Bonita brags about how fast some code is, but it's an apples
>>>> and oranges comparison - that code doesn't do the job because
>>>> it runs into a memory limit way sooner than using a mod 30
>>>> encoding, and trying to run past that limit would make the
>>>> code R E A L L Y S L O W.
>>>>
>>>>> Maybe I'll try one day ;)
>>>>
>>>> I hope you do. If you have some difficulty seeing how to
>>>> avoid the seeming speed penalties, feel free to ask, I am
>>>> happy to help.
>>>
>>> I think I've come to the end of what my code will do. [...]
>>
>> A few miscellaneous responses...
>>
>>> Firstly the C++ bit:
>>>
>>> std::vector<bool> is much too slow to be any use.
>>> [various alternatives]
>>
>> I simply used a static array, more accurately a union of
>> two arrays (code here is approximate only):
>>
>> #define LARGEST_N 2400000000000 // primes less than 2.4e12
>>
>> union {
>> unsigned char bytes[ LARGEST_N / 30 ];
>> unsigned long long ulls[ LARGEST_N / 30 / 8 ];
>> } all;
>>
>>> That's the end of the on-topic part.
>>>
>>> I don't store even numbers. That's an optimisation, and is of course
>>> equivalent to Tim's Mod 30 trick, but with mod 2 instead. And it
>>> doesn't require an divide which may be slow.
>>
>> Like I keep saying, keeping the bits in a mod 30 representation
>> doesn't need any divides (at least not in the main loop, where
>> multiples are being computed).
>>
>>> When I mask out just 3 I get a pattern like this:
>>>
>>> 2492492492492490
>>>
>>> There is a single nybble at the end of that which is zero, but the
>>> rest is a 3-nybble repeat. [similarly for 3 and 5, etc...]
>>
>> In a mod 30 representation, the multiples of 7 repeat after 7
>> bytes. If we replicate those 7 bytes 8 times, we get 7 unsigned
>> long longs. Those 7 ULLs can be combined with logic operations into
>> the all.ulls[] array. Similarly for 11, 13, etc, for as high as is
>> reasonable (I think I topped out at 29).
>>
>>> However I can build a mask from the ones for 3 and 5, which has a
>>> repeat of 15 (3*5)
>>> 6692cd259a4b3490,
>>> 96692cd259a4b349,496692cd259a4b34,3496692cd259a4b3,b3496692cd259a4b,
>>> 4b3496692cd259a4,a4b3496692cd259a,9a4b3496692cd259,59a4b3496692cd25,
>>> 259a4b3496692cd2,d259a4b3496692cd,cd259a4b3496692c,2cd259a4b3496692,
>>> 92cd259a4b349669,692cd259a4b34966,6692cd259a4b3496
>>>
>>> and copy that.
>>>
>>> In fact I can do that with any number of single prime masks I want.
>>> [...]
>>
>> I thought about doing something along these lines but decided it
>> was too much work. So I just did the blocks of 7*8 bytes, 11*8
>> bytes, 13*8 bytes, etc, up to some fairly small prime, then
>> switched to an alternate method where individual bytes are
>> addressed and modified.
>>
>>> Note that although this has been quite a fun thing to play with it's
>>> had almost no C++ content. In fact I found that to get best
>>> performance I had to drop down to C type code.
>>
>> Right. I used straight C, and I don't see any part of the
>> program that would benefit from having C++.
>>
>>> I kept a class to
>>> manage my buffers though, it's much easier to drop an object that
>>> manages a malloc-ed buffer than to remember exactly when I need to
>>> call free. Correctness trumps speed.
>>
>> I used static allocation for the big bitmap, and small local
>> arrays (less than 50 * 8 bytes) for the small temporaries.
>> No malloc()s or free()s to worry about. Initialized to zero
>> without any code needed.
>>
>>> Now Mod 30 - by storing the odd numbers only I'm halving the storage,
>>> and I don't need to do an expensive divide. It's only shifts and ANDs.
>>>
>>> If I use mod 30 (30 = 2*3*5) conveniently there are 8 candidate
>>> numbers I need to store a mask for.
>>
>> By special casing each of the possibilities (there are 8*8 == 64 of
>> them), only multiplies and adds are needed, and no shifts -- masks
>> are evaluated to compile-time constants.
>>
>>> Sadly my computer likes 64 bit
>>> operations better. But the required storage is now only 27% of the
>>> original mask if all numbers are put in it.
>>
>> Once the primes get above a fairly small number, the density of
>> places to zap is low enough so that working in bytes is okay.
>> In particular once we are looking at primes above 240 there is
>> less than one bit per 64 bit ULL, which makes using bytes
>> rather than ULLs less painful
>>
>
> One C++ bit I did use was to template all my classes with the storetype,
> so I could run with 8, 32 or 64 bits. 64 was fastest. I didn't bother
> with 16. The store access is of course behind the hood a cache line.
>
>>> I can carry on with this, but the storage doesn't have any more nice
>>> sizes. [...]
>>
>> Yeah. It's a happy coincidence that mod 30 needs 8 bits
>> per block-of-30, and unfortunately no other nice size in
>> sight for larger prime-set mods.
>>
>>> Maybe I'll play with it. But the rain stopped, and I must get on
>>> with the weeding!
>>
>> Now you have something to look forward to the next time
>> it rains. :)
>
> I'm missing something.
>
> When setting the bits for a multiple of a reasonably large prime what I
> do is:
> (1) Shift right 1 to drop the last bit.
> (2) The remaining top bits tell me which word needs to be accessed. I
> shift the multiple right to get the index of the word.
> (3) The remaining bottom bits tell me which bit in the word needs to be
> set, so I shift 1 left by those bottom bits to get a mask.
>
> My operation (2) is a divide, by a power of 2.
>
> Take 997 as an example:
> multiple mod30 div30
> 997 7 33
> 1994 14 66
> 2991 21 99
> 3988 28 132
> 4985 5 166
> 5982 12 199
> 6979 19 232
> 7976 26 265
> 8973 3 299
> 9970 10 332
> 10967 17 365
> 11964 24 398
> 12961 1 432
> 13958 8 465
> 14955 15 498
> 15952 22 531
> 16949 29 564
> 17946 6 598
> 18943 13 631
> 19940 20 664
> 20937 27 697
> 21934 4 731
> 22931 11 764
> 23928 18 797
> 24925 25 830
> 25922 2 864
> 26919 9 897
> 27916 16 930
> 28913 23 963
> 29910 0 997
> 30907 7 1030
>
> I don't see how you get that last column without a divide by 30.
>
> Andy
>
>
Incrementally.
Each line increase the div30 by a fixed amount, with a +1 every time the
mod30 accumulator get to 30 or above. So you can take the prime and
compute the mod30 / div30 for it, and then move to the next multiple
with only adds and a compare.
[toc] | [prev] | [next] | [standalone]
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2024-06-02 03:23 -0700 |
| Message-ID | <86ttibod7n.fsf@linuxsc.com> |
| In reply to | #119452 |
Vir Campestris <vir.campestris@invalid.invalid> writes:
> On 31/05/2024 06:17, Tim Rentsch wrote:
>
>> Vir Campestris <vir.campestris@invalid.invalid> writes:
[...]
>>> I don't store even numbers. That's an optimisation, and is of
>>> course equivalent to Tim's Mod 30 trick, but with mod 2 instead.
>>> And it doesn't require an divide which may be slow.
>>
>> Like I keep saying, keeping the bits in a mod 30 representation
>> doesn't need any divides (at least not in the main loop, where
>> multiples are being computed).
[...]
> I'm missing something.
>
> When setting the bits for a multiple of a reasonably large prime
> what I do is:
>
> (1) Shift right 1 to drop the last bit.
>
> (2) The remaining top bits tell me which word needs to be accessed.
> I shift the multiple right to get the index of the word.
>
> (3) The remaining bottom bits tell me which bit in the word needs
> to be set, so I shift 1 left by those bottom bits to get a mask.
>
> My operation (2) is a divide, by a power of 2.
>
> Take 997 as an example:
> multiple mod30 div30
> 997 7 33
> 1994 14 66
> 2991 21 99
> 3988 28 132
> 4985 5 166
> 5982 12 199
> 6979 19 232
> 7976 26 265
> 8973 3 299
> 9970 10 332
> [etc]
>
> I don't see how you get that last column without a divide by 30.
Good example. Let's look at this example more closely, first in
the a-bit-for-only-odd-numbers scheme. Forgive me if any of the
following seems obvious. Let me redo your table slightly:
multiplier product mod 2
1 997 1
2 1994 0
3 2991 1
4 3988 0
5 4985 1
6 5982 0
7 6979 1
8 7976 0
9 8973 1
10 9970 0
11 10967 1
[etc]
Which multipliers do we need to consider? Obviously not 1.
After that, we don't need any even multipliers, since they
are all divisible by two, and there aren't even places in
the table where the product could go. So we quickly get to
this:
multiplier product mod 2
3 2991 1
5 4985 1
7 6979 1
9 8973 1
11 10967 1
[etc]
Now which multipliers do we need? We don't need 3 - we have
already crossed out all multiples of 3. Similarly we don't
need 5, 7, or 11. We don't need 9, because it has a factor
of 3, - any multiple of 9 is also a multiple of 3 and has already
been crossed out. And so forth for 13, 15, 17, 19, 21, ...
A little thought should convince you we don't need to consider
any multiplier less than 997. Let's pick up the table from
there (I have taken out 'product' and put in 'factors'):
multiplier factors
997 997
999 3 3 3 37
1001 7 11 13
1003 17 59
1005 3 5 67
1007 19 53
1009 1009
1011 3 337
We need to eliminate 997*997, as it certainly has not been
crossed out before. We don't need to consider 999, 1001,
1003, 1005, 1007, or 1011, because they all have prime
factors less than 997, and so these products have already
been eliminated. For 1009 though, it's prime, and has no
smaller prime factors, so we need to cross out 997*1009.
And so forth. Speaking more generally, we need to consider
multipliers greater than 997 (as well as 997 itself) that
have not already been marked as composite. We discover what
those numbers are by scanning the bitmap. In pseudocode:
for each prime p :
starting at the bitmap location for p, and up
for each not-yet-marked-composite m :
mark as composite p*m
end for
end for
(I need to put in an asterisk here to say that this code
mostly works but misses some composites, and if you play
around with it you will see why. But that isn't important
to my explanation so I'd like to pretend for now that
this code is what we need to look at (and in fact it
isn't far from code that actually does work).)
Now let's think about how things change when we are storing
bits only for numbers that are nonzero mod 30, so one block
of 30 per byte. First it should be obvious that we need to
consider only those multipliers that are nonzero mod 30, or
in other words those numbers that correspond to a bit in the
table. Pseudocode now looks something like this:
for each index i in the table :
for each bit in table[i] :
if that bit has been marked composite : continue
// we have found the next prime
for each index j > i in the table :
for each bit in table[j] :
if that bit has beem marked composite : continue
// we have found a multiplier m and ..
// need to mark as composite p*m
end
end
end
end
The relevant values are:
p == i*30 + x
m == j*30 + y
where x and y are constant values that are nonzero mod 30,
that is, from the set { 1, 7, 11, 13, 17, 19, 23, 29 },
depending only on the respective bit positions for p and m.
The product p*m is what we're looking for, and in particular
p*m / 30 and p*m % 30. Doing a little algebra
p*m = (i*30 + x) * (j*30 + y)
= i*30 * j*30 + i*30*y + j*30*x + x*y
The components of this last expression are
p*m / 30 = 30*i*j + i*y + j*x + (x*y/30)
p*m % 30 = (x*y%30)
In the pseudocode we said 'for each bit in table[...]', but
rather than use a for loop we write 8 specific tests, so the
values x and y are specific constants in each of the 8 cases.
Since they are constants, the values (x*y/30) and (x*y%30) can in
each of the 64 cases be computed as compile-time constants, and
so we eliminate the divide and remainder operations from run-time
by computing them at compile time. Similarly the value (x*y%30)
can be turned into a bit mask at compile-time, because it's just
a long series of ?: expressions, one for each of the 8 values of
nonzero residues mod 30.
Note that the expression for p*m/30 gives the index in the
table for where to zap the bit for the product.
Were you able to follow all that? Do you see how the scheme
works now?
I have deliberately left out explaining the problem of missing
some composite values and what to do about it, but I can
explain further if someone has trouble with that.
[toc] | [prev] | [next] | [standalone]
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2024-06-02 19:50 -0700 |
| Message-ID | <86h6eaoi2r.fsf@linuxsc.com> |
| In reply to | #119455 |
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
> Pseudocode now looks something like this:
>
> for each index i in the table :
> for each bit in table[i] :
> if that bit has been marked composite : continue
> // we have found the next prime
> for each index j > i in the table :
> for each bit in table[j] :
> if that bit has beem marked composite : continue
> // we have found a multiplier m and ..
> // need to mark as composite p*m
> end
> end
> end
> end
Correction: that should be
for each index j >= i in the table :
Also I have left out some possible optimizations, in the
interest of simplifying the explanation.
[toc] | [prev] | [next] | [standalone]
| From | Vir Campestris <vir.campestris@invalid.invalid> |
|---|---|
| Date | 2024-06-18 20:56 +0100 |
| Message-ID | <v4sool$1grge$1@dont-email.me> |
| In reply to | #119457 |
OK, I'm getting some weird results. Firstly, whatever I do my original algorithm storing odds only is faster than using mod30. Quite a bit faster. I think this is because mod30 requires it to access each byte in the map individually, rather than eight at a time. How much is when I got weird. Going through the large primes I've got a cache size. I go through each of the primes, adding a modulo and a word index to get the bit that needs to be set next. When the modulo overflows the word index gets incremented too. I do that for a block of memory, then repeat for the next prime, and so on until I've done all the primes. I then go onto the next block of memory - which is tuned to fit in the cache. Rather than do some arithmetic I though I'll just run through a loop with different sizes of that memory block, and pick the best one. That's a bit under a megabyte. I then remove the loop, instead having a constexpr for the size. The code now runs a third slower. I can have the loop go around exactly once, and it is still way faster than the hard coded constant! Andy
[toc] | [prev] | [next] | [standalone]
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2024-06-18 17:34 -0700 |
| Message-ID | <867celixcw.fsf@linuxsc.com> |
| In reply to | #119473 |
Vir Campestris <vir.campestris@invalid.invalid> writes: > OK, I'm getting some weird results. > > Firstly, whatever I do my original algorithm storing odds only is > faster than using mod30. Quite a bit faster. I think this is because > mod30 requires it to access each byte in the map individually, rather > than eight at a time. > > How much is when I got weird. > > Going through the large primes I've got a cache size. > > I go through each of the primes, adding a modulo and a word index to > get the bit that needs to be set next. When the modulo overflows the > word index gets incremented too. > > I do that for a block of memory, then repeat for the next prime, and > so on until I've done all the primes. I then go onto the next block of > memory - which is tuned to fit in the cache. > > Rather than do some arithmetic I though I'll just run through a loop > with different sizes of that memory block, and pick the best > one. That's a bit under a megabyte. > > I then remove the loop, instead having a constexpr for the size. > > The code now runs a third slower. > > I can have the loop go around exactly once, and it is still way faster > than the hard coded constant! Would you mind if I asked you to post the code so I can take a look at it? Or if you would rather email it to me that also works.
[toc] | [prev] | [next] | [standalone]
| From | Vir Campestris <vir.campestris@invalid.invalid> |
|---|---|
| Date | 2024-06-30 21:47 +0100 |
| Message-ID | <v5sg9s$mat4$1@dont-email.me> |
| In reply to | #119474 |
On 19/06/2024 01:34, Tim Rentsch wrote:
> Would you mind if I asked you to post the code so I can
> take a look at it? Or if you would rather email it to
> me that also works.
I've hacked out all my include files, added a GPL licence, and included
build instructions. It follows in the signature block. 600 lines :(
I bet the line wrapping breaks it too.
Andy
--
/*
Programme to calculate primes using the Sieve or Eratosthenes
Copyright (C) 2024 the person known as Vir Campestris
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
*/
/*
I build this programme in two ways on Linux. These commands
g++ -O3 --std=c++20 usenet.cpp -o usenet && time ./usenet
g++ -O3 --std=c++20 -D FAST usenet.cpp -o usenet && time ./usenet
compile and run it with, and without, a loop that goes around once.
It is faster when the loop is present.
*/
#include <algorithm>
#include <cassert>
#include <chrono>
#include <climits>
#include <cmath>
#include <csignal>
#include <cstring>
#include <fstream>
#include <iomanip>
#include <iostream>
#include <list>
#include <thread>
#include <vector>
// If these trigger you have a really odd architecture.
static_assert(sizeof(uint8_t) == 1);
static_assert(CHAR_BIT == 8);
// The type of a prime number.
// size_t isn't big enough on a 32-bit machine with 2GB memory allocated
and 16 primes per byte.
typedef uint64_t PRIME;
// Internal class used to store the mask data for one or more
// primes rather than all primes. This has an initial block,
// followed by a block which contains data which is repeated
// for all higher values.
// Example: StoreType== uint8_t, prime == 3, the mask should be
// 90 meaning 1,3,5,7,11,13 are prime but not 9, 15
// 24,49,92 covers odds 17-63
// 24,49,92 covers 65-111, with an identical mask
// As the mask is identical we only need to store the
// non-repeat, plus one copy of the repeated block and data to
// show where it starts and ends. Note that for big primes the
// initial block may be more than one StoreType.
// As there are no common factors between any prime and the
// number of bits in StoreType the repeat is the size of the
// prime.
// If you have an unusual architecture,
// such as ICL 1900 (24 bit word)
// or DECSystem10 (36 bit word)
// this repeat could be smaller for 3,
// but that is too much of a corner case for me to care.
// I haven't seen one of them since the 1970s.
template <typename StoreType, PRIME modulus=30> class Masker
{
// The length of the initial block.
size_t initSize;
// The size of the repeated part. The entire store size is the
sum of these two.
size_t repSize;
// The actual buffer
// Allocates a zero filled buffer
class Buffer
{
StoreType* mpBuffer; // base of the data referred to
size_t mSize; // size of the buffer in
StoreType-s.
public:
// Normal constructor
Buffer(size_t size = 0) : mpBuffer(nullptr), mSize(size)
{
if (size > 0)
{
mpBuffer =
static_cast<StoreType*>(std::malloc(size * sizeof(StoreType)));
if (!mpBuffer)
throw std::bad_alloc();
}
}
// Move constructor
Buffer(Buffer&& source) : mpBuffer(source.mpBuffer),
mSize(source.mSize)
{
source.mSize = 0;
source.mpBuffer = nullptr;
}
// Move assignment
Buffer& operator=(Buffer&& source)
{
if (mpBuffer)
std::free(mpBuffer);
mpBuffer = source.mpBuffer;
mSize = source.mSize;
source.mSize = 0;
source.mpBuffer = nullptr;
return *this;
}
void resize(size_t newSize)
{
mpBuffer =
static_cast<StoreType*>(std::realloc(mpBuffer, newSize *
sizeof(StoreType)));
if (!mpBuffer)
throw std::bad_alloc();
mSize = newSize;
}
// Get the data start
inline StoreType* operator*() const
{
return mpBuffer;
}
// Get the owned size
inline PRIME const & size() const
{
return mSize;
}
// clean up.
~Buffer()
{
if (mpBuffer)
std::free(mpBuffer);
}
} mBuffer;
// Raw pointer to the store for speed.
StoreType * mStorePtr;
// Lookup table for the mask bit from the modulus
static StoreType const mMaskLookup[modulus];
public:
class MaskIterator
{
StoreType const * index, *repStart, *repEnd;
public:
MaskIterator(Masker const * origin)
: index(origin->mStorePtr)
, repStart(index + origin->initSize)
, repEnd(repStart + origin->repSize)
{}
inline StoreType const & next() // returns value with
iterator post-increment
{
auto const & retval = *index;
if (++index >= repEnd)
index = repStart;
return retval;
}
};
// Default constructor. Makes a dummy mask for overwriting.
Masker()
: initSize(0)
, repSize(1)
, mBuffer(1)
, mStorePtr(*mBuffer)
{
*mStorePtr = 0; // nothing marked unprime
}
// Move constructor (destroys source)
Masker(Masker&& source)
: initSize(source.initSize)
, repSize(source.repSize)
, mBuffer(std::move(source.mBuffer))
, mStorePtr(*mBuffer)
{
}
// Copy constructor (preserves source)
Masker(Masker& source)
: initSize(source.initSize)
, repSize(source.repSize)
, mBuffer(initSize + repSize)
, mStorePtr(*mBuffer)
{
memcpy(*mBuffer, *source.mBuffer, mBuffer.size() *
sizeof(StoreType));
}
// move (destroys source)
Masker& operator=(Masker&& source)
{
initSize = source.initSize;
repSize = source.repSize;
mStorePtr = source.mStorePtr;
mBuffer = std::move(source.mBuffer);
return *this;
}
// Construct for a single prime
Masker(PRIME prime)
: initSize((prime/modulus)+1)
, repSize(prime)
, mBuffer(initSize + repSize)
, mStorePtr(*mBuffer)
{
// Pre-zero the buffer. For bigger primes we don't
write every word.
// This is fast enough not to care about excess writes.
memset(mStorePtr, 0, mBuffer.size() * sizeof(StoreType));
// The max prime we'll store
PRIME const maxPrime = (initSize + repSize) * modulus;
// Filter all the bits. Note that although we
// don't strictly need to filter from prime*3,
// but only from prime**2, doing from *3 makes
// the init block smaller.
PRIME const prime2 = prime * 2;
PRIME const prime3 = prime * 3;
PRIME const modstep = prime % modulus;
PRIME const wordstep = prime / modulus;
PRIME mod = prime3 % modulus;
PRIME word = prime3 / modulus;
do
{
auto const & mask = mMaskLookup[mod];
if (mask)
mStorePtr[word] |= mask;
mod += modstep;
if (mod >= modulus)
{
mod -= modulus;
word += wordstep + 1;
}
else word += wordstep;
}
while (word < mBuffer.size());
}
// Construct from two others, making a mask that excludes all
numbers
// marked not prime in either of them. The output init block
// will be the largest from either of the inputs; the repeat
block size will be the
// product of the input repeat sizes.
// The output could get quite large - 3.7E12 for all primes up
to 37
Masker(const Masker& left, const Masker& right, size_t
storeLimit = 0)
: initSize(std::max(left.initSize, right.initSize))
, repSize (left.repSize * right.repSize)
, mBuffer()
{
if (storeLimit)
repSize = std::min(initSize + repSize,
storeLimit) - initSize;
auto storeSize = initSize + repSize;
// Actually construct the store with the desired size
mBuffer = storeSize;
mStorePtr = *mBuffer;
auto li = left.begin();
auto ri = right.begin();
for (size_t word = 0; word < storeSize; ++word)
{
mStorePtr[word] = li.next() | ri.next();
}
}
// Construct from several others, making a mask that excludes
all numbers
// marked not prime in any of them. The output init block
// will be the largest from any of the inputs; the repeat size
will be the
// product of all the input repeat sizes.
// The output could get quite large - 3.7E12 for all primes up
to 37
// the type iterator should be an iterator into a collection of
Masker-s.
template <typename iterator> Masker(const iterator& begin,
const iterator& end, size_t storeLimit = 0)
: initSize(0)
, repSize(1)
, mBuffer() // empty for now
{
// Iterate over the inputs. We will
// * Determine the maximum init size
// * Determine the product of all the repeat sizes.
// * Record the number of primes we represent.
size_t nInputs = std::distance(begin, end);
std::vector<MaskIterator> iterators;
iterators.reserve(nInputs);
for (auto i = begin; i != end; ++i)
{
initSize = std::max(initSize, i->initSize);
repSize *= i->repSize;
iterators.push_back(i->begin());
}
if (storeLimit)
repSize = std::min(initSize + repSize,
storeLimit) - initSize;
auto storeSize = initSize + repSize;
// Actually construct the store with the desired size
mBuffer = storeSize;
mStorePtr = *mBuffer;
// take the last one off (most efficient to remove)
// and use it as the initial mask value
auto last = iterators.back();
iterators.pop_back();
for (auto word = 0; word < storeSize; ++word)
{
StoreType mask = last.next();
for(auto &i: iterators)
{
mask |= i.next();
}
mStorePtr[word] = mask;
}
}
template <typename iterator> Masker& andequals(size_t
cachesize, iterator begin, iterator end)
{
// first is the prime itself; second is the current
multiple of it _without_ the last bit
struct value
{
PRIME const modstep;
size_t const wordstep;
PRIME mod;
size_t word;
value(PRIME prime)
: modstep(prime % modulus)
, wordstep(prime / modulus)
{
PRIME const primeS = prime * prime;
mod = primeS % modulus;
word = primeS / modulus;
}
bool operator<(size_t limit) const
{
return word < limit;
}
void next(StoreType * store)
{
auto const & mask = mMaskLookup[mod];
if (mask)
store[word] |= mask;
mod += modstep;
if (mod >= modulus)
{
mod -= modulus;
word += wordstep + 1;
}
else word += wordstep;
}
};
std::vector<value> values;
values.reserve(std::distance(begin, end));
for (auto prime = begin; prime != end; ++prime)
{
values.emplace_back(*prime);
}
size_t limit = 0;
do
{
limit = std::min(limit + cachesize, initSize +
repSize);
for (auto & value: values)
{
while (value < limit)
value.next(mStorePtr);
}
limit += cachesize;
}
while (limit < initSize + repSize);
return *this;
}
// duplicates the data up to the amount needed for a prime newSize
void resize(PRIME newSize)
{
size_t sizeWords = (newSize / modulus) + 1;
assert(sizeWords > (initSize + repSize)); // not
allowed to shrink!
mBuffer.resize(sizeWords);
mStorePtr = *mBuffer;
auto copySource = mStorePtr + initSize;
do
{
auto copySize = std::min(sizeWords - repSize -
initSize, repSize);
auto dest = copySource + repSize;
memcpy(dest, copySource, copySize *
sizeof(StoreType));
repSize += copySize;
} while ((initSize + repSize) < sizeWords);
}
// returns true if this mask thinks value is prime.
bool get(PRIME value) const
{
assert(value < modulus * mBuffer.size());
if ((value <= 3) || (value == 5))
return true;
auto mod = value % modulus;
auto mask = mMaskLookup[mod];
if (mask == 0)
return false;
auto word = value / modulus;
auto ret = mStorePtr[word] & mask;
return !ret;
}
// Get the beginning of the bitmap.
// Incrementing this iterator can continue indefinitely,
regardless of the bitmap size.
MaskIterator begin() const
{
return MaskIterator(this);
}
size_t size() const
{
return initSize + repSize;
}
};
template<> uint8_t const Masker<uint8_t, 30>::mMaskLookup[30] =
{
0, // 0
0x01, // 1
0, // 2
0, // 3
0, // 4
0, // 5
0, // 6
0x02, // 7
0, // 8
0, // 9
0, // 10
0x04, // 11
0, // 12
0x08, // 13
0, // 14
0, // 15
0, // 16
0x10, // 17
0, // 18
0x20, // 19
0, // 20
0, // 21
0, // 22
0x40, // 23
0, // 24
0, // 25
0, // 26
0, // 27
0, // 28
0x80 // 29
};
typedef uint8_t StoreType;
typedef Masker<StoreType> Mask;
#ifndef tinyCount
constexpr size_t tinyCount = 7;
#endif
#ifndef smallCount
constexpr size_t smallCount = 10;
#endif
#ifndef bigCache
constexpr size_t bigCache = 0x80000;
#endif
int main(int argc, char**argv)
{
// find out if the user asked for a different number of primes
PRIME PrimeCount = 1e9;
if (argc >= 2)
{
std::istringstream arg1(argv[1]);
std::string crud;
arg1 >> PrimeCount >> crud;
if (!crud.empty() || PrimeCount < 3)
{
double isitfloat;
arg1.seekg(0, arg1.beg);
crud.clear();
arg1 >> isitfloat >> crud;
if (!crud.empty() || isitfloat < 3)
{
std::cerr << "Invalid first argument
\"" << argv[1] << "\" Should be decimal number of primes required\n";
exit(42);
}
else PrimeCount = isitfloat;
}
}
std::string outName;
if (argc >= 3)
{
outName = argv[2];
}
#ifdef FAST
for (size_t bigCache = 0x80000; bigCache < 0x80001; ++bigCache)
#endif
{
Mask primes;
PRIME nextPrime;
{
nextPrime = 7;
Mask tiny(nextPrime);
std::vector<Mask> smallMasks;
smallMasks.emplace_back(tiny);
// Make a mask for the really small primes
size_t count = 1; // allows for the 3
for ( ; count < tinyCount; ++count)
{
do
{
nextPrime += 2;
} while (!tiny.get(nextPrime));
tiny = Mask(tiny, nextPrime);
}
// this limit is too small, but is easy to
calculate and big enough
auto limit = nextPrime*nextPrime;
smallMasks.clear();
for (; count < smallCount; ++count)
{
do
{
nextPrime += 2;
} while (!tiny.get(nextPrime));
smallMasks.emplace_back(nextPrime);
}
assert(nextPrime <= limit && "If this triggers
smallCount is too big and the code will give wrong results");
// Make a single masker for all the small
primes. Don't make it bigger than needed.
auto small = Mask(smallMasks.begin(),
smallMasks.end(), PrimeCount / 30 + 1);
primes = Mask(tiny, small, PrimeCount / 30 + 1);
}
// if the buffer isn't big enough yet expand it by
duplicating early entries.
if (primes.size() < PrimeCount / 30 + 1)
{
primes.resize(PrimeCount);
}
do
{
std::vector<PRIME> quickies;
PRIME quickMaxPrime =
std::min(nextPrime*nextPrime, PRIME(sqrt(PrimeCount) + 1));
do
{
do
{
nextPrime += 2;
} while (!primes.get(nextPrime));
quickies.emplace_back(nextPrime);
} while (nextPrime < quickMaxPrime);
// this function adds all the quickies into the
mask, but does all of them in
// one area of memory before moving onto the
next. The area of memory,
// and the list of primes, both fit into the L1
cache.
// You may need to adjust bigCache for best
performance.
primes.andequals(bigCache, quickies.begin(),
quickies.end());
} while (nextPrime*nextPrime < PrimeCount);
std::cerr << "tinyCount " << tinyCount
<< " smallCount " << smallCount << " bigCache "
<< std::hex << "0x" << bigCache << std::endl;
if (outName.length())
{
std::ofstream outfile(outName);
outfile << "2\n";
for (PRIME prime = 3; prime < PrimeCount; prime
+= 2)
{
if (primes.get(prime))
{
outfile << prime << '\n';
}
}
}
}
return 0;
}
[toc] | [prev] | [next] | [standalone]
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2024-07-01 23:20 -0700 |
| Message-ID | <86zfr0b9hw.fsf@linuxsc.com> |
| In reply to | #119591 |
Vir Campestris <vir.campestris@invalid.invalid> writes: > On 19/06/2024 01:34, Tim Rentsch wrote: > >> Would you mind if I asked you to post the code so I can >> take a look at it? Or if you would rather email it to >> me that also works. > > I've hacked out all my include files, added a GPL licence, and > included build instructions. It follows in the signature block. > 600 lines :( > > I bet the line wrapping breaks it too. I got the code. There were some line wrappings but I just went through and fixed those by hand. After fixing the problem line wraps I was able to compile and run the code (I used std=c++17, which apparently works okay). I did a simple check to make sure the primes being found are right, which they do indeed seem to be. I haven't really experimented with changing the code; I only just ran it with various bounds on what primes to find, mostly to get a sense of the performance. I admit I don't understand the scheme the code is using. It looks to me like the code is doing divisions and remainders a lot, which my code basically doesn't do at all (it does do one division for each prime up to the square root of the limit of primes being sought, but that's all). I know the big template near the beginning is crucial to understanding what the code is doing, but I haven't been able to figure out what abstraction that template is supposed to represent. The core of my program is between 50 and 60 lines, and all pretty straightforward. Is there some suggestion you could make or a question you would like to ask? What can I do that might be helpful?
[toc] | [prev] | [next] | [standalone]
| From | Vir Campestris <vir.campestris@invalid.invalid> |
|---|---|
| Date | 2024-07-02 21:24 +0100 |
| Message-ID | <v61nmg$1pb9h$1@dont-email.me> |
| In reply to | #119596 |
On 02/07/2024 07:20, Tim Rentsch wrote: > Vir Campestris <vir.campestris@invalid.invalid> writes: > >> On 19/06/2024 01:34, Tim Rentsch wrote: >> >>> Would you mind if I asked you to post the code so I can >>> take a look at it? Or if you would rather email it to >>> me that also works. >> >> I've hacked out all my include files, added a GPL licence, and >> included build instructions. It follows in the signature block. >> 600 lines :( >> >> I bet the line wrapping breaks it too. > > I got the code. There were some line wrappings but I just went > through and fixed those by hand. After fixing the problem line > wraps I was able to compile and run the code (I used std=c++17, > which apparently works okay). I did a simple check to make sure > the primes being found are right, which they do indeed seem to > be. I haven't really experimented with changing the code; I > only just ran it with various bounds on what primes to find, > mostly to get a sense of the performance. > > I admit I don't understand the scheme the code is using. It > looks to me like the code is doing divisions and remainders a > lot, which my code basically doesn't do at all (it does do one > division for each prime up to the square root of the limit of > primes being sought, but that's all). I know the big template > near the beginning is crucial to understanding what the code is > doing, but I haven't been able to figure out what abstraction > that template is supposed to represent. The core of my program > is between 50 and 60 lines, and all pretty straightforward. > > Is there some suggestion you could make or a question you would > like to ask? What can I do that might be helpful? The mods and divs should _not_ be in the main path. It would be interesting to see your code. Or for that matter to see a performance report from it. The template is mostly just to allow me to experiment with the size of the stored data. That's not easy with the mod30 scheme, but with the original one I found the code ran faster when I did everything with uint64_t data. Perhaps I could use mod (8*30)! Something that has come to mind as an enhancement - at the moment for larger primes it thinks about all the odd numbers that are a multiple of the prime, and sets them in the mask. It doesn't need to do them all - for each prime there's a set of steps that can be done. For 7, for example, you start at 49, then add at each step 7 multiplied by 4 2 4 2 4 6 2 6. That pattern of 8 repeats infinitely. It's the same for all the other primes, a pattern of 8. Andy
[toc] | [prev] | [next] | [standalone]
| From | Vir Campestris <vir.campestris@invalid.invalid> |
|---|---|
| Date | 2024-07-03 11:25 +0100 |
| Message-ID | <v638ud$25623$1@dont-email.me> |
| In reply to | #119596 |
On 02/07/2024 07:20, Tim Rentsch wrote: > I got the code. There were some line wrappings but I just went > through and fixed those by hand. After fixing the problem line > wraps I was able to compile and run the code (I used std=c++17, > which apparently works okay). I did a simple check to make sure > the primes being found are right, which they do indeed seem to > be. I haven't really experimented with changing the code; I > only just ran it with various bounds on what primes to find, > mostly to get a sense of the performance. > > I admit I don't understand the scheme the code is using. It > looks to me like the code is doing divisions and remainders a > lot, which my code basically doesn't do at all (it does do one > division for each prime up to the square root of the limit of > primes being sought, but that's all). I know the big template > near the beginning is crucial to understanding what the code is > doing, but I haven't been able to figure out what abstraction > that template is supposed to represent. The core of my program > is between 50 and 60 lines, and all pretty straightforward. > > Is there some suggestion you could make or a question you would > like to ask? What can I do that might be helpful? I checked. The modulus operations (%) are easy to check for. There are two for each prime number. Not for each multiple of it, but for the primes themselves. And there's one in the "get" function that checks to see if a number is prime. The complexity is at least partly due to an optimisation. You'll recall perhaps that Bonita's original code preset the entire bitmap to aa - which is faster than setting all the numbers divisible by 2. I pointed out that you don't even need to set them, and set off to write code that stored odd numbers only. But then I realised that if you store the mask for 3 only you get a pattern. It has one odd word at the beginning, then a repeating pattern of 3 words. It doesn't matter what the word size is. This is equally true of 5, 7, 11 and the first few primes. It's also true if you set the mask up for 3 and 5 together - except this time the repeat length is 15. And it's far faster to duplicate those 3, or 15, or 3*5*7, words than to do each prime individually. There comes a point where it isn't worth it though - the repeat length for all the primes up to 31 is 1.5E12. Exactly when it stops being worthwhile isn't obvious. Then you came along with the mod30 suggestion. There is still a repeat length though, and it's still worth merging masks for small primes together rather than doing each one individually. But the code is bigger. MUCH bigger. Andy
[toc] | [prev] | [next] | [standalone]
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2024-07-15 06:15 -0700 |
| Message-ID | <86v8166bl1.fsf@linuxsc.com> |
| In reply to | #119603 |
Vir Campestris <vir.campestris@invalid.invalid> writes:
> On 02/07/2024 07:20, Tim Rentsch wrote:
>
>> I got the code. There were some line wrappings but I just went
>> through and fixed those by hand. After fixing the problem line
>> wraps I was able to compile and run the code (I used std=c++17,
>> which apparently works okay). I did a simple check to make sure
>> the primes being found are right, which they do indeed seem to
>> be. I haven't really experimented with changing the code; I
>> only just ran it with various bounds on what primes to find,
>> mostly to get a sense of the performance.
>>
>> I admit I don't understand the scheme the code is using. It
>> looks to me like the code is doing divisions and remainders a
>> lot, which my code basically doesn't do at all (it does do one
>> division for each prime up to the square root of the limit of
>> primes being sought, but that's all). I know the big template
>> near the beginning is crucial to understanding what the code is
>> doing, but I haven't been able to figure out what abstraction
>> that template is supposed to represent. The core of my program
>> is between 50 and 60 lines, and all pretty straightforward.
>>
>> Is there some suggestion you could make or a question you would
>> like to ask? What can I do that might be helpful?
>
> I checked. The modulus operations (%) are easy to check for.
>
> There are two for each prime number. Not for each multiple of it, but
> for the primes themselves.
>
> And there's one in the "get" function that checks to see if a number
> is prime.
>
> The complexity is at least partly due to an optimisation.
It's taken me a while to get into this. I think I understand
your code fairly well now. My idea is to respond in pieces,
starting with the simpler aspects first.
> You'll recall perhaps that Bonita's original code preset the entire
> bitmap to aa - which is faster than setting all the numbers divisible
> by 2. I pointed out that you don't even need to set them, and set off
> to write code that stored odd numbers only.
>
> But then I realised that if you store the mask for 3 only you get a
> pattern. It has one odd word at the beginning, then a repeating
> pattern of 3 words. It doesn't matter what the word size is.
>
> This is equally true of 5, 7, 11 and the first few primes. It's also
> true if you set the mask up for 3 and 5 together - except this time
> the repeat length is 15.
>
> And it's far faster to duplicate those 3, or 15, or 3*5*7, words than
> to do each prime individually.
>
> There comes a point where it isn't worth it though - the repeat length
> for all the primes up to 31 is 1.5E12.
>
> Exactly when it stops being worthwhile isn't obvious.
>
> Then you came along with the mod30 suggestion. There is still a repeat
> length though, and it's still worth merging masks for small primes
> together rather than doing each one individually. But the code is
> bigger. MUCH bigger.
I am assuming a mod30 representation in all of my followup
comments. Just to be explicit, there is an array of 8-bit
bytes, with bytes[i] indicating primeness or compositeness
for the 8 numbers 30*i + { 1, 7, 11, 13, 17, 19, 23, 29 },
one bit for each of those 8 possibilities. (There may be
optimizations if the array elements are 64-bit quantities
rather than 8-bit quantities, but let's ignore that for
now.)
There are two plausible strategies for "pre-setting" the
multiples of small primes. The first strategy uses two stages.
The first stage makes a mask for the four primes 7, 11, 13, and
17, which has 17017 elements, and simply replicates that mask
over and over through the entire array. The second stage makes a
mask for the three primes 19, 23, and 29, which has 12673
elements, and uses bitwise operations to combine that mask
throughout the primes array. Incidentally the convention I chose
is that a 1 bit means the number is definitely composite, so the
array could be initialized with all zero bits.
The second strategy makes a mask for all seven of 7, 11, 13, 17,
19, 23, and 29, which has 215656441 elements. This mask is then
copied over and over throughout the array. Because 215656441 is
quite a bit larger than your L1 cache number, it might be a win
to divide the mask into strips of size (L1 cache / 2), and copy
each strip separately, to get better cache performance.
Note that for the "setting" step (that is, the second strategy
or the first stage of the first strategy), the mask can be formed
in the initial part of the prime/composite array. After taking
care of all multiples of the seven primes in the initial byte,
the initial byte should be set to indicate that those numbers
are prime, since otherwise they would be set wrongly.
I haven't experimented to see which of the above ideas has better
performance. Like you say it's the small primes that matter,
because they have so many multiples, and it isn't obvious where
the cutoff should be. But it's clear that every prime in the
first byte is worth doing, so you might try playing around with
just that part to see which approach does better. I suggest
using a primes array that is as large as the memory on your test
system can accommodate without swapping.
I might be able to write followon segments at the rate of one per
day or so. I'm not expecting to do them any faster than that,
and very likely will miss some days because of other activities.
[toc] | [prev] | [next] | [standalone]
Page 8 of 11 — ← Prev page 1 … 6 7 [8] 9 10 11 Next page →
Back to top | Article view | comp.lang.c++
csiph-web