Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.c++ > #118058 > unrolled thread

Sieve of Erastosthenes optimized to the max

Started byBonita Montero <Bonita.Montero@gmail.com>
First post2023-12-10 10:46 +0100
Last post2024-02-14 15:57 +0100
Articles 20 on this page of 213 — 14 participants

Back to article view | Back to comp.lang.c++


Contents

  Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-10 10:46 +0100
    Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2023-12-10 21:48 +0000
      Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-11 04:15 +0100
        Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2023-12-11 17:12 +0000
          Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-11 18:19 +0100
            Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2023-12-13 15:16 +0000
              Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2023-12-13 15:25 +0000
                Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2023-12-14 15:06 +0000
              Re: Sieve of Erastosthenes optimized to the max red floyd <no.spam.here@its.invalid> - 2023-12-14 08:20 -0800
                Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-12-23 10:30 -0800
                  Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2023-12-23 21:20 +0000
                    Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-12-24 00:36 -0800
                      Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2023-12-29 18:03 +0000
                        Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-01-13 21:31 -0800
          Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-20 13:44 +0100
            Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2023-12-21 15:30 +0000
              Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-21 17:07 +0100
              Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-21 17:13 +0100
              Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-12-23 10:21 -0800
                Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2023-12-23 21:21 +0000
                  Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-12-24 10:49 -0800
    Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-21 14:23 -0800
      Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-22 04:28 +0100
        Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-21 20:02 -0800
          Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-22 17:55 +0100
            Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-23 12:52 -0800
              Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-24 11:03 +0100
                Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-24 13:24 -0800
                  Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-26 06:00 +0100
                    Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-25 21:39 -0800
                      Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-26 10:27 +0100
                        Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-26 12:24 -0800
                          Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2023-12-26 23:35 +0000
                            Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-26 15:37 -0800
                              Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-26 21:59 -0800
                                Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-27 10:23 +0100
                                  Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2023-12-27 20:49 +0000
                                    Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-28 12:00 +0100
                                      Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-28 15:38 -0800
                                        Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-29 04:17 +0100
                                          Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-28 20:58 -0800
                                            Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-29 10:58 +0100
                                              Re: Sieve of Erastosthenes optimized to the max David Brown <david.brown@hesbynett.no> - 2023-12-29 13:56 +0100
                                                Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-29 17:04 +0100
                                                  Re: Sieve of Erastosthenes optimized to the max David Brown <david.brown@hesbynett.no> - 2023-12-30 19:27 +0100
                                                    Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-31 11:22 +0100
                                                    Re: Sieve of Erastosthenes optimized to the max scott@slp53.sl.home (Scott Lurndal) - 2023-12-31 18:49 +0000
                                                      Re: Sieve of Erastosthenes optimized to the max David Brown <david.brown@hesbynett.no> - 2024-01-01 12:46 +0100
                                              Re: Sieve of Erastosthenes optimized to the max scott@slp53.sl.home (Scott Lurndal) - 2023-12-29 16:01 +0000
                                                Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-29 17:06 +0100
                                                  Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-29 13:45 -0800
                                                Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-29 14:09 -0800
                                                  Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-29 14:12 -0800
                                                    Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-30 05:42 +0100
                                                      Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-29 20:45 -0800
                                                      Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2023-12-30 04:56 +0000
                                                        Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-30 06:09 +0100
                                                          Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2023-12-30 05:51 +0000
                                                            Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-30 10:15 +0100
                                                              Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2023-12-30 20:35 +0000
                                                                Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-31 06:54 +0100
                                                                  Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2023-12-31 07:01 +0000
                                                                    Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-31 11:20 +0100
                                                                      Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2023-12-31 17:30 +0000
                                                                        Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-01 06:21 +0100
                                                          Re: Sieve of Erastosthenes optimized to the max scott@slp53.sl.home (Scott Lurndal) - 2023-12-31 18:44 +0000
                                                            Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-01 06:22 +0100
                                                              Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2024-01-01 08:28 +0000
                                                                Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-01 14:11 +0100
                                                                  Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-01-01 15:36 -0800
                                                    Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2023-12-30 04:51 +0000
                                                      Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-30 12:00 -0800
                                              Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2023-12-29 17:29 +0000
                                                Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-30 05:45 +0100
                                                  Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-30 11:58 -0800
                                                    Re: Sieve of Erastosthenes optimized to the max red floyd <no.spam.here@its.invalid> - 2023-12-30 14:58 -0800
                                                      Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-31 11:49 -0800
                                                    Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-31 06:51 +0100
                                                      Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-31 11:36 -0800
                                                        Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-01 07:28 +0100
                                                          Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-31 22:53 -0800
                                                            Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-01 14:11 +0100
                                                              Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-01-01 15:34 -0800
                                                                Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-02 11:55 +0100
                                                                  Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-01-02 10:38 -0800
                                                                    Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-03 06:48 +0100
                                                                      Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-01-03 13:32 -0800
                                                                        Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-04 04:37 +0100
                                                                          Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-01-05 19:21 -0800
                                                                            Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-06 08:18 +0100
                                                                              Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2024-01-06 08:31 +0000
                                                                                Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-06 10:30 +0100
                                                                                  Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-01-06 13:15 -0800
                                                                                    Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-01-06 13:19 -0800
                                                                                      Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-07 10:14 +0100
                                                                                    Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-07 10:10 +0100
                                                                                      Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-01-07 12:46 -0800
                                                                                        Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-08 06:48 +0100
                                                                                          Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-01-08 12:18 -0800
                                                                                            Re: Sieve of Erastosthenes optimized to the max red floyd <no.spam.here@its.invalid> - 2024-01-08 17:14 -0800
                                                                                              Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-09 07:19 +0100
                                                                                              Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-01-09 23:33 -0800
                                                                                          Re: Sieve of Erastosthenes optimized to the max Kaz Kylheku <433-929-6894@kylheku.com> - 2024-01-09 02:02 +0000
                                                                                            Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-01-09 15:12 +0100
                                                                                              OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Vir Campestris <vir.campestris@invalid.invalid> - 2024-01-29 21:31 +0000
                                                                                                Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-02-16 08:06 -0800
                                                                                                  Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-02-16 18:30 +0100
                                                                                                    Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-02-23 05:51 -0800
                                                                                                      Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-02-24 10:45 +0100
                                                                                                        Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-02-25 00:48 -0800
                                                                                                          Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-02-25 15:51 +0100
                                                                                                            Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-03-11 10:10 -0700
                                                                                                              Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-03-12 10:15 +0100
                                                                                                                Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) wij <wyniijj5@gmail.com> - 2024-03-14 12:44 +0800
                                                                                                                  Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-03-14 07:25 +0100
                                                                                                                    Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) wij <wyniijj5@gmail.com> - 2024-03-14 17:20 +0800
                                                                                                                      Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) wij <wyniijj5@gmail.com> - 2024-03-14 17:35 +0800
                                                                                                                        Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) wij <wyniijj5@gmail.com> - 2024-03-14 17:41 +0800
                                                                                                                          Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-03-14 19:20 +0100
                                                                                                                            Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) wij <wyniijj5@gmail.com> - 2024-03-15 16:30 +0800
                                                                                                                              Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-03-15 11:21 +0100
                                                                                                                                Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) wij <wyniijj5@gmail.com> - 2024-03-15 19:07 +0800
                                                                                                                                  Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-03-15 12:56 +0100
                                                                                                                        Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-03-14 19:20 +0100
                                                                                                                Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-04-20 08:35 -0700
                                                                                                                  Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-04-20 18:34 +0200
                                                                                                                    Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-04-20 18:35 +0200
                                                                                                                    Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-04-24 12:28 -0700
                                                                                                                      Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Bonita Montero <Bonita.Montero@gmail.com> - 2024-04-25 06:19 +0200
                                                                                                                        Re: OT: A better sieve (was Re: Sieve of Erastosthenes optimized to the max) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-04-25 14:14 -0700
                                                      Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-31 11:39 -0800
                                          Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2023-12-29 13:52 -0800
                          Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2023-12-27 06:06 +0100
                            Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-03-22 19:34 -0700
                              Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-03-23 17:54 +0100
                                Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-03-23 14:04 -0700
                                  Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-03-24 07:30 +0100
                                    Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-03-24 12:52 -0700
                                      Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-03-24 21:00 +0100
                                        Re: Sieve of Erastosthenes optimized to the max "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-03-24 13:05 -0700
                                          Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-05-16 17:28 +0100
                                            Re: Sieve of Erastosthenes optimized to the max Ben Bacarisse <ben@bsb.me.uk> - 2024-05-16 21:40 +0100
                                            Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-05-21 19:06 -0700
                                              Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-05-30 12:32 +0100
                                                Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-05-30 14:17 +0200
                                                Re: Sieve of Erastosthenes optimized to the max Paavo Helde <eesnimi@osa.pri.ee> - 2024-05-30 19:55 +0300
                                                  Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-05-31 10:17 +0200
                                                    Re: Sieve of Erastosthenes optimized to the max Paavo Helde <eesnimi@osa.pri.ee> - 2024-05-31 20:52 +0300
                                                Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-05-30 22:17 -0700
                                                  Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-06-01 21:07 +0100
                                                    Re: Sieve of Erastosthenes optimized to the max Richard Damon <richard@damon-family.org> - 2024-06-01 20:43 -0400
                                                    Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-02 03:23 -0700
                                                      Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-02 19:50 -0700
                                                        Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-06-18 20:56 +0100
                                                          Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-18 17:34 -0700
                                                            Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-06-30 21:47 +0100
                                                              Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-07-01 23:20 -0700
                                                                Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-07-02 21:24 +0100
                                                                Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-07-03 11:25 +0100
                                                                  Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-07-15 06:15 -0700
                                                                  Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-07-20 07:41 -0700
                                                                    OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-07-25 12:46 +0100
                                                                      Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-10 07:07 -0700
                                                                        Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-08-12 15:32 +0100
                                                                          Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-16 07:48 -0700
                                                                        Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-08-15 17:52 +0100
                                                                          Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-16 08:40 -0700
                                                                            Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-16 19:35 +0200
                                                                              Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-16 19:55 +0200
                                                                              Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-08-19 21:23 +0100
                                                                                Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-20 17:21 +0200
                                                                                  Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-20 17:24 +0200
                                                                                    Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-20 17:43 +0200
                                                                                    Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-08-20 17:55 +0100
                                                                                      Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-20 18:59 +0200
                                                                                      Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-26 12:08 -0700
                                                                                        Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-27 06:09 +0200
                                                                                          Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-09-01 21:23 +0100
                                                                                            Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-09-01 20:40 -0700
                                                                                              Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-09-02 07:08 +0200
                                                                                              Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-09-03 17:45 +0100
                                                                                                Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-09-28 03:46 -0700
                                                                                                  Re: OT: Re: Sieve of Erastosthenes optimized to the max Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-09-28 13:49 -0700
                                                                                                    Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-10-02 11:44 +0100
                                                                                                      Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-10-02 13:10 +0200
                                                                                                      Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-10-07 08:41 -0700
                                                                                                        Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-10-20 12:44 +0100
                                                                                                          Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-11-04 03:56 -0800
                                                                              Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-08-19 21:34 +0100
                                                                                Re: OT: Re: Sieve of Erastosthenes optimized to the max red floyd <no.spam.here@its.invalid> - 2024-08-19 21:08 -0700
                                                                                  Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-08-20 21:14 +0100
                                                                                    Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-26 09:35 -0700
                                                                                  Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-26 08:31 -0700
                                                                                Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-20 19:20 +0200
                                                                                  Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-20 19:36 +0200
                                                                                  Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-20 19:39 +0200
                                                                                Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-20 20:13 +0200
                                                                                  Re: OT: Re: Sieve of Erastosthenes optimized to the max scott@slp53.sl.home (Scott Lurndal) - 2024-08-20 20:50 +0000
                                                                                    Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-08-22 17:30 +0100
                                                                                      Re: OT: Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-08-22 18:38 +0200
                                                                                        Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-08-22 21:47 +0100
                                                                              Re: OT: Re: Sieve of Erastosthenes optimized to the max Vir Campestris <vir.campestris@invalid.invalid> - 2024-08-22 21:56 +0100
                                                                                Re: OT: Re: Sieve of Erastosthenes optimized to the max red floyd <no.spam.here@its.invalid> - 2024-08-22 17:00 -0700
                                                                                  Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-26 10:59 -0700
                                                                                  Re: OT: Re: Sieve of Erastosthenes optimized to the max Andrey Tarasevich <andreytarasevich@hotmail.com> - 2024-09-28 15:21 -0700
                                                                                Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-26 12:47 -0700
                                                                            Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-18 19:52 -0700
                                                                      Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-10 17:24 -0700
                                                                      Re: OT: Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-08-11 00:00 -0700
                                                                  Re: Sieve of Erastosthenes optimized to the max Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-07-23 07:34 -0700
    Re: Sieve of Erastosthenes optimized to the max wij <wyniijj5@gmail.com> - 2024-02-14 00:15 +0800
      Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-02-13 19:08 +0100
        Re: Sieve of Erastosthenes optimized to the max Bonita Montero <Bonita.Montero@gmail.com> - 2024-02-14 15:57 +0100

Page 9 of 11 — ← Prev page 1 … 7 8 [9] 10 11  Next page →


#119693

FromTim Rentsch <tr.17687@z991.linuxsc.com>
Date2024-07-20 07:41 -0700
Message-ID<861q3o5do1.fsf@linuxsc.com>
In reply to#119603
Vir Campestris <vir.campestris@invalid.invalid> writes:

> On 02/07/2024 07:20, Tim Rentsch wrote:
>
>> I got the code.  [...]
>>
>
> I checked.  The modulus operations (%) are easy to check for.
>
> There are two for each prime number.  Not for each multiple of it, but
> for the primes themselves.
>
> And there's one in the "get" function that checks to see if a number
> is prime.  [...]

And one divide for each modulus.

Both divide and mod can be simplified if we use a different
representation for numbers.  The idea is to separate the
number into a mod 30 part and divided by 30 part.  An
arbitrary number N can be split into an ordered pair

   N/30 , N%30

stored in, let's say, an 8 byte quantity with seven bytes
for N/30 and one byte for N%30.  Let me call these two
parts i and a for one number N, and j and b for a second
number M.  Then two form N*M we need to find

   (i*30+a) * (j*30+b)

which is

   (i*30*j*30) + (i*30*b) + (j*30*a) + a*b

Of course we want to take the product and express it in
the form (product/30, product%30), which can be done by

   30*i*j + i*b + j*a + (a*b)/30,   (a*b)%30

The two numbers a and b are both under 30, so a*b/30 and
a*b%30 can be done by lookup in a small array.

   30*i*j + i*b + j*a + divide_30[a][b],   modulo_30[a][b]

Now all operations can be done using just multiplies and
table lookups.  Dividing by 30 is just taking the first
component;  remainder 30 is just taking the second component.

One further refinement:  for numbers relatively prime to 2,
3, and 5, there are only 8 possibles, so the modulo 30
component can be reduced to just three bits.  That makes the
tables smaller, at a cost of needing a table lookup to find
and 'a' or 'b' part.

Does this all make sense?

[toc] | [prev] | [next] | [standalone]


#119748 — OT: Re: Sieve of Erastosthenes optimized to the max

FromVir Campestris <vir.campestris@invalid.invalid>
Date2024-07-25 12:46 +0100
SubjectOT: Re: Sieve of Erastosthenes optimized to the max
Message-ID<v7tdts$29195$1@dont-email.me>
In reply to#119693
On 23/07/2024 15:34, Tim Rentsch wrote:
 > Vir Campestris <vir.campestris@invalid.invalid> writes:
 >
 > [...]
 >
 > I was hoping for some feedback, even if very brief,
 > before I continue.

Sorry, I was slow. It's really odd, I'm retired, and I ought to have 
lots of time for things like this... and yet I seem to be busy all the time.

On 20/07/2024 15:41, Tim Rentsch wrote:
> Vir Campestris <vir.campestris@invalid.invalid> writes:
> 
>> On 02/07/2024 07:20, Tim Rentsch wrote:
>>
>>> I got the code.  [...]
>>>
>>
>> I checked.  The modulus operations (%) are easy to check for.
>>
>> There are two for each prime number.  Not for each multiple of it, but
>> for the primes themselves.
>>
>> And there's one in the "get" function that checks to see if a number
>> is prime.  [...]
> 
> And one divide for each modulus.

I found searching my code for the '%' character was quite easy.
'/', however, is in all my comments, so I get lots of false positives.

> 
> Both divide and mod can be simplified if we use a different
> representation for numbers.  The idea is to separate the
> number into a mod 30 part and divided by 30 part.  An
> arbitrary number N can be split into an ordered pair
> 
>     N/30 , N%30
> 
> stored in, let's say, an 8 byte quantity with seven bytes
> for N/30 and one byte for N%30.  Let me call these two
> parts i and a for one number N, and j and b for a second
> number M.  Then two form N*M we need to find
> 

I suspect the cost of extracting the divided value from the 7 bytes will 
be prohibitive. You may find it's best to have one table for the N/30 
parts (big entries, word aligned) and another for the N%30 part (small 
entries, byte aligned). BICBW.

>     (i*30+a) * (j*30+b)
> 
> which is
> 
>     (i*30*j*30) + (i*30*b) + (j*30*a) + a*b
> 
> Of course we want to take the product and express it in
> the form (product/30, product%30), which can be done by
> 
>     30*i*j + i*b + j*a + (a*b)/30,   (a*b)%30
> 
> The two numbers a and b are both under 30, so a*b/30 and
> a*b%30 can be done by lookup in a small array.
> 
>     30*i*j + i*b + j*a + divide_30[a][b],   modulo_30[a][b]
> 
> Now all operations can be done using just multiplies and
> table lookups.  Dividing by 30 is just taking the first
> component;  remainder 30 is just taking the second component.
> 
> One further refinement:  for numbers relatively prime to 2,
> 3, and 5, there are only 8 possibles, so the modulo 30
> component can be reduced to just three bits.  That makes the
> tables smaller, at a cost of needing a table lookup to find
> and 'a' or 'b' part.
> 
> Does this all make sense?

Yes, it does. It's not a direction I was thinking of at all, although I 
too had table lookups in mind.

What I had considered is an extrapolation from my original code.

For each prime my original code  starts with p^2, and sets the bit. It 
than adds p*2, and sets the bit for that. Since I wasn't storing even 
numbers every prime is odd, which means p^2 is odd. p^2+p would be even, 
so I skip that and go on to p^2+p*2, and so on.

Given the mod30 storage I should be able to make a table with 8 entries, 
each containing a step and a mask. The step is the number of bytes in 
the big table to step to for the next bit to set; the mask is the bit to 
set.

For 37, for example, the first bits we need to set are these:

Mul'ple	value	/30	Step	%30
37	1369	45		19
41	1517	50	5	17
47	1739	57	7	29
49	1813	60	3	13
53	1961	65	5	11
59	2183	72	7	23
61	2257	75	3	7

67	2479	82	7	19
71	2627	87	5	17
77	2849	94	7	29
79	2923	97	3	13
83	3071	102	5	11
89	3293	109	7	23
91	3367	112	3	7


You'll see the pattern; there is a repeat 8 lines long of both the Step 
(which is the delta in the word we need to access) and the %30 value.

So what the code should do is start at p^2, and build this table. It'll 
be different for every prime, and will be quite expensive to build. But 
once the table is built it can add the step onto the current table 
index, then set the bit corresponding to the %30 value. The table should 
of course contain only the step and the bitmask corresponding to the %30 
value.

(I used 37 in this example BTW, because I wanted a value more than 30, 
and with 31 the /30 column increments by 1 each time I added a prime)

For small values, as we discussed, it's going to be faster to copy the 
small table for that prime than to apply the mask to each word of the 
store individually.

I had also considered looking for a mod-div value greater than 30.

Bonita's original code stored a bit for every prime. We can call that a 
compression ratio of 1. I store the odd only, so the compression ratio 
is 2. By storing mod30 you increase that to 3.75. But there's a cost on 
modern computers because it implies lots of byte level access to store.

If on the other hand we store mod(2*3*5*7*11), which is 2310, we get 480 
candidates. 480 isn't far off 512 which is a nice size for us to handle, 
and gives us an even better compression ratio of just over 4.5. But of 
course all those tables with 8 or 30 entries will be getting a bit big 
and painful.

I might even get around to writing some of it. Be warned, this is a time 
trap!

Andy

[toc] | [prev] | [next] | [standalone]


#119791 — Re: OT: Re: Sieve of Erastosthenes optimized to the max

FromTim Rentsch <tr.17687@z991.linuxsc.com>
Date2024-08-10 07:07 -0700
SubjectRe: OT: Re: Sieve of Erastosthenes optimized to the max
Message-ID<868qx45v5g.fsf@linuxsc.com>
In reply to#119748
Vir Campestris <vir.campestris@invalid.invalid> writes:

> On 23/07/2024 15:34, Tim Rentsch wrote:
>
>> Vir Campestris <vir.campestris@invalid.invalid> writes:
>>
>> [...]
>>
>> I was hoping for some feedback, even if very brief,
>> before I continue.
>
> Sorry, I was slow.  It's really odd, I'm retired, and I ought to
> have lots of time for things like this... and yet I seem to be
> busy all the time.

No worries, I understand.

I'm sorry to be so long in responding.  Among other things
there was some sort of snafu with my news server, resulting
in hundreds of lost and/or duplicated messages, and that
took a while to sort out.

> On 20/07/2024 15:41, Tim Rentsch wrote:
>
>> Vir Campestris <vir.campestris@invalid.invalid> writes:
>>
>>> On 02/07/2024 07:20, Tim Rentsch wrote:
[...]
>> Both divide and mod can be simplified if we use a different
>> representation for numbers.  The idea is to separate the
>> number into a mod 30 part and divided by 30 part.  An
>> arbitrary number N can be split into an ordered pair
>>
>>     N/30 , N%30
>>
>> stored in, let's say, an 8 byte quantity with seven bytes
>> for N/30 and one byte for N%30.  Let me call these two
>> parts i and a for one number N, and j and b for a second
>> number M.  Then two form N*M we need to find
>
> I suspect the cost of extracting the divided value from the 7
> bytes will be prohibitive.  You may find it's best to have one
> table for the N/30 parts (big entries, word aligned) and another
> for the N%30 part (small entries, byte aligned).  BICBW.

Two tables, absolutely.  What does BICBW stand for?

>>     (i*30+a) * (j*30+b)
>>
>> which is
>>
>>     (i*30*j*30) + (i*30*b) + (j*30*a) + a*b
>>
>> Of course we want to take the product and express it in
>> the form (product/30, product%30), which can be done by
>>
>>     30*i*j + i*b + j*a + (a*b)/30,   (a*b)%30
>>
>> The two numbers a and b are both under 30, so a*b/30 and
>> a*b%30 can be done by lookup in a small array.
>>
>>     30*i*j + i*b + j*a + divide_30[a][b],   modulo_30[a][b]
>>
>> Now all operations can be done using just multiplies and
>> table lookups.  Dividing by 30 is just taking the first
>> component;  remainder 30 is just taking the second component.
>>
>> One further refinement:  for numbers relatively prime to 2,
>> 3, and 5, there are only 8 possibles, so the modulo 30
>> component can be reduced to just three bits.  That makes the
>> tables smaller, at a cost of needing a table lookup to find
>> and 'a' or 'b' part.
>>
>> Does this all make sense?
>
> Yes, it does.  It's not a direction I was thinking of at all,
> although I too had table lookups in mind.
>
> What I had considered is an extrapolation from my original code.

I read the comments below in conjunction with getting a better
understanding of your original code, which is part of what took
me so long before responding.  Before I get to the stuff below
(which I think I will save for a second reply), here is some code
expanding on what I said above.  It should compile cleanly as is.
However, its purpose is to illustrate what I said above.

  typedef unsigned long NN, Index, Count, Mask;
  typedef unsigned char Byte;

  static  void    remove_multiples( Index, Count, Index );

  static  Byte    bytes[1000000000];
  static  NN      mods[] = { 1, 7, 11, 13, 17, 19, 23, 29, };

  Count
  sieve( NN limit ){
      Count r = 0;
      Index i;
      Count shift;
      Index i_limit = (limit + 29)/30;

      bytes[0] = 1;
      for(  i = 0;  i < i_limit;  i++  ){
          Byte bits = bytes[i];
          for(  shift = 0;  shift < 8;  shift++  ){
              NN p = i*30 + mods[shift];
              if(  bits & 1<<shift  )  continue;
              if(  p >= limit  )  continue;
              r++;        // count how many primes found
                          // could print the prime 'p' here if desired
              if(  p*p < limit  )  remove_multiples( i, shift, i_limit );
          }
      }

      return  r;
  }

  static  Byte    shift_for[30] = {
      9,0,9,9,9,9,  9,1,9,9,9,2,  9,3,9,9,9,4,  9,5,9,9,9,6,  9,9,9,9,9,7,
  };

  static  Byte  ab_div_30[8][8] = {
   {  1*1/30, 1*7/30, 1*11/30, 1*13/30, 1*17/30, 1*19/30, 1*23/30, 1*29/30 },
   {  7*1/30, 7*7/30, 7*11/30, 7*13/30, 7*17/30, 7*19/30, 7*23/30, 7*29/30 },
   { 11*1/30,11*7/30,11*11/30,11*13/30,11*17/30,11*19/30,11*23/30,11*29/30 },
   { 13*1/30,13*7/30,13*11/30,13*13/30,13*17/30,13*19/30,13*23/30,13*29/30 },
   { 17*1/30,17*7/30,17*11/30,17*13/30,17*17/30,17*19/30,17*23/30,17*29/30 },
   { 19*1/30,19*7/30,19*11/30,19*13/30,19*17/30,19*19/30,19*23/30,19*29/30 },
   { 23*1/30,23*7/30,23*11/30,23*13/30,23*17/30,23*19/30,23*23/30,23*29/30 },
   { 29*1/30,29*7/30,29*11/30,29*13/30,29*17/30,29*19/30,29*23/30,29*29/30 },
  };

  static  Byte  ab_mod_30[8][8] = {
   {  1*1%30, 1*7%30, 1*11%30, 1*13%30, 1*17%30, 1*19%30, 1*23%30, 1*29%30 },
   {  7*1%30, 7*7%30, 7*11%30, 7*13%30, 7*17%30, 7*19%30, 7*23%30, 7*29%30 },
   { 11*1%30,11*7%30,11*11%30,11*13%30,11*17%30,11*19%30,11*23%30,11*29%30 },
   { 13*1%30,13*7%30,13*11%30,13*13%30,13*17%30,13*19%30,13*23%30,13*29%30 },
   { 17*1%30,17*7%30,17*11%30,17*13%30,17*17%30,17*19%30,17*23%30,17*29%30 },
   { 19*1%30,19*7%30,19*11%30,19*13%30,19*17%30,19*19%30,19*23%30,19*29%30 },
   { 23*1%30,23*7%30,23*11%30,23*13%30,23*17%30,23*19%30,23*23%30,23*29%30 },
   { 29*1%30,29*7%30,29*11%30,29*13%30,29*17%30,29*19%30,29*23%30,29*29%30 },
  };

  void
  remove_multiples( Index i, Count shift_a, Index i_limit ){
      Index j, k;
      NN a = mods[shift_a], b, c;
      Count shift;

      for(  shift = shift_a;  shift < 8;  shift++  ){
          j = i;
          b = mods[shift];
          k = i*j*30 + i*b + j*a + ab_div_30[ shift_a ][ shift ];
          c = ab_mod_30[ shift_a ][ shift ];
          if(  k >= i_limit  )  continue;
          bytes[k] |= 1 << shift_for[c];
      }

      for(  j = i+1;  (30UL*i+a)*(30UL*j+1) < i_limit * 30;  j++  ){
          for(  shift = 0;  shift < 8;  shift++  ){
              b = mods[shift];
              k = i*j*30 + i*b + j*a + ab_div_30[ shift_a ][ shift ];
              c = ab_mod_30[ shift_a ][ shift ];
              if(  k >= i_limit  )  continue;
              bytes[k] |= 1 << shift_for[c];
          }
      }
  }

In both sieve() and remove_multiples(), there is an inner loop
that ranges over the possible shift values.  (There is also an
unnested loop in remove_multiples() that ranges over a subset of
shift values, but for now please ignore that.)

Since these loops always perform exactly eight iterations, they
can be completely unrolled without expanding the code too much.
Here is a revised version of the outer loop in sieve() (actually
the code below is not yet ready for compilation, because some
things have changed):

    for(  i = 0;  i < i_limit;  i++  ){
        Byte bits = bytes[i] ^ 0xFF;
        if(  bits & 0001  )  remove_multiples( i_limit, i,  1 ),  r++;
        if(  bits & 0002  )  remove_multiples( i_limit, i,  7 ),  r++;
        if(  bits & 0004  )  remove_multiples( i_limit, i, 11 ),  r++;
        if(  bits & 0010  )  remove_multiples( i_limit, i, 13 ),  r++;
        if(  bits & 0020  )  remove_multiples( i_limit, i, 17 ),  r++;
        if(  bits & 0040  )  remove_multiples( i_limit, i, 19 ),  r++;
        if(  bits & 0100  )  remove_multiples( i_limit, i, 23 ),  r++;
        if(  bits & 0200  )  remove_multiples( i_limit, i, 29 ),  r++;
    }

We changed the interface to remove_multiples() to pass along the
index value 'i' and also the value for 'a' rather than the value
calculated for the prime 'p'.  Since the loop has been unrolled,
the values for the 'a' arguments are all constants (and of course
all the mask values are constants).

After making a similar change to the remove_multiples() loop, the
values for the 'b's are likewise all constants.  If the optimizer
is smart enough, all the 'a*b/30' values and 'a*b%30' values turn
into constants.  Also the expressions 'i*b' and 'j*a' all turn
into multiplication by a constant.  Furthermore, to determine the
appropriate bit value to 'or' in, an eight-choice conditional
expression that tests each of the eight possible residues can
similarly be optimized away so the masks to 'or' in are all
constants.  Do you see how that all works?

The point of all this unrolling is to get rid of all the shifting
and table lookups.  The only variables involved are the loop
variables 'i' and 'j';  everything else has been turned into a
constant.  Granted, the generated code has been expanded a great
deal (basically a factor of 8*8 == 64), but the pieces are now
much simpler so the result is still manageable.

I wanted to be sure you understand how the original idea is
transformed into a better performing version before posting
something more refined.  I get that your time is at a premium,
still you might trying going through the exercise of unrolling
the loops and writing a set_product() routine (or macro) to get a
sense of where I'm going here.

Hope to send response to send part sometime soon.

[toc] | [prev] | [next] | [standalone]


#119803 — Re: OT: Re: Sieve of Erastosthenes optimized to the max

FromVir Campestris <vir.campestris@invalid.invalid>
Date2024-08-12 15:32 +0100
SubjectRe: OT: Re: Sieve of Erastosthenes optimized to the max
Message-ID<v9d6e1$3avh0$2@dont-email.me>
In reply to#119791
(replying to two messages at once)

On 10/08/2024 15:07, Tim Rentsch wrote:
>> I suspect the cost of extracting the divided value from the 7
>> bytes will be prohibitive.  You may find it's best to have one
>> table for the N/30 parts (big entries, word aligned) and another
>> for the N%30 part (small entries, byte aligned).  BICBW.
> Two tables, absolutely.  What does BICBW stand for?

But I Could Be Wrong. I thought that one was fairly well known.

On 11/08/2024 08:00, Tim Rentsch wrote:
 > Note that the %30 tables can be shared.  There are 8 of them, one
 > for each of the eight residues mod 30 of the original prime.

I started coding that up. Then it occurred to me that there wasn't going 
to be much saving by storing a pointer (8 bytes on my machine) to the 
correct 8-byte long table. I suppose I could store a byte array index 
instead, but I've gone off in other directions.

You've picked up my cache management tweak.

I'll read through your code. But it might not be today. We've got the 
hottest day of the year here, and I don't have aircon in my garden 
office. In winter the computer and me are enough to keep it warm, but 
right now it's getting warmer by the minute.

Andy

[toc] | [prev] | [next] | [standalone]


#119815 — Re: OT: Re: Sieve of Erastosthenes optimized to the max

FromTim Rentsch <tr.17687@z991.linuxsc.com>
Date2024-08-16 07:48 -0700
SubjectRe: OT: Re: Sieve of Erastosthenes optimized to the max
Message-ID<86ikw0le1k.fsf@linuxsc.com>
In reply to#119803
Vir Campestris <vir.campestris@invalid.invalid> writes:

> (replying to two messages at once)
>
> On 10/08/2024 15:07, Tim Rentsch wrote:
>
>>> I suspect the cost of extracting the divided value from the 7
>>> bytes will be prohibitive.  You may find it's best to have one
>>> table for the N/30 parts (big entries, word aligned) and another
>>> for the N%30 part (small entries, byte aligned).  BICBW.
>>
>> Two tables, absolutely.  What does BICBW stand for?
>
> But I Could Be Wrong.  I thought that one was fairly well known.

Ah.  No wonder I didn't know it. ;)

> On 11/08/2024 08:00, Tim Rentsch wrote:
>
>> Note that the %30 tables can be shared.  There are 8 of them, one
>> for each of the eight residues mod 30 of the original prime.
>
> I started coding that up.  Then it occurred to me that there wasn't
> going to be much saving by storing a pointer (8 bytes on my machine)
> to the correct 8-byte long table.  I suppose I could store a byte array
> index instead, but I've gone off in other directions.

Right, I was meaning keeping track of the index, not a pointer.

> You've picked up my cache management tweak.

It's an interesting idea, although I'm not sure how well it
integrates with what I've done.

> I'll read through your code.  But it might not be today.  We've got the
> hottest day of the year here, and I don't have aircon in my garden
> office.  In winter the computer and me are enough to keep it warm, but
> right now it's getting warmer by the minute.

My sympathies.  I too find it hard to work when it's too hot.

I see you have started looking at my code... I'll take a look.

[toc] | [prev] | [next] | [standalone]


#119813 — Re: OT: Re: Sieve of Erastosthenes optimized to the max

FromVir Campestris <vir.campestris@invalid.invalid>
Date2024-08-15 17:52 +0100
SubjectRe: OT: Re: Sieve of Erastosthenes optimized to the max
Message-ID<v9lbns$11alj$1@dont-email.me>
In reply to#119791
I'm going to come back to this slightly earlier version, and add 
comments inline.

On 10/08/2024 15:07, Tim Rentsch wrote:
> Vir Campestris <vir.campestris@invalid.invalid> writes:
> 
>> On 23/07/2024 15:34, Tim Rentsch wrote:
>>
>>> Vir Campestris <vir.campestris@invalid.invalid> writes:
>>>
>>> [...]
>>>
>>> I was hoping for some feedback, even if very brief,
>>> before I continue.
>>
>> Sorry, I was slow.  It's really odd, I'm retired, and I ought to
>> have lots of time for things like this... and yet I seem to be
>> busy all the time.
> 
> No worries, I understand.
> 
> I'm sorry to be so long in responding.  Among other things
> there was some sort of snafu with my news server, resulting
> in hundreds of lost and/or duplicated messages, and that
> took a while to sort out.
> 

I think that was Eternal September - he had an outage.

>> On 20/07/2024 15:41, Tim Rentsch wrote:
>>
>>> Vir Campestris <vir.campestris@invalid.invalid> writes:
>>>
>>>> On 02/07/2024 07:20, Tim Rentsch wrote:
> [...]
>>> Both divide and mod can be simplified if we use a different
>>> representation for numbers.  The idea is to separate the
>>> number into a mod 30 part and divided by 30 part.  An
>>> arbitrary number N can be split into an ordered pair
>>>
>>>      N/30 , N%30
>>>
>>> stored in, let's say, an 8 byte quantity with seven bytes
>>> for N/30 and one byte for N%30.  Let me call these two
>>> parts i and a for one number N, and j and b for a second
>>> number M.  Then two form N*M we need to find
>>
>> I suspect the cost of extracting the divided value from the 7
>> bytes will be prohibitive.  You may find it's best to have one
>> table for the N/30 parts (big entries, word aligned) and another
>> for the N%30 part (small entries, byte aligned).  BICBW.
> 
> Two tables, absolutely.  What does BICBW stand for?
> 
>>>      (i*30+a) * (j*30+b)
>>>
>>> which is
>>>
>>>      (i*30*j*30) + (i*30*b) + (j*30*a) + a*b
>>>
>>> Of course we want to take the product and express it in
>>> the form (product/30, product%30), which can be done by
>>>
>>>      30*i*j + i*b + j*a + (a*b)/30,   (a*b)%30
>>>
>>> The two numbers a and b are both under 30, so a*b/30 and
>>> a*b%30 can be done by lookup in a small array.
>>>
>>>      30*i*j + i*b + j*a + divide_30[a][b],   modulo_30[a][b]
>>>
>>> Now all operations can be done using just multiplies and
>>> table lookups.  Dividing by 30 is just taking the first
>>> component;  remainder 30 is just taking the second component.
>>>
>>> One further refinement:  for numbers relatively prime to 2,
>>> 3, and 5, there are only 8 possibles, so the modulo 30
>>> component can be reduced to just three bits.  That makes the
>>> tables smaller, at a cost of needing a table lookup to find
>>> and 'a' or 'b' part.
>>>
>>> Does this all make sense?
>>
>> Yes, it does.  It's not a direction I was thinking of at all,
>> although I too had table lookups in mind.
>>
>> What I had considered is an extrapolation from my original code.
> 
> I read the comments below in conjunction with getting a better
> understanding of your original code, which is part of what took
> me so long before responding.  Before I get to the stuff below
> (which I think I will save for a second reply), here is some code
> expanding on what I said above.  It should compile cleanly as is.
> However, its purpose is to illustrate what I said above.
> 
>    typedef unsigned long NN, Index, Count, Mask;
>    typedef unsigned char Byte;
> 
I've been using uint64_t, not unsigned long. When I started C unsigned 
long was 32 bits; I'm never quite sure what you'll get.

>    static  void    remove_multiples( Index, Count, Index );
> 
>    static  Byte    bytes[1000000000];
>    static  NN      mods[] = { 1, 7, 11, 13, 17, 19, 23, 29, };
> 
>    Count
>    sieve( NN limit ){
>        Count r = 0;
>        Index i;
>        Count shift;
>        Index i_limit = (limit + 29)/30;
> 
>        bytes[0] = 1;
>        for(  i = 0;  i < i_limit;  i++  ){
>            Byte bits = bytes[i];
>            for(  shift = 0;  shift < 8;  shift++  ){
>                NN p = i*30 + mods[shift];
>                if(  bits & 1<<shift  )  continue;
>                if(  p >= limit  )  continue;
>                r++;        // count how many primes found
>                            // could print the prime 'p' here if desired
>                if(  p*p < limit  )  remove_multiples( i, shift, i_limit );
>            }
>        }
> 
>        return  r;
>    }
> 
>    static  Byte    shift_for[30] = {
>        9,0,9,9,9,9,  9,1,9,9,9,2,  9,3,9,9,9,4,  9,5,9,9,9,6,  9,9,9,9,9,7,
>    };
> 
>    static  Byte  ab_div_30[8][8] = {
>     {  1*1/30, 1*7/30, 1*11/30, 1*13/30, 1*17/30, 1*19/30, 1*23/30, 1*29/30 },
>     {  7*1/30, 7*7/30, 7*11/30, 7*13/30, 7*17/30, 7*19/30, 7*23/30, 7*29/30 },
>     { 11*1/30,11*7/30,11*11/30,11*13/30,11*17/30,11*19/30,11*23/30,11*29/30 },
>     { 13*1/30,13*7/30,13*11/30,13*13/30,13*17/30,13*19/30,13*23/30,13*29/30 },
>     { 17*1/30,17*7/30,17*11/30,17*13/30,17*17/30,17*19/30,17*23/30,17*29/30 },
>     { 19*1/30,19*7/30,19*11/30,19*13/30,19*17/30,19*19/30,19*23/30,19*29/30 },
>     { 23*1/30,23*7/30,23*11/30,23*13/30,23*17/30,23*19/30,23*23/30,23*29/30 },
>     { 29*1/30,29*7/30,29*11/30,29*13/30,29*17/30,29*19/30,29*23/30,29*29/30 },
>    };
> 
>    static  Byte  ab_mod_30[8][8] = {
>     {  1*1%30, 1*7%30, 1*11%30, 1*13%30, 1*17%30, 1*19%30, 1*23%30, 1*29%30 },
>     {  7*1%30, 7*7%30, 7*11%30, 7*13%30, 7*17%30, 7*19%30, 7*23%30, 7*29%30 },
>     { 11*1%30,11*7%30,11*11%30,11*13%30,11*17%30,11*19%30,11*23%30,11*29%30 },
>     { 13*1%30,13*7%30,13*11%30,13*13%30,13*17%30,13*19%30,13*23%30,13*29%30 },
>     { 17*1%30,17*7%30,17*11%30,17*13%30,17*17%30,17*19%30,17*23%30,17*29%30 },
>     { 19*1%30,19*7%30,19*11%30,19*13%30,19*17%30,19*19%30,19*23%30,19*29%30 },
>     { 23*1%30,23*7%30,23*11%30,23*13%30,23*17%30,23*19%30,23*23%30,23*29%30 },
>     { 29*1%30,29*7%30,29*11%30,29*13%30,29*17%30,29*19%30,29*23%30,29*29%30 },
>    };
> 

constexpr on those tables might give the optimiser something else to 
play with.

>    void
>    remove_multiples( Index i, Count shift_a, Index i_limit ){

So i is (prime/30) and shift_a is the bit in the mask byte - 
corresponding to (prime%30)?

>        Index j, k;
>        NN a = mods[shift_a], b, c;
>        Count shift;
> 
>        for(  shift = shift_a;  shift < 8;  shift++  ){
>            j = i;
>            b = mods[shift];
>            k = i*j*30 + i*b + j*a + ab_div_30[ shift_a ][ shift ];
>            c = ab_mod_30[ shift_a ][ shift ];
>            if(  k >= i_limit  )  continue;
>            bytes[k] |= 1 << shift_for[c];
>        }
> 
>        for(  j = i+1;  (30UL*i+a)*(30UL*j+1) < i_limit * 30;  j++  ){
>            for(  shift = 0;  shift < 8;  shift++  ){
>                b = mods[shift];
>                k = i*j*30 + i*b + j*a + ab_div_30[ shift_a ][ shift ];
>                c = ab_mod_30[ shift_a ][ shift ];
>                if(  k >= i_limit  )  continue;
>                bytes[k] |= 1 << shift_for[c];
>            }
>        }
>    }
> 
> In both sieve() and remove_multiples(), there is an inner loop
> that ranges over the possible shift values.  (There is also an
> unnested loop in remove_multiples() that ranges over a subset of
> shift values, but for now please ignore that.)
> 
> Since these loops always perform exactly eight iterations, they
> can be completely unrolled without expanding the code too much.
> Here is a revised version of the outer loop in sieve() (actually
> the code below is not yet ready for compilation, because some
> things have changed):
> 
>      for(  i = 0;  i < i_limit;  i++  ){
>          Byte bits = bytes[i] ^ 0xFF;
>          if(  bits & 0001  )  remove_multiples( i_limit, i,  1 ),  r++;
>          if(  bits & 0002  )  remove_multiples( i_limit, i,  7 ),  r++;
>          if(  bits & 0004  )  remove_multiples( i_limit, i, 11 ),  r++;
>          if(  bits & 0010  )  remove_multiples( i_limit, i, 13 ),  r++;
>          if(  bits & 0020  )  remove_multiples( i_limit, i, 17 ),  r++;
>          if(  bits & 0040  )  remove_multiples( i_limit, i, 19 ),  r++;
>          if(  bits & 0100  )  remove_multiples( i_limit, i, 23 ),  r++;
>          if(  bits & 0200  )  remove_multiples( i_limit, i, 29 ),  r++;
>      }

Before unrolling always benchmark. (Modern compilers are pretty smart, 
and sometimes will unroll for you. Modern processors with branch 
prediction are pretty good about working out when to jump and when not 
to. And with speculative execution they sometimes run both paths from a 
branch... I could go on!)
> 
> We changed the interface to remove_multiples() to pass along the
> index value 'i' and also the value for 'a' rather than the value
> calculated for the prime 'p'.  Since the loop has been unrolled,
> the values for the 'a' arguments are all constants (and of course
> all the mask values are constants).
> 
> After making a similar change to the remove_multiples() loop, the
> values for the 'b's are likewise all constants.  If the optimizer
> is smart enough, all the 'a*b/30' values and 'a*b%30' values turn
> into constants.  Also the expressions 'i*b' and 'j*a' all turn
> into multiplication by a constant.  Furthermore, to determine the
> appropriate bit value to 'or' in, an eight-choice conditional
> expression that tests each of the eight possible residues can
> similarly be optimized away so the masks to 'or' in are all
> constants.  Do you see how that all works?
> 
> The point of all this unrolling is to get rid of all the shifting
> and table lookups.  The only variables involved are the loop
> variables 'i' and 'j';  everything else has been turned into a
> constant.  Granted, the generated code has been expanded a great
> deal (basically a factor of 8*8 == 64), but the pieces are now
> much simpler so the result is still manageable.
> 
And now a few minutes later as I read that I understand the reason for 
the unrolling better. But my point still stands - benchmark it.

Of course the decision on whether to unroll or not might depend on the 
target CPU!

> I wanted to be sure you understand how the original idea is
> transformed into a better performing version before posting
> something more refined.  I get that your time is at a premium,
> still you might trying going through the exercise of unrolling
> the loops and writing a set_product() routine (or macro) to get a
> sense of where I'm going here.
> 
> Hope to send response to send part sometime soon.

You seem to have spotted a pattern in the steps that I didn't - I looked 
at the mask values, and decided it wasn't worth working out which of the 
8 possible sequences to use - remembering which was too expensive. But 
it's also (I think, I haven't coded it) to work out whether you need to 
step 2, 4, 8 times the prime. And that would save more memory.

Another thought BTW - whether it's worth storing with a larger modulo - 
say 2*3*5*7 - not just 2*3*5 is a different decision from deciding 
whether the steps between primes should take that into consideration.

Andy

[toc] | [prev] | [next] | [standalone]


#119816 — Re: OT: Re: Sieve of Erastosthenes optimized to the max

FromTim Rentsch <tr.17687@z991.linuxsc.com>
Date2024-08-16 08:40 -0700
SubjectRe: OT: Re: Sieve of Erastosthenes optimized to the max
Message-ID<86r0aojx1m.fsf@linuxsc.com>
In reply to#119813
Vir Campestris <vir.campestris@invalid.invalid> writes:

> I'm going to come back to this slightly earlier version, and add
> comments inline.

Okay good deal.  I'm taking out some parts that look like they
are no longer relevant.

> On 10/08/2024 15:07, Tim Rentsch wrote:
>
>> Vir Campestris <vir.campestris@invalid.invalid> writes:
>>
>>> On 20/07/2024 15:41, Tim Rentsch wrote:
>>>
>>>> Vir Campestris <vir.campestris@invalid.invalid> writes:
>>>>
>>>>> On 02/07/2024 07:20, Tim Rentsch wrote:
>>
>> [...]
>>
>>>> Both divide and mod can be simplified if we use a different
>>>> representation for numbers.  The idea is to separate the
>>>> number into a mod 30 part and divided by 30 part.  An
>>>> arbitrary number N can be split into an ordered pair
>>>>
>>>>      N/30 , N%30
>>>>
>>>> stored in, let's say, an 8 byte quantity with seven bytes
>>>> for N/30 and one byte for N%30.  Let me call these two
>>>> parts i and a for one number N, and j and b for a second
>>>> number M.  Then two form N*M we need to find
>>>>
>>>>      (i*30+a) * (j*30+b)
>>>>
>>>> which is
>>>>
>>>>      (i*30*j*30) + (i*30*b) + (j*30*a) + a*b
>>>>
>>>> Of course we want to take the product and express it in
>>>> the form (product/30, product%30), which can be done by
>>>>
>>>>      30*i*j + i*b + j*a + (a*b)/30,   (a*b)%30
>>>>
>>>> The two numbers a and b are both under 30, so a*b/30 and
>>>> a*b%30 can be done by lookup in a small array.
>>>>
>>>>      30*i*j + i*b + j*a + divide_30[a][b],   modulo_30[a][b]
>>>>
>>>> Now all operations can be done using just multiplies and
>>>> table lookups.  Dividing by 30 is just taking the first
>>>> component;  remainder 30 is just taking the second component.
>>>>
>>>> One further refinement:  for numbers relatively prime to 2,
>>>> 3, and 5, there are only 8 possibles, so the modulo 30
>>>> component can be reduced to just three bits.  That makes the
>>>> tables smaller, at a cost of needing a table lookup to find
>>>> and 'a' or 'b' part.
>>>>
>>>> Does this all make sense?
>>>
>>> Yes, it does.  It's not a direction I was thinking of at all,
>>> although I too had table lookups in mind.
>>>
>>> What I had considered is an extrapolation from my original code.
>>
>> I read the comments below in conjunction with getting a better
>> understanding of your original code, which is part of what took
>> me so long before responding.  Before I get to the stuff below
>> (which I think I will save for a second reply), here is some code
>> expanding on what I said above.  It should compile cleanly as is.
>> However, its purpose is to illustrate what I said above.
>>
>>    typedef unsigned long NN, Index, Count, Mask;
>>    typedef unsigned char Byte;
>
> I've been using uint64_t, not unsigned long.  When I started C
> unsigned long was 32 bits;  I'm never quite sure what you'll get.

I'm coming around to the point of view that size_t is the best
choice for these types.  In any case though the point of using a
typedef is so the more meaningful names can be used but still
changed easily where they are defined.

>>    static  void    remove_multiples( Index, Count, Index );
>>
>>    static  Byte    bytes[1000000000];
>>    static  NN      mods[] = { 1, 7, 11, 13, 17, 19, 23, 29, };
>>
>>    Count
>>    sieve( NN limit ){
>>        Count r = 0;
>>        Index i;
>>        Count shift;
>>        Index i_limit = (limit + 29)/30;
>>
>>        bytes[0] = 1;
>>        for(  i = 0;  i < i_limit;  i++  ){
>>            Byte bits = bytes[i];
>>            for(  shift = 0;  shift < 8;  shift++  ){
>>                NN p = i*30 + mods[shift];
>>                if(  bits & 1<<shift  )  continue;
>>                if(  p >= limit  )  continue;
>>                r++;        // count how many primes found
>>                            // could print the prime 'p' here if desired
>>                if(  p*p < limit  )  remove_multiples( i, shift, i_limit );
>>            }
>>        }
>>
>>        return  r;
>>    }
>>
>>    static  Byte    shift_for[30] = {
>>        9,0,9,9,9,9,  9,1,9,9,9,2,  9,3,9,9,9,4,  9,5,9,9,9,6,  9,9,9,9,9,7,
>>    };
>>
>>  static  Byte  ab_div_30[8][8] = {
>>   {  1*1/30, 1*7/30, 1*11/30, 1*13/30, 1*17/30, 1*19/30, 1*23/30, 1*29/30 },
>>   {  7*1/30, 7*7/30, 7*11/30, 7*13/30, 7*17/30, 7*19/30, 7*23/30, 7*29/30 },
>>   { 11*1/30,11*7/30,11*11/30,11*13/30,11*17/30,11*19/30,11*23/30,11*29/30 },
>>   { 13*1/30,13*7/30,13*11/30,13*13/30,13*17/30,13*19/30,13*23/30,13*29/30 },
>>   { 17*1/30,17*7/30,17*11/30,17*13/30,17*17/30,17*19/30,17*23/30,17*29/30 },
>>   { 19*1/30,19*7/30,19*11/30,19*13/30,19*17/30,19*19/30,19*23/30,19*29/30 },
>>   { 23*1/30,23*7/30,23*11/30,23*13/30,23*17/30,23*19/30,23*23/30,23*29/30 },
>>   { 29*1/30,29*7/30,29*11/30,29*13/30,29*17/30,29*19/30,29*23/30,29*29/30 },
>>  };
>>
>>  static  Byte  ab_mod_30[8][8] = {
>>   {  1*1%30, 1*7%30, 1*11%30, 1*13%30, 1*17%30, 1*19%30, 1*23%30, 1*29%30 },
>>   {  7*1%30, 7*7%30, 7*11%30, 7*13%30, 7*17%30, 7*19%30, 7*23%30, 7*29%30 },
>>   { 11*1%30,11*7%30,11*11%30,11*13%30,11*17%30,11*19%30,11*23%30,11*29%30 },
>>   { 13*1%30,13*7%30,13*11%30,13*13%30,13*17%30,13*19%30,13*23%30,13*29%30 },
>>   { 17*1%30,17*7%30,17*11%30,17*13%30,17*17%30,17*19%30,17*23%30,17*29%30 },
>>   { 19*1%30,19*7%30,19*11%30,19*13%30,19*17%30,19*19%30,19*23%30,19*29%30 },
>>   { 23*1%30,23*7%30,23*11%30,23*13%30,23*17%30,23*19%30,23*23%30,23*29%30 },
>>   { 29*1%30,29*7%30,29*11%30,29*13%30,29*17%30,29*19%30,29*23%30,29*29%30 },
>>  };
>
> constexpr on those tables might give the optimiser something else to
> play with.

All of these tables are going to disappear in the next revision.

>>    void
>>    remove_multiples( Index i, Count shift_a, Index i_limit ){
>
> So i is (prime/30) and shift_a is the bit in the mask byte -
> corresponding to (prime%30)?

Right.  The value 'a' is 'mods[shift_a]', and the value 'shift_a' is
'shift_for[a]'.  Again though we will be dispensing with shift_for[]
and put in code that can be compiled to give constant values.

>>        Index j, k;
>>        NN a = mods[shift_a], b, c;
>>        Count shift;
>>
>>        for(  shift = shift_a;  shift < 8;  shift++  ){
>>            j = i;
>>            b = mods[shift];
>>            k = i*j*30 + i*b + j*a + ab_div_30[ shift_a ][ shift ];
>>            c = ab_mod_30[ shift_a ][ shift ];
>>            if(  k >= i_limit  )  continue;
>>            bytes[k] |= 1 << shift_for[c];
>>        }
>>
>>        for(  j = i+1;  (30UL*i+a)*(30UL*j+1) < i_limit * 30;  j++  ){
>>            for(  shift = 0;  shift < 8;  shift++  ){
>>                b = mods[shift];
>>                k = i*j*30 + i*b + j*a + ab_div_30[ shift_a ][ shift ];
>>                c = ab_mod_30[ shift_a ][ shift ];
>>                if(  k >= i_limit  )  continue;
>>                bytes[k] |= 1 << shift_for[c];
>>            }
>>        }
>>    }
>>
>> In both sieve() and remove_multiples(), there is an inner loop
>> that ranges over the possible shift values.  (There is also an
>> unnested loop in remove_multiples() that ranges over a subset of
>> shift values, but for now please ignore that.)
>>
>> Since these loops always perform exactly eight iterations, they
>> can be completely unrolled without expanding the code too much.
>> Here is a revised version of the outer loop in sieve() (actually
>> the code below is not yet ready for compilation, because some
>> things have changed):
>>
>>      for(  i = 0;  i < i_limit;  i++  ){
>>          Byte bits = bytes[i] ^ 0xFF;
>>          if(  bits & 0001  )  remove_multiples( i_limit, i,  1 ),  r++;
>>          if(  bits & 0002  )  remove_multiples( i_limit, i,  7 ),  r++;
>>          if(  bits & 0004  )  remove_multiples( i_limit, i, 11 ),  r++;
>>          if(  bits & 0010  )  remove_multiples( i_limit, i, 13 ),  r++;
>>          if(  bits & 0020  )  remove_multiples( i_limit, i, 17 ),  r++;
>>          if(  bits & 0040  )  remove_multiples( i_limit, i, 19 ),  r++;
>>          if(  bits & 0100  )  remove_multiples( i_limit, i, 23 ),  r++;
>>          if(  bits & 0200  )  remove_multiples( i_limit, i, 29 ),  r++;
>>      }
>
> Before unrolling always benchmark.  (Modern compilers are pretty smart,
> and sometimes will unroll for you.  Modern processors with branch
> prediction are pretty good about working out when to jump and when not
> to.  And with speculative execution they sometimes run both paths from
> a branch... I could go on!)

I understand the advice.  Keep in mind though that the code I'm
showing (and also true in the next version) is presented as a way of
explaining how I got to where I got.  I didn't start with the loop
version and unroll it;  everything was unrolled right from the get
go.  And there are more revisions to come, even after the next one
(to be shown below).

>> We changed the interface to remove_multiples() to pass along the
>> index value 'i' and also the value for 'a' rather than the value
>> calculated for the prime 'p'.  Since the loop has been unrolled,
>> the values for the 'a' arguments are all constants (and of course
>> all the mask values are constants).
>>
>> After making a similar change to the remove_multiples() loop, the
>> values for the 'b's are likewise all constants.  If the optimizer
>> is smart enough, all the 'a*b/30' values and 'a*b%30' values turn
>> into constants.  Also the expressions 'i*b' and 'j*a' all turn
>> into multiplication by a constant.  Furthermore, to determine the
>> appropriate bit value to 'or' in, an eight-choice conditional
>> expression that tests each of the eight possible residues can
>> similarly be optimized away so the masks to 'or' in are all
>> constants.  Do you see how that all works?
>>
>> The point of all this unrolling is to get rid of all the shifting
>> and table lookups.  The only variables involved are the loop
>> variables 'i' and 'j';  everything else has been turned into a
>> constant.  Granted, the generated code has been expanded a great
>> deal (basically a factor of 8*8 == 64), but the pieces are now
>> much simpler so the result is still manageable.
>
> And now a few minutes later as I read that I understand the reason
> for the unrolling better.  But my point still stands - benchmark
> it.
>
> Of course the decision on whether to unroll or not might depend on
> the target CPU!

For this approach to work it's crucial to unroll completely to make
sure all the divides and remainders, and also the mask choices, get
optimized out.  I suppose it's possible that some architecture out
there could be faster without all constant folding, but it seems
unlikely, and in any case I made an early decision to write the code
using this approach, which has worked out well.

>> I wanted to be sure you understand how the original idea is
>> transformed into a better performing version before posting
>> something more refined.  I get that your time is at a premium,
>> still you might trying going through the exercise of unrolling
>> the loops and writing a set_product() routine (or macro) to get a
>> sense of where I'm going here.
>>
>> Hope to send response to send part sometime soon.
>
> You seem to have spotted a pattern in the steps that I didn't - I
> looked at the mask values, and decided it wasn't worth working out
> which of the 8 possible sequences to use - remembering which was too
> expensive.  But it's also (I think, I haven't coded it) to work out
> whether you need to step 2, 4, 8 times the prime.  And that would save
> more memory.

The possible steps for mod30 are 2 or 4 or 6 times the prime.

> Another thought BTW - whether it's worth storing with a larger modulo
> - say 2*3*5*7 - not just 2*3*5 is a different decision from deciding
> whether the steps between primes should take that into consideration.

My guess is that the possible steps are still just 2 or 4 or 6
(times the prime), but I haven't checked that.

Now here is revised code where the unrolling has been done, and also
uses optimizable mask production:

void
mod30_sieve_x( Index N, NN limit, Index i_limit ){
  Index  i;

    bytes[0] = 1;
    for(  i = 0;  i < i_limit;  i++  ){
	Bits cbits = bytes[i];
	Bits pbits = cbits ^ -1;
	if(  pbits & 0001  )  remove_multiples_x( N, i,  1, pbits );
	if(  pbits & 0002  )  remove_multiples_x( N, i,  7, pbits );
	if(  pbits & 0004  )  remove_multiples_x( N, i, 11, pbits );
	if(  pbits & 0010  )  remove_multiples_x( N, i, 13, pbits );
	if(  pbits & 0020  )  remove_multiples_x( N, i, 17, pbits );
	if(  pbits & 0040  )  remove_multiples_x( N, i, 19, pbits );
	if(  pbits & 0100  )  remove_multiples_x( N, i, 23, pbits );
	if(  pbits & 0200  )  remove_multiples_x( N, i, 29, pbits );
    }
}


#define	set_product_x( i, a, j, b ) (					\
    bytes[ i*j*30 + i*b + j*a + a*b/30 ] |= BIT_FOR( (a*b%30) )		\
)

#define BIT_FOR( m ) (							     \
  m ==  1 ? 0001  :  m ==  7 ? 0002  :  m == 11 ? 0004  :  m == 13 ? 0010  : \
  m == 17 ? 0020  :  m == 19 ? 0040  :  m == 23 ? 0100  :  m == 29 ? 0200  : \
  0									     \
)

void
remove_multiples_x( Index N, Index i, NN a, Bits ibits ){
  NN	 p	=  30*i + a;
  Index	 j;
  Index	 jn	=  (N*30/p + 29)/30;

    switch(  a  ){
      case  1:	set_product_x( i, a, i,  1 );
      case  7:	set_product_x( i, a, i,  7 );
      case 11:	set_product_x( i, a, i, 11 );
      case 13:	set_product_x( i, a, i, 13 );
      case 17:	set_product_x( i, a, i, 17 );
      case 19:	set_product_x( i, a, i, 19 );
      case 23:	set_product_x( i, a, i, 23 );
      case 29:	set_product_x( i, a, i, 29 );
    }

    for(  j = i+1;  j < jn;  j++  ){
	set_product_x( i, a, j,  1 );
	set_product_x( i, a, j,  7 );
	set_product_x( i, a, j, 11 );
	set_product_x( i, a, j, 13 );
	set_product_x( i, a, j, 17 );
	set_product_x( i, a, j, 19 );
	set_product_x( i, a, j, 23 );
	set_product_x( i, a, j, 29 );
    }
}


Some comments about the upper bounds of the loops:

First, assume that 'i_limit' has been computed in conjunction with
the determination of the parameter N so that it is exactly right.

Second, the initializing expression for 'jn' is safe but may miss
part of one more value of j.  There are different ways to take care
of that, but it's a minor detail so I'm skipping over it for now.
Of course feel free to think about how that could be done if you
want to, and if not then no worries, I expect to say more about that
in a later followup.

[toc] | [prev] | [next] | [standalone]


#119817 — Re: OT: Re: Sieve of Erastosthenes optimized to the max

FromBonita Montero <Bonita.Montero@gmail.com>
Date2024-08-16 19:35 +0200
SubjectRe: OT: Re: Sieve of Erastosthenes optimized to the max
Message-ID<v9o2kf$1gqv1$1@raubtier-asyl.eternal-september.org>
In reply to#119816
Am 16.08.2024 um 17:40 schrieb Tim Rentsch:

>      bytes[0] = 1;
>      for(  i = 0;  i < i_limit;  i++  ){
> 	Bits cbits = bytes[i];
> 	Bits pbits = cbits ^ -1;
> 	if(  pbits & 0001  )  remove_multiples_x( N, i,  1, pbits );
> 	if(  pbits & 0002  )  remove_multiples_x( N, i,  7, pbits );
> 	if(  pbits & 0004  )  remove_multiples_x( N, i, 11, pbits );
> 	if(  pbits & 0010  )  remove_multiples_x( N, i, 13, pbits );
> 	if(  pbits & 0020  )  remove_multiples_x( N, i, 17, pbits );
> 	if(  pbits & 0040  )  remove_multiples_x( N, i, 19, pbits );
> 	if(  pbits & 0100  )  remove_multiples_x( N, i, 23, pbits );
> 	if(  pbits & 0200  )  remove_multiples_x( N, i, 29, pbits );
>      }
> }

use a countr_zero() with a following switch-statement and make the
default std::unreachable() (C++23). Then everything is only a table
lookup. With your code you've got a lot of bad predictible branches.

But basically I don't think it is a good idea to skip numbers exept
multiples of two. With the three you save a sixth of memory, with
the five you save a 15-th and at the end you get about 20% less
storage (1 / (2 * 3) + 1 / (2 * 3 * 5) + 1 / (2 * 3 * 5 * 7) ...)
for a lot of computation. That's the point where I dropped this
idea and I think this extra computation is higher than the time
for the saved memory loads.

[toc] | [prev] | [next] | [standalone]


#119818 — Re: OT: Re: Sieve of Erastosthenes optimized to the max

FromBonita Montero <Bonita.Montero@gmail.com>
Date2024-08-16 19:55 +0200
SubjectRe: OT: Re: Sieve of Erastosthenes optimized to the max
Message-ID<v9o3q1$1h24f$1@raubtier-asyl.eternal-september.org>
In reply to#119817
Am 16.08.2024 um 19:35 schrieb Bonita Montero:

> But basically I don't think it is a good idea to skip numbers exept
> multiples of two. With the three you save a sixth of memory, with
> the five you save a 15-th and at the end you get about 20% less
> storage (1 / (2 * 3) + 1 / (2 * 3 * 5) + 1 / (2 * 3 * 5 * 7) ...)
> for a lot of computation. That's the point where I dropped this
> idea and I think this extra computation is higher than the time
> for the saved memory loads.
> 

This calculates the percentage of the memory-access-savings:

#include <iostream>
#include <span>
#include <algorithm>

using namespace std;

int main()
{
	static unsigned const rawPrimes[] =
	{
		2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 
67, 71,
		73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151,
		157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233,
		239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317,
		331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419,
		421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503,
		509, 521, 523, 541
	};
	span primes( rawPrimes );
	double sum = 0.0;
	auto
		rEnd = primes.rend(),
		cur = rEnd - 2;
	do
	{
		double add = 1.0;
		for_each( cur, rEnd, [&]( int p )
			{ add /= p; } );
		sum += add;
	} while( --cur != primes.rbegin() );
	cout << 100 * sum << "%" << endl;
}

[toc] | [prev] | [next] | [standalone]


#119823 — Re: OT: Re: Sieve of Erastosthenes optimized to the max

FromVir Campestris <vir.campestris@invalid.invalid>
Date2024-08-19 21:23 +0100
SubjectRe: OT: Re: Sieve of Erastosthenes optimized to the max
Message-ID<va09jq$30cvv$1@dont-email.me>
In reply to#119817
On 16/08/2024 18:35, Bonita Montero wrote:
<snip>
> But basically I don't think it is a good idea to skip numbers exept
> multiples of two. With the three you save a sixth of memory, with
> the five you save a 15-th and at the end you get about 20% less
> storage (1 / (2 * 3) + 1 / (2 * 3 * 5) + 1 / (2 * 3 * 5 * 7) ...)
> for a lot of computation. That's the point where I dropped this
> idea and I think this extra computation is higher than the time
> for the saved memory loads.
> 
It's not just storage you save, it's also computation.

That program I published up-thread is almost as fast as yours - within 
10% of the elapsed time - while only using a single core. The amount of 
CPU used is much lower.

Andy

[toc] | [prev] | [next] | [standalone]


#119832 — Re: OT: Re: Sieve of Erastosthenes optimized to the max

FromBonita Montero <Bonita.Montero@gmail.com>
Date2024-08-20 17:21 +0200
SubjectRe: OT: Re: Sieve of Erastosthenes optimized to the max
Message-ID<va2ca2$3e60e$1@raubtier-asyl.eternal-september.org>
In reply to#119823
Am 19.08.2024 um 22:23 schrieb Vir Campestris:
> On 16/08/2024 18:35, Bonita Montero wrote:
> <snip>
>> But basically I don't think it is a good idea to skip numbers exept
>> multiples of two. With the three you save a sixth of memory, with
>> the five you save a 15-th and at the end you get about 20% less
>> storage (1 / (2 * 3) + 1 / (2 * 3 * 5) + 1 / (2 * 3 * 5 * 7) ...)
>> for a lot of computation. That's the point where I dropped this
>> idea and I think this extra computation is higher than the time
>> for the saved memory loads.
>>
> It's not just storage you save, it's also computation.
> 
> That program I published up-thread is almost as fast as yours - within 
> 10% of the elapsed time - while only using a single core. The amount of 
> CPU used is much lower.
> 
> Andy

On my AMD 7990X all primes up to 2 ^ 32 are calculated with my code
with a 65W-setting of the CPU in 0.168s. The total CPU-time is 3.688s.
With your code all primes up to 2 ^ 32 take nearly exactly four seconds
with a single thread. So the overall computation time with my algorithm
is about seconds less.

[toc] | [prev] | [next] | [standalone]


#119833 — Re: OT: Re: Sieve of Erastosthenes optimized to the max

FromBonita Montero <Bonita.Montero@gmail.com>
Date2024-08-20 17:24 +0200
SubjectRe: OT: Re: Sieve of Erastosthenes optimized to the max
Message-ID<va2cfr$3e9l1$1@raubtier-asyl.eternal-september.org>
In reply to#119832
Am 20.08.2024 um 17:21 schrieb Bonita Montero:
> Am 19.08.2024 um 22:23 schrieb Vir Campestris:
>> On 16/08/2024 18:35, Bonita Montero wrote:
>> <snip>
>>> But basically I don't think it is a good idea to skip numbers exept
>>> multiples of two. With the three you save a sixth of memory, with
>>> the five you save a 15-th and at the end you get about 20% less
>>> storage (1 / (2 * 3) + 1 / (2 * 3 * 5) + 1 / (2 * 3 * 5 * 7) ...)
>>> for a lot of computation. That's the point where I dropped this
>>> idea and I think this extra computation is higher than the time
>>> for the saved memory loads.
>>>
>> It's not just storage you save, it's also computation.
>>
>> That program I published up-thread is almost as fast as yours - within 
>> 10% of the elapsed time - while only using a single core. The amount 
>> of CPU used is much lower.
>>
>> Andy
> 
> On my AMD 7990X all primes up to 2 ^ 32 are calculated with my code
> with a 65W-setting of the CPU in 0.168s. The total CPU-time is 3.688s.
> With your code all primes up to 2 ^ 32 take nearly exactly four seconds
> with a single thread. So the overall computation time with my algorithm
> is about seconds less.

And if I run my code with a single core:

	C:\Users\Boni\Documents\Source\bitmapSieve>timep 
"x64\Release-clang++\bitmapSieve.exe" 0x100000000 "" 1
	real   1.795s
	user   1.766s
	sys    0.000s
	cylces 8.011.804.500

Thats less than half the computation time.

[toc] | [prev] | [next] | [standalone]


#119834 — Re: OT: Re: Sieve of Erastosthenes optimized to the max

FromBonita Montero <Bonita.Montero@gmail.com>
Date2024-08-20 17:43 +0200
SubjectRe: OT: Re: Sieve of Erastosthenes optimized to the max
Message-ID<va2dik$3eemd$1@raubtier-asyl.eternal-september.org>
In reply to#119833
Am 20.08.2024 um 17:24 schrieb Bonita Montero:

>      C:\Users\Boni\Documents\Source\bitmapSieve>timep "x64\Release- 
> clang++\bitmapSieve.exe" 0x100000000 "" 1
>      real   1.795s
>      user   1.766s
>      sys    0.000s
>      cylces 8.011.804.500
> 
> Thats less than half the computation time.

And printing the primees to a file is rather slow with your code
as you use C++ stream I/O.

	C:\Users\Boni\Documents\Source\test>timep x64\Release-clang-cl\test.exe 
4294967296 primes.txt
	tinyCount 7 smallCount 10 bigCache 0x80000
	real   41.756s
	user   36.734s
	sys    4.734s
	cylces 186.843.020.132

This is my code:

	C:\Users\Boni\Documents\Source\bitmapSieve>timep 
"x64\Release-clang++\bitmapSieve.exe" 4294967296 primes.txt
	real   6.460s
	user   5.828s
	sys    4.344s
	cylces 46.660.685.085



[toc] | [prev] | [next] | [standalone]


#119835 — Re: OT: Re: Sieve of Erastosthenes optimized to the max

FromVir Campestris <vir.campestris@invalid.invalid>
Date2024-08-20 17:55 +0100
SubjectRe: OT: Re: Sieve of Erastosthenes optimized to the max
Message-ID<va2hq5$3f1gl$1@dont-email.me>
In reply to#119833
On 20/08/2024 16:24, Bonita Montero wrote:
> Am 20.08.2024 um 17:21 schrieb Bonita Montero:
>> Am 19.08.2024 um 22:23 schrieb Vir Campestris:
>>> On 16/08/2024 18:35, Bonita Montero wrote:
>>> <snip>
>>>> But basically I don't think it is a good idea to skip numbers exept
>>>> multiples of two. With the three you save a sixth of memory, with
>>>> the five you save a 15-th and at the end you get about 20% less
>>>> storage (1 / (2 * 3) + 1 / (2 * 3 * 5) + 1 / (2 * 3 * 5 * 7) ...)
>>>> for a lot of computation. That's the point where I dropped this
>>>> idea and I think this extra computation is higher than the time
>>>> for the saved memory loads.
>>>>
>>> It's not just storage you save, it's also computation.
>>>
>>> That program I published up-thread is almost as fast as yours - 
>>> within 10% of the elapsed time - while only using a single core. The 
>>> amount of CPU used is much lower.
>>>
>>> Andy
>>
>> On my AMD 7990X all primes up to 2 ^ 32 are calculated with my code
>> with a 65W-setting of the CPU in 0.168s. The total CPU-time is 3.688s.
>> With your code all primes up to 2 ^ 32 take nearly exactly four seconds
>> with a single thread. So the overall computation time with my algorithm
>> is about seconds less.
> 
> And if I run my code with a single core:
> 
>      C:\Users\Boni\Documents\Source\bitmapSieve>timep 
> "x64\Release-clang++\bitmapSieve.exe" 0x100000000 "" 1
>      real   1.795s
>      user   1.766s
>      sys    0.000s
>      cylces 8.011.804.500
> 
> Thats less than half the computation time.

That's really odd. I too have an AMD - although my AMD Ryzen 5 3400G is 
chosen for low power, not speed. Surely it can't be because I use Linux, 
not Windows?

I'm leaning towards thinking you may have a point on the mod30 code. 
There are more operations on the innermost loop with mod30, although the 
loop goes around fewer times. It also means that the store is forced to 
be byte wide - I find that my original odd-only code is significantly 
faster - about 30% - when the store is 64 bit wide rather than byte.

Speed of writing the primes out? I don't care. The only reason I put it 
in there was to allow me to check the output. If I did care I'd be 
putting a more efficient enumeration function too.

Andy

[toc] | [prev] | [next] | [standalone]


#119836 — Re: OT: Re: Sieve of Erastosthenes optimized to the max

FromBonita Montero <Bonita.Montero@gmail.com>
Date2024-08-20 18:59 +0200
SubjectRe: OT: Re: Sieve of Erastosthenes optimized to the max
Message-ID<va2i1h$3f3tc$1@raubtier-asyl.eternal-september.org>
In reply to#119835
Am 20.08.2024 um 18:55 schrieb Vir Campestris:

> That's really odd. I too have an AMD - although my AMD Ryzen 5 3400G is 
> chosen for low power, not speed. Surely it can't be because I use Linux, 
> not Windows?

My code spends its time almost only in userspace. The compiler is clang
17.0.3, the clang-version currently shipped as an alternative to MSVC
with Visual Studio.

> 
> I'm leaning towards thinking you may have a point on the mod30 code. 
> There are more operations on the innermost loop with mod30, although the 
> loop goes around fewer times. It also means that the store is forced to 
> be byte wide - I find that my original odd-only code is significantly 
> faster - about 30% - when the store is 64 bit wide rather than byte.

Make the algorithm cache-aware, i.e. calculate chunks which fit into
the L2-cache.

> Speed of writing the primes out? I don't care. The only reason I put it 
> in there was to allow me to check the output. If I did care I'd be 
> putting a more efficient enumeration function too.

My output code isn't even parallelized and writes only a gigabyte per
second to the SSD.

[toc] | [prev] | [next] | [standalone]


#119915 — Re: OT: Re: Sieve of Erastosthenes optimized to the max

FromTim Rentsch <tr.17687@z991.linuxsc.com>
Date2024-08-26 12:08 -0700
SubjectRe: OT: Re: Sieve of Erastosthenes optimized to the max
Message-ID<86v7zn3xwk.fsf@linuxsc.com>
In reply to#119835
Vir Campestris <vir.campestris@invalid.invalid> writes:

> On 20/08/2024 16:24, Bonita Montero wrote:
>
>> Am 20.08.2024 um 17:21 schrieb Bonita Montero:
>>
>>> Am 19.08.2024 um 22:23 schrieb Vir Campestris:
>>>
>>>> On 16/08/2024 18:35, Bonita Montero wrote:
>>>> <snip>
>>>>
>>>>> But basically I don't think it is a good idea to skip numbers
>>>>> exept multiples of two.  [...] I think this extra computation
>>>>> is higher than the time for the saved memory loads.

[...]

> I'm leaning towards thinking you may have a point on the mod30
> code.  There are more operations on the innermost loop with mod30,
> although the loop goes around fewer times.  It also means that the
> store is forced to be byte wide - I find that my original odd-only
> code is significantly faster - about 30% - when the store is 64
> bit wide rather than byte.  [...]

One motivation for choosing a mod30 representation is being able to
compute more primes -- almost twice as many.  Any machine with 64GB
of memory should be able to compute primes up to 1.5 trillion.
Even if an odds-only representation runs faster for more limited
sizes (and I'm not yet convinced that it does), it doesn't matter
if it's faster, because it doesn't do the job.  I was computing all
32-bit primes 20 years ago, back when 32-bit machines were a thing.
The challenge for today's world is correspondingly higher.

[toc] | [prev] | [next] | [standalone]


#119918 — Re: OT: Re: Sieve of Erastosthenes optimized to the max

FromBonita Montero <Bonita.Montero@gmail.com>
Date2024-08-27 06:09 +0200
SubjectRe: OT: Re: Sieve of Erastosthenes optimized to the max
Message-ID<vajji3$2rbid$1@raubtier-asyl.eternal-september.org>
In reply to#119915
Am 26.08.2024 um 21:08 schrieb Tim Rentsch:

> One motivation for choosing a mod30 representation is being able to
> compute more primes -- almost twice as many.  Any machine with 64GB
> of memory should be able to compute primes up to 1.5 trillion.

You don't need to hold all primes in memory. Just calculate the primes
up to the square root and then you can calculate every segment up to
the maximum from that.

[toc] | [prev] | [next] | [standalone]


#119986 — Re: OT: Re: Sieve of Erastosthenes optimized to the max

FromVir Campestris <vir.campestris@invalid.invalid>
Date2024-09-01 21:23 +0100
SubjectRe: OT: Re: Sieve of Erastosthenes optimized to the max
Message-ID<vb2if8$1jqts$2@dont-email.me>
In reply to#119918
On 27/08/2024 05:09, Bonita Montero wrote:
> Am 26.08.2024 um 21:08 schrieb Tim Rentsch:
> 
>> One motivation for choosing a mod30 representation is being able to
>> compute more primes -- almost twice as many.  Any machine with 64GB
>> of memory should be able to compute primes up to 1.5 trillion.
> 
> You don't need to hold all primes in memory. Just calculate the primes
> up to the square root and then you can calculate every segment up to
> the maximum from that.

So you compute all the primes up to the square root of a larger number.

I was cleaning up my odds-only code with a view to posting it, when I 
had an idea.

Take 101 (because I can write things more easily).

I want to mark

10201,10302,10403,10504,10605,10706...

With odds only that becomes
10201,10403,10605,10807,11009,11211...

and as the loop increment is a constant it is nice and fast. With mod30 
it isn't a constant, and it hurts to work out the increment.

Except...

If I mark 10201,13231,16261,19291,22321
with an increment of 30*p,

then go back to 10201, add 2p, then increment that by 30
10201	13231	16261
and do that for the other 6 of the 8, my inner loop has a constant 
increment, and my code is compatible with the mod30 store.

That might be quicker. I also have some thoughts over storing the prime 
as two parts. It's 30m+r, where m is modulus and r remainder. r maps 
one-for-one into the byte mask, and m is the index.

I'm going to waste lots more time on this I can see!

Andy

[toc] | [prev] | [next] | [standalone]


#119987 — Re: OT: Re: Sieve of Erastosthenes optimized to the max

FromTim Rentsch <tr.17687@z991.linuxsc.com>
Date2024-09-01 20:40 -0700
SubjectRe: OT: Re: Sieve of Erastosthenes optimized to the max
Message-ID<86tteyrad0.fsf@linuxsc.com>
In reply to#119986
Vir Campestris <vir.campestris@invalid.invalid> writes:

> On 27/08/2024 05:09, Bonita Montero wrote:
>
>> Am 26.08.2024 um 21:08 schrieb Tim Rentsch:
>>
>>> One motivation for choosing a mod30 representation is being able to
>>> compute more primes -- almost twice as many.  Any machine with 64GB
>>> of memory should be able to compute primes up to 1.5 trillion.
>>
>> You don't need to hold all primes in memory.  Just calculate the primes
>> up to the square root and then you can calculate every segment up to
>> the maximum from that.

This is a brainless statement (and incidentally illustrates why I
was motivated not to read postings from BM).  In fact no primes need
be pre-calculated to determine whether any given number is prime.
The point of using a sieve is the sieve method is much faster than
not using one.  For example, suppose we want to determine all primes
less than a trillion.  Taking the approach of pre-computing only
those primes less than a million (the square root) and then testing
numbers individually takes more than 100 times as many operations as
using a sieve.  Furthermore the operations used are more expensive
for the non-sieve approach - simply setting a bit in the case of a
sieve, versus computing a remainder in the non-sieve case.

> So you compute all the primes up to the square root of a larger number.
>
> I was cleaning up my odds-only code with a view to posting it, when I
> had an idea.
>
> Take 101 (because I can write things more easily).
>
> I want to mark
>
> 10201,10302,10403,10504,10605,10706...
>
> With odds only that becomes
> 10201,10403,10605,10807,11009,11211...

Right.  Marking of multiples needs to start only at p*p, not
at p+2.

> and as the loop increment is a constant it is nice and fast.  With
> mod30 it isn't a constant, and it hurts to work out the increment.
>
> Except...
>
> If I mark 10201,13231,16261,19291,22321
> with an increment of 30*p,
>
> then go back to 10201, add 2p, then increment that by 30
> 10201	13231	16261
> and do that for the other 6 of the 8, my inner loop has a constant
> increment, and my code is compatible with the mod30 store.

Yes, reordering the loops this way might very well make things
work better.  It also has the property, I believe, that the
bit to set is the same in each of the eight outer loops, and
so can be computed only once for each of the eight outer loops.
Very good!

> That might be quicker.  I also have some thoughts over storing the
> prime as two parts.  It's 30m+r, where m is modulus and r remainder. r
> maps one-for-one into the byte mask, and m is the index.

This is what I've been saying all along!

> I'm going to waste lots more time on this I can see!

I'm interested to see what you come up with.

[toc] | [prev] | [next] | [standalone]


#119988 — Re: OT: Re: Sieve of Erastosthenes optimized to the max

FromBonita Montero <Bonita.Montero@gmail.com>
Date2024-09-02 07:08 +0200
SubjectRe: OT: Re: Sieve of Erastosthenes optimized to the max
Message-ID<vb3h8k$1rp0b$1@raubtier-asyl.eternal-september.org>
In reply to#119987
Am 02.09.2024 um 05:40 schrieb Tim Rentsch:

> This is a brainless statement (and incidentally illustrates why I
> was motivated not to read postings from BM).  In fact no primes need
> be pre-calculated to determine whether any given number is prime.
> The point of using a sieve is the sieve method is much faster than
> not using one.  For example, suppose we want to determine all primes
> less than a trillion.  Taking the approach of pre-computing only
> those primes less than a million (the square root) and then testing
> numbers individually takes more than 100 times as many operations as
> using a sieve.  Furthermore the operations used are more expensive
> for the non-sieve approach - simply setting a bit in the case of a
> sieve, versus computing a remainder in the non-sieve case.


I don't wanted to say that each numbe is indivually sieved against all
primes up to the square root but a range. I do that myself to speed up
the computation.

[toc] | [prev] | [next] | [standalone]


Page 9 of 11 — ← Prev page 1 … 7 8 [9] 10 11  Next page →

Back to top | Article view | comp.lang.c++


csiph-web