| Started by | Tristan Wibberley <tristan.wibberley+netnews2@alumni.manchester.ac.uk> |
|---|---|
| First post | 2026-02-18 11:21 +0000 |
| Last post | 2026-02-24 18:00 +0000 |
| Articles | 19 — 7 participants |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: srand(0) Tristan Wibberley <tristan.wibberley+netnews2@alumni.manchester.ac.uk> - 2026-02-18 11:21 +0000
Re: srand(0) David Brown <david.brown@hesbynett.no> - 2026-02-19 10:01 +0100
Re: srand(0) James Kuyper <jameskuyper@alumni.caltech.edu> - 2026-02-19 14:33 -0500
Re: srand(0) David Brown <david.brown@hesbynett.no> - 2026-02-19 20:47 +0100
Re: srand(0) James Kuyper <jameskuyper@alumni.caltech.edu> - 2026-02-20 16:01 -0500
Re: srand(0) David Brown <david.brown@hesbynett.no> - 2026-02-21 11:09 +0100
Re: srand(0) Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2026-02-19 14:39 -0800
Re: srand(0) David Brown <david.brown@hesbynett.no> - 2026-02-20 09:16 +0100
Re: srand(0) Paul <nospam@needed.invalid> - 2026-02-23 08:32 -0500
Re: srand(0) David Brown <david.brown@hesbynett.no> - 2026-02-23 16:05 +0100
Re: srand(0) Michael S <already5chosen@yahoo.com> - 2026-02-23 19:59 +0200
Re: srand(0) David Brown <david.brown@hesbynett.no> - 2026-02-23 20:06 +0100
Re: srand(0) Paul <nospam@needed.invalid> - 2026-02-23 15:24 -0500
Re: srand(0) Axel Reichert <mail@axel-reichert.de> - 2026-02-24 07:08 +0100
Re: srand(0) David Brown <david.brown@hesbynett.no> - 2026-02-24 10:24 +0100
Re: srand(0) Axel Reichert <mail@axel-reichert.de> - 2026-02-26 19:13 +0100
Re: srand(0) Michael S <already5chosen@yahoo.com> - 2026-02-24 18:36 +0200
Re: srand(0) Axel Reichert <mail@axel-reichert.de> - 2026-02-24 20:00 +0100
Re: srand(0) Tristan Wibberley <tristan.wibberley+netnews2@alumni.manchester.ac.uk> - 2026-02-24 18:00 +0000
| From | Tristan Wibberley <tristan.wibberley+netnews2@alumni.manchester.ac.uk> |
|---|---|
| Date | 2026-02-18 11:21 +0000 |
| Subject | Re: srand(0) |
| Message-ID | <10n47bi$2io53$3@dont-email.me> |
On 18/02/2026 07:47, Tim Rentsch wrote: > The key property of a (pseudo) random number generator is that the > values produced exhibit no discernible pattern. For a PRNG, they exhibit the pattern of following the sequence of the PRNG! Is it that, for any finite sequence of numbers from a PRNG, without information about where it came from and how many numbers came before you can't predict the next number better than chance? -- Tristan Wibberley The message body is Copyright (C) 2026 Tristan Wibberley except citations and quotations noted. All Rights Reserved except that you may, of course, cite it academically giving credit to me, distribute it verbatim as part of a usenet system or its archives, and use it to promote my greatness and general superiority without misrepresentation of my opinions other than my opinion of my greatness and general superiority which you _may_ misrepresent. You definitely MAY NOT train any production AI system with it but you may train experimental AI that will only be used for evaluation of the AI methods it implements.
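A minimal sketch, not taken from the thread, of the point raised here: a seeded PRNG replays exactly the same sequence, so the "pattern" is fully determined by the seed. It uses Python's standard `random` module as a stand-in for `srand`/`rand`; the helper name `first_values` is made up for illustration.

```python
import random


def first_values(seed, count=5):
    """Return the first `count` outputs of a generator seeded with `seed`."""
    rng = random.Random(seed)  # independent generator, analogous to srand(seed)
    return [rng.randrange(100) for _ in range(count)]


if __name__ == "__main__":
    print(first_values(0))
    print(first_values(0))  # same seed, so the same values as the line above
    print(first_values(1))  # different seed, different sequence
```

Anyone who knows the generator and the seed can predict every value; the question in the post is what an observer can do without that knowledge.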
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2026-02-19 10:01 +0100 |
| Message-ID | <10n6jgg$3bj7e$1@dont-email.me> |
| In reply to | #34096 |
On 18/02/2026 12:21, Tristan Wibberley wrote:
> On 18/02/2026 07:47, Tim Rentsch wrote:
>
>> The key property of a (pseudo) random number generator is that the
>> values produced exhibit no discernible pattern.
>
> For a PRNG, they exhibit the pattern of following the sequence of the PRNG!
>
As a deterministic function, a PRNG will obviously follow the pattern of its generating function. But the aim is to have no /discernible/ pattern. The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 has no pattern that could be identified without knowledge of where they came from - and thus no way to predict the next number, 9, in the sequence. But there is a pattern there - it's the 90th - 100th digits of the decimal expansion of pi.
> Is it that, for any finite sequence of numbers from a PRNG, without
> information about where it came from and how many numbers came before
> you can't predict the next number better than chance?
>
That's the general aim, yes. But Michael is absolutely correct that only the consumer can say what they want to measure in order to judge the quality of any piece of code. It is the customer that gives the requirement specifications, and the programmer's job is to write code that fulfils those specifications. PRNGs are no different. (In practice, many customers need help figuring out what their requirements are, and how to express those, but that's another matter.)
In the case of PRNGs, there are many possible requirements beyond the "it's hard to predict the next number in the sequence". These include:
* Simplicity of implementation
* Cryptographic security of implementation
* Running speed
* Statistical distribution of values (with many possible patterns, and consideration of length of samples)
* Repeat cycle length
* Psychological factors (sometimes you roll five sixes in a row, but that might look like a loaded die. Randomised playlists often use modifications to their PRNGs to avoid repetition of songs, and plotting random points in a 2-D space does not look "random" to most people)
* Interaction with added entropy sources
As with most requirements for most software, turning most of these into some kind of directly and objectively measurable "quality" function is difficult or impossible in practice. As Michael says, the only thing you can do is when a consumer complains that it is not good enough for their purposes, ask them how to identify when it would be good enough.
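The pi example can be checked with a short sketch. This is an illustration under stated assumptions, not something from the thread: it uses Machin's formula with plain integer arithmetic, and the names `arctan_inv` and `pi_decimals` are hypothetical. Counting from the first digit after the decimal point, the nine quoted digits sit at positions 91 to 99 and the following 9 at position 100.

```python
def arctan_inv(x, unity):
    """Return arctan(1/x) scaled by `unity`, using the alternating Taylor series."""
    total = term = unity // x
    x_squared = x * x
    n = 3
    sign = -1
    while term:
        term //= x_squared            # unity / x**n, truncated
        total += sign * (term // n)
        n += 2
        sign = -sign
    return total


def pi_decimals(count, guard=10):
    """Return the first `count` decimal digits of pi after the point, as a string."""
    unity = 10 ** (count + guard)     # guard digits absorb truncation error
    # Machin: pi = 16*arctan(1/5) - 4*arctan(1/239)
    pi_scaled = 4 * (4 * arctan_inv(5, unity) - arctan_inv(239, unity))
    return str(pi_scaled // 10 ** guard)[1:]   # drop the leading "3"


if __name__ == "__main__":
    digits = pi_decimals(100)
    print(digits[90:99], "followed by", digits[99])
```

The digits are trivially reproducible once the source is known, which is exactly the distinction between having a pattern and having a discernible one.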
| From | James Kuyper <jameskuyper@alumni.caltech.edu> |
|---|---|
| Date | 2026-02-19 14:33 -0500 |
| Message-ID | <10n7oie$289ca$2@dont-email.me> |
| In reply to | #34097 |
On 2026-02-19 04:01, David Brown wrote:
> On 18/02/2026 12:21, Tristan Wibberley wrote:
>> On 18/02/2026 07:47, Tim Rentsch wrote:
>>
>>> The key property of a (pseudo) random number generator is that the
>>> values produced exhibit no discernible pattern.
>>
>> For a PRNG, they exhibit the pattern of following the sequence of the PRNG!
>>
>
> As a deterministic function, a PRNG will obviously follow the pattern of
> its generating function. But the aim is to have no /discernible/
> pattern. The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 has no pattern that
> could be identified without knowledge of where they came from - and thus
> no way to predict the next number, 9, in the sequence. But there is a
> pattern there - it's the 90th - 100th digits of the decimal expansion of pi.
I think you're being overoptimistic. I suspect that the pattern could be identified, exactly, without knowing how it was generated. That's because every possible pattern has infinitely many different ways in which it can be produced. One of those other ways might be easier to describe than the way in which the numbers were actually produced, in which case that simpler way might be guessed more easily than the actual one - possibly a lot more easily.
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2026-02-19 20:47 +0100 |
| Message-ID | <10n7pct$3puc5$1@dont-email.me> |
| In reply to | #34098 |
On 19/02/2026 20:33, James Kuyper wrote: > On 2026-02-19 04:01, David Brown wrote: >> On 18/02/2026 12:21, Tristan Wibberley wrote: >>> On 18/02/2026 07:47, Tim Rentsch wrote: >>> >>>> The key property of a (pseudo) random number generator is that the >>>> values produced exhibit no discernible pattern. >>> >>> For a PRNG, they exhibit the pattern of following the sequence of the PRNG! >>> >> >> As a deterministic function, a PRNG will obviously follow the pattern of >> its generating function. But the aim is to have no /discernible/ >> pattern. The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 has no pattern that >> could be identified without knowledge of where they came from - and thus >> no way to predict the next number, 9, in the sequence. But there is a >> pattern there - it's the 90th - 100th digits of the decimal expansion of pi. > > I think you're being overoptimistic. I suspect that the pattern could be > identified, exactly, without knowing how it was generated. That's > because every possible pattern has infinitely many different ways in > which in can be produced. One of those other ways might be easier to > describe than the way in which the numbers were actually produced, in > which case that simpler way might be guessed more easily that the actual > one - possibly a lot easier. How likely is it that someone would guess a formula that happened to generate the decimal digits of pi, without more knowledge than a part of the sequence? I don't believe it is possible to quantify such a probability, but I would expect it to be very low.
| From | James Kuyper <jameskuyper@alumni.caltech.edu> |
|---|---|
| Date | 2026-02-20 16:01 -0500 |
| Message-ID | <10nai2e$289cb$1@dont-email.me> |
| In reply to | #34099 |
On 2026-02-19 14:47, David Brown wrote: ... > How likely is it that someone would guess a formula that happened to > generate the decimal digits of pi, without more knowledge than a part > of the sequence? I don't believe it is possible to quantify such a > probability, but I would expect it to be very low. I'm thinking of the kind of software that looks for patterns in something, such as compression utilities. A compression utility basically converts a long string of numbers into a much shorter string that can be expanded by the decompression utility to recover the original pattern. If you look at the algorithms such code uses, you realize that they do not attempt to recreate the process that originally generated the long string, they just, in effect, characterize the resulting sequence.
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2026-02-21 11:09 +0100 |
| Message-ID | <10nc088$15gea$1@dont-email.me> |
| In reply to | #34102 |
On 20/02/2026 22:01, James Kuyper wrote:
> On 2026-02-19 14:47, David Brown wrote:
> ...
>> How likely is it that someone would guess a formula that happened to
>> generate the decimal digits of pi, without more knowledge than a part
>> of the sequence? I don't believe it is possible to quantify such a
>> probability, but I would expect it to be very low.
>
> I'm thinking of the kind of software that looks for patterns in
> something, such as compression utilities. A compression utility
> basically converts a long string of numbers into a much shorter string
> that can be expanded by the decompression utility to recover the
> original pattern. If you look at the algorithms such code uses, you
> realize that they do not attempt to recreate the process that originally
> generated the long string, they just, in effect, characterize the
> resulting sequence.
>
One of the characteristics of a good PRNG is that its unpredictability and its statistical properties (typically they aim to be like white noise, but other distributions are sometimes useful) make its output incompressible with generic algorithms. Since the PRNG's sequence is generated from an algorithm, it is of course always possible to re-create that algorithm as a "compressed" form of the sequence. But generic algorithms will never manage it.
Indeed, you can /define/ a random sequence as a series x(i) such that for any given compression algorithm Z, there is an integer n such that the Z-compressed version of x is always bigger than the original x for a sequence length n or more. And for any given compression algorithm Z, you can find a PRNG algorithm that Z cannot compress (after a possible initial segment). I haven't dug through the details, but I am confident that you could show this with a diagonalisation argument over Turing machines, or something akin to the halting problem proofs.
Or you can just try it yourself:
$ dd if=/dev/urandom of=rand bs=1M count=1
$ cp rand rand1
$ cp rand rand2
$ gzip rand1
$ bzip2 rand2
$ ls -l
-rw-rw-r-- 1 david david 1048576 Feb 20 09:12 rand
-rw-rw-r-- 1 david david 1048760 Feb 20 09:12 rand1.gz
-rw-rw-r-- 1 david david 1053414 Feb 20 09:12 rand2.bz2
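The same experiment can be sketched without the command-line tools. This is an illustrative version using only Python's standard library; `os.urandom` stands in for `/dev/urandom`, and the helper name `compressed_sizes` is an assumption made for the example.

```python
import bz2
import lzma
import os
import zlib


def compressed_sizes(data):
    """Return the compressed length of `data` under three generic compressors."""
    return {
        "zlib": len(zlib.compress(data, 9)),
        "bz2": len(bz2.compress(data, 9)),
        "lzma": len(lzma.compress(data)),
    }


if __name__ == "__main__":
    random_bytes = os.urandom(1 << 20)   # 1 MiB from the OS random source
    zero_bytes = bytes(1 << 20)          # 1 MiB of zeros, for contrast
    print("original size:", len(random_bytes))
    print("random data:  ", compressed_sizes(random_bytes))
    print("all zeros:    ", compressed_sizes(zero_bytes))
```

On random input each compressor should return something about as large as the input, typically slightly larger because of container overhead, while the all-zero input shrinks to a tiny fraction of its size.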
| From | Keith Thompson <Keith.S.Thompson+u@gmail.com> |
|---|---|
| Date | 2026-02-19 14:39 -0800 |
| Message-ID | <87ikbsgxy2.fsf@example.invalid> |
| In reply to | #34097 |
David Brown <david.brown@hesbynett.no> writes:
[...]
> As a deterministic function, a PRNG will obviously follow the pattern
> of its generating function. But the aim is to have no /discernible/
> pattern. The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 has no pattern that
> could be identified without knowledge of where they came from - and
> thus no way to predict the next number, 9, in the sequence. But there
> is a pattern there - it's the 90th - 100th digits of the decimal
> expansion of pi.
[...]
A Google search for 342117067 gives numerous hits referring to the
digits of pi.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2026-02-20 09:16 +0100 |
| Message-ID | <10n9584$7int$1@dont-email.me> |
| In reply to | #34100 |
On 19/02/2026 23:39, Keith Thompson wrote: > David Brown <david.brown@hesbynett.no> writes: > [...] >> As a deterministic function, a PRNG will obviously follow the pattern >> of its generating function. But the aim is to have no /discernible/ >> pattern. The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 has no pattern that >> could be identified without knowledge of where they came from - and >> thus no way to predict the next number, 9, in the sequence. But there >> is a pattern there - it's the 90th - 100th digits of the decimal >> expansion of pi. > [...] > > A Google search for 342117067 gives numerous hits referring to the > digits of pi. > That is using knowledge of where the sequence comes from - something else's knowledge rather than your own, but it's the same principle.
| From | Paul <nospam@needed.invalid> |
|---|---|
| Date | 2026-02-23 08:32 -0500 |
| Message-ID | <10nhktn$31293$1@dont-email.me> |
| In reply to | #34101 |
On Fri, 2/20/2026 3:16 AM, David Brown wrote:
> On 19/02/2026 23:39, Keith Thompson wrote:
>> David Brown <david.brown@hesbynett.no> writes:
>> [...]
>>> As a deterministic function, a PRNG will obviously follow the pattern
>>> of its generating function. But the aim is to have no /discernible/
>>> pattern. The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 has no pattern that
>>> could be identified without knowledge of where they came from - and
>>> thus no way to predict the next number, 9, in the sequence. But there
>>> is a pattern there - it's the 90th - 100th digits of the decimal
>>> expansion of pi.
>> [...]
>>
>> A Google search for 342117067 gives numerous hits referring to the
>> digits of pi.
>>
>
> That is using knowledge of where the sequence comes from - something else's knowledge rather than your own, but it's the same principle.
>
"In the following sequence, what is the next digit 7,7,7,7,7,7,7,7,7 ? " :-)
PI=3.
1415926535 8979323846 2643383279 5028841971 6939937510
...
7777777772 4846769425 9310468643 5260899021 0266057232 # Line 517834
I suspect seeing that, that's not good.
Using pgmp-chudnovsky.c , and dumping pi as a binary float to a file,
I get this:
(text version of PI) 100,000,022 bytes
PI-Binary.bin 41,524,121 bytes exponent and limbs
PI-Binary.bin.7Z 41,526,823 bytes 7Z Ultra compression, running on 1 core
The entropy property looks pretty good, but I doubt I would
be using that for my supply of random numbers :-)
https://gmplib.org/list-archives/gmp-discuss/2008-November/003444.html
https://stackoverflow.com/questions/3318979/how-to-serialize-the-gmp-mpf-type
https://gmplib.org/list-archives/gmp-discuss/2007-November/002981.html
gcc -DNO_FACTOR -fopenmp -Wall -O2 -o pgmp-chudnovsky pgmp-chudnovsky.c -lgmp -lm
Paul
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2026-02-23 16:05 +0100 |
| Message-ID | <10nhqc9$32dop$1@dont-email.me> |
| In reply to | #34104 |
On 23/02/2026 14:32, Paul wrote:
> On Fri, 2/20/2026 3:16 AM, David Brown wrote:
>> On 19/02/2026 23:39, Keith Thompson wrote:
>>> David Brown <david.brown@hesbynett.no> writes:
>>> [...]
>>>> As a deterministic function, a PRNG will obviously follow the pattern
>>>> of its generating function. But the aim is to have no /discernible/
>>>> pattern. The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 has no pattern that
>>>> could be identified without knowledge of where they came from - and
>>>> thus no way to predict the next number, 9, in the sequence. But there
>>>> is a pattern there - it's the 90th - 100th digits of the decimal
>>>> expansion of pi.
>>> [...]
>>>
>>> A Google search for 342117067 gives numerous hits referring to the
>>> digits of pi.
>>>
>>
>> That is using knowledge of where the sequence comes from - something else's knowledge rather than your own, but it's the same principle.
>>
>
> "In the following sequence, what is the next digit 7,7,7,7,7,7,7,7,7 ? " :-)
>
> PI=3.
>
> 1415926535 8979323846 2643383279 5028841971 6939937510
> ...
> 7777777772 4846769425 9310468643 5260899021 0266057232 # Line 517834
>
> I suspect seeing that, that's not good.
>
> Using pgmp-chudnovsky.c , and dumping pi as a binary float to a file,
> I get this:
>
> (text version of PI) 100,000,022 bytes
> PI-Binary.bin 41,524,121 bytes exponent and limbs
> PI-Binary.bin.7Z 41,526,823 bytes 7Z Ultra compression, running on 1 core
>
> The entropy property looks pretty good, but I doubt I would
> be using that for my supply of random numbers :-)
>
In a random sequence of decimal digits, you would expect a sequence of nine identical digits to turn up on average every 10 ^ 8 digits or so. You calculated 10 ^ 8 digits, so it's not surprising to see that here.
As for your compression, remember that your text file contains only the digits 0 to 9, spaces and newlines - 12 different characters in 8-bit bytes. If these were purely randomly distributed, you'd expect a best compression ratio of log(12) / log(256), or 0.448. But they are not completely random - your space characters and newlines are predictably spaced so you get marginally better compression ratios. Without spaces and newlines, you'd expect log(10) / log(256) compression - 0.415241012. What a coincidence - this matches your "exponent and limbs", and your compressor can't improve on it. (I downloaded a billion digits of pi and gzip'ed it, for a compression ratio of 0.469.)
It turns out that the pseudo-randomness here is extremely good. While it has not been proven that pi is "normal" (that is to say, its digits are all evenly distributed), it is strongly believed to be so.
Of course it's not a great source of entropy for secure random numbers, but the digits of pi form a fine pseudo-random generator function (if you don't mind the calculation time).
> https://gmplib.org/list-archives/gmp-discuss/2008-November/003444.html
> https://stackoverflow.com/questions/3318979/how-to-serialize-the-gmp-mpf-type
> https://gmplib.org/list-archives/gmp-discuss/2007-November/002981.html
>
> gcc -DNO_FACTOR -fopenmp -Wall -O2 -o pgmp-chudnovsky pgmp-chudnovsky.c -lgmp -lm
>
> Paul
>
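The arithmetic in this post is easy to reproduce. The sketch below is illustrative only; the function names `ideal_ratio` and `run_start_probability` are made up here, and the model assumes independent, uniformly distributed symbols.

```python
import math


def ideal_ratio(symbols, container=256):
    """Best achievable size ratio when each 8-bit byte holds one of `symbols`
    equally likely, independent values."""
    return math.log(symbols) / math.log(container)


def run_start_probability(run_length, base=10):
    """Chance that a given position starts `run_length` identical digits:
    `base` choices of digit, each with probability base**-run_length."""
    return base * base ** -run_length


if __name__ == "__main__":
    print(f"12 symbols: {ideal_ratio(12):.3f}")
    print(f"10 symbols: {ideal_ratio(10):.9f}")
    print(f"Paul's binary file: {41_524_121 / 100_000_000:.8f}")
    print(f"nine identical digits: about one start per "
          f"{1 / run_start_probability(9):.0e} positions")
```

The 10-symbol bound and the size of the raw binary dump agree to about six decimal places, which is the "coincidence" the post points at: the binary limbs already store the digits at close to the information-theoretic minimum.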
| From | Michael S <already5chosen@yahoo.com> |
|---|---|
| Date | 2026-02-23 19:59 +0200 |
| Message-ID | <20260223195917.000028af@yahoo.com> |
| In reply to | #34105 |
On Mon, 23 Feb 2026 16:05:45 +0100 David Brown <david.brown@hesbynett.no> wrote: > On 23/02/2026 14:32, Paul wrote: > > On Fri, 2/20/2026 3:16 AM, David Brown wrote: > >> On 19/02/2026 23:39, Keith Thompson wrote: > >>> David Brown <david.brown@hesbynett.no> writes: > >>> [...] > >>>> As a deterministic function, a PRNG will obviously follow the > >>>> pattern of its generating function. But the aim is to have no > >>>> /discernible/ pattern. The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 > >>>> has no pattern that could be identified without knowledge of > >>>> where they came from - and thus no way to predict the next > >>>> number, 9, in the sequence. But there is a pattern there - it's > >>>> the 90th - 100th digits of the decimal expansion of pi. > >>> [...] > >>> > >>> A Google search for 342117067 gives numerous hits referring to the > >>> digits of pi. > >>> > >> > >> That is using knowledge of where the sequence comes from - > >> something else's knowledge rather than your own, but it's the same > >> principle. > > > > "In the following sequence, what is the next digit > > 7,7,7,7,7,7,7,7,7 ? " :-) > > > > PI=3. > > > > 1415926535 8979323846 2643383279 5028841971 6939937510 > > ... > > 7777777772 4846769425 9310468643 5260899021 0266057232 # Line > > 517834 > > > > I suspect seeing that, that's not good. > > > > Using pgmp-chudnovsky.c , and dumping pi as a binary float to a > > file, I get this: > > > > (text version of PI) 100,000,022 bytes > > PI-Binary.bin 41,524,121 bytes exponent and limbs > > PI-Binary.bin.7Z 41,526,823 bytes 7Z Ultra > > compression, running on 1 core > > > > The entropy property looks pretty good, but I doubt I would > > be using that for my supply of random numbers :-) > > > > In a random sequence of decimal digits, you would expect a sequence > of nine identical digits to turn up on average every 10 ^ 8 digits or > so. You calculated 10 ^ 8 digits, so it's not surprising to see that > here. 
>
> As for your compression, remember that your text file contains only
> the digits 0 to 9, spaces and newlines - 12 different characters in
> 8-bit bytes. If these were purely randomly distributed, you'd expect
> a best compression ratio of log(12) / log(256), or 0.448. But they
> are not completely random - your space characters and newlines are
> predictably spaced so you get marginally better compression ratios.
> Without spaces and newlines, you'd expect log(10) / log(256)
> compression - 0.415241012. What a coincidence - this matches your
> "exponent and limbs", and your compressor can't improve on it. (I
> downloaded a billion digits of pi and gzip'ed it, for a compression
> ratio of 0.469.)
>
> It turns out that the pseudo-randomness here is extremely good.
> While it has not been proven that pi is "normal" (that is to say, its
> digits are all evenly distributed), it is strongly believed to be so.
>
> Of course it's not a great source of entropy for secure random
> numbers, but the digits of pi form a fine pseudo-random generator
> function (if you don't mind the calculation time).
>
It would be interesting to find out whether it passes L’Ecuyer's BigCrush. Of course, one would need far more than a billion decimal digits to have a chance. Something like 100B hexadecimal digits appears to be a minimum.
>
> > https://gmplib.org/list-archives/gmp-discuss/2008-November/003444.html
> > https://stackoverflow.com/questions/3318979/how-to-serialize-the-gmp-mpf-type
> > https://gmplib.org/list-archives/gmp-discuss/2007-November/002981.html
> >
> > gcc -DNO_FACTOR -fopenmp -Wall -O2 -o pgmp-chudnovsky
> > pgmp-chudnovsky.c -lgmp -lm
> >
> > Paul
> >
>
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2026-02-23 20:06 +0100 |
| Message-ID | <10ni8ff$38dk0$1@dont-email.me> |
| In reply to | #34106 |
On 23/02/2026 18:59, Michael S wrote: > On Mon, 23 Feb 2026 16:05:45 +0100 > David Brown <david.brown@hesbynett.no> wrote: > >> On 23/02/2026 14:32, Paul wrote: >>> On Fri, 2/20/2026 3:16 AM, David Brown wrote: >>>> On 19/02/2026 23:39, Keith Thompson wrote: >>>>> David Brown <david.brown@hesbynett.no> writes: >>>>> [...] >>>>>> As a deterministic function, a PRNG will obviously follow the >>>>>> pattern of its generating function. But the aim is to have no >>>>>> /discernible/ pattern. The sequence 3, 4, 2, 1, 1, 7, 0, 6, 7 >>>>>> has no pattern that could be identified without knowledge of >>>>>> where they came from - and thus no way to predict the next >>>>>> number, 9, in the sequence. But there is a pattern there - it's >>>>>> the 90th - 100th digits of the decimal expansion of pi. >>>>> [...] >>>>> >>>>> A Google search for 342117067 gives numerous hits referring to the >>>>> digits of pi. >>>>> >>>> >>>> That is using knowledge of where the sequence comes from - >>>> something else's knowledge rather than your own, but it's the same >>>> principle. >>> >>> "In the following sequence, what is the next digit >>> 7,7,7,7,7,7,7,7,7 ? " :-) >>> >>> PI=3. >>> >>> 1415926535 8979323846 2643383279 5028841971 6939937510 >>> ... >>> 7777777772 4846769425 9310468643 5260899021 0266057232 # Line >>> 517834 >>> >>> I suspect seeing that, that's not good. >>> >>> Using pgmp-chudnovsky.c , and dumping pi as a binary float to a >>> file, I get this: >>> >>> (text version of PI) 100,000,022 bytes >>> PI-Binary.bin 41,524,121 bytes exponent and limbs >>> PI-Binary.bin.7Z 41,526,823 bytes 7Z Ultra >>> compression, running on 1 core >>> >>> The entropy property looks pretty good, but I doubt I would >>> be using that for my supply of random numbers :-) >>> >> >> In a random sequence of decimal digits, you would expect a sequence >> of nine identical digits to turn up on average every 10 ^ 8 digits or >> so. 
You calculated 10 ^ 8 digits, so it's not surprising to see that >> here. >> >> As for your compression, remember that your text file contains only >> the digits 0 to 9, spaces and newlines - 12 different characters in >> 8-bit bytes. If these were purely randomly distributed, you'd expect >> a best compression ratio of log(12) / log(256), or 0.448. But they >> are not completely random - your space characters and newlines are >> predictably spaced so you get marginally better compression ratios. >> Without spaces and newlines, you'd expect log(10) / log(256) >> compression - 0.415241012. What a coincidence - this matches your >> "exponent and limbs", and your compressor can't improve on it. (I >> downloaded a billion digits of pi and gzip'ed it, for a compression >> ration of 0.469.) >> >> It turns out that the pseudo-randomness here is extremely good. >> While it has not been proven that pi is "normal" (that is to say, its >> digits are all evenly distributed), it is strongly believed to be so. >> >> Of course it's not a great source of entropy for secure random >> numbers, but the digits of pi form a fine pseudo-random generator >> function (if you don't mind the calculation time). >> > > Would be interesting to find out if it passes Big Crash of L’Ecuyer. > Of course, one would need far more than a billion of decimal digits to > have a chance. Something like 100B hexadecimal digits appears to be a > minimum. > Weirdly (at least /I/ think it is weird), it is easier to calculate hexadecimal digits of pi than decimal digits. At least, it is possible to calculate them independently, without having to calculate and remember all the previous digits. So it should be possible to split the task up and run it in parallel on multiple systems. Of course, confirming that the hexadecimal digits of pi are random enough to pass such a test does not ensure that the decimal digits would do so too. 
>> >>> https://gmplib.org/list-archives/gmp-discuss/2008-November/003444.html >>> https://stackoverflow.com/questions/3318979/how-to-serialize-the-gmp-mpf-type >>> https://gmplib.org/list-archives/gmp-discuss/2007-November/002981.html >>> >>> gcc -DNO_FACTOR -fopenmp -Wall -O2 -o pgmp-chudnovsky >>> pgmp-chudnovsky.c -lgmp -lm >>> >>> Paul >>> >> > >
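The remark about computing hexadecimal digits independently presumably refers to the Bailey-Borwein-Plouffe (BBP) formula; that attribution is an assumption, since the post does not name a method. A minimal double-precision sketch follows, with hypothetical helper names. It is suitable only for modest positions, because floating-point rounding limits how far out a single digit can be trusted.

```python
def _bbp_series(j, n):
    """Fractional part of sum over k >= 0 of 16**(n - k) / (8*k + j)."""
    s = 0.0
    for k in range(n + 1):
        denom = 8 * k + j
        # Modular exponentiation keeps the numerator small for k <= n.
        s = (s + pow(16, n - k, denom) / denom) % 1.0
    k = n + 1
    while True:
        term = 16.0 ** (n - k) / (8 * k + j)
        if term < 1e-17:          # remaining terms are below double precision
            break
        s += term
        k += 1
    return s % 1.0


def pi_hex_digit(n):
    """Hex digit of pi at 0-based position n after the point, computed
    without any of the earlier digits."""
    x = (4 * _bbp_series(1, n) - 2 * _bbp_series(4, n)
         - _bbp_series(5, n) - _bbp_series(6, n)) % 1.0
    return "0123456789abcdef"[int(x * 16)]


if __name__ == "__main__":
    print("".join(pi_hex_digit(n) for n in range(8)))
```

Because each call depends only on `n`, different ranges of positions could be handed to different machines, which is the parallelism the post describes.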
| From | Paul <nospam@needed.invalid> |
|---|---|
| Date | 2026-02-23 15:24 -0500 |
| Message-ID | <10nid1m$3a655$1@dont-email.me> |
| In reply to | #34107 |
On Mon, 2/23/2026 2:06 PM, David Brown wrote:
>>>> (text version of PI) 100,000,022 bytes
>>>> PI-Binary.bin 41,524,121 bytes exponent and limbs
>
> Weirdly (at least /I/ think it is weird), it is easier to calculate hexadecimal digits of pi than decimal digits.
I computed the decimal representation and the hex representation
(by dumping the exponent and limbs), in the same run.
/* Dump an mpf_t as a raw exponent followed by its limbs (relies on GMP internals). */
int mpf_out_raw (FILE *f, mpf_t X) {
    int expt; mpz_t Z; size_t nz;
    expt = X->_mp_exp;                 /* exponent, counted in limbs */
    fwrite(&expt, sizeof(int), 1, f);
    nz = X->_mp_size;                  /* limbs in use (X is positive here) */
    /* Alias X's limb array as an mpz_t so mpz_out_raw() can write it. */
    Z->_mp_alloc = nz;
    Z->_mp_size = nz;
    Z->_mp_d = X->_mp_d;
    return (mpz_out_raw(f, Z) + sizeof(int));
}
And that's called this way.
    /* Open the destination file in binary write mode */
    FILE *destination = fopen("PI-Binary.bin", "wb");
    if (!destination) {
        perror("Error opening PI-Binary.bin file");
    } else {
        mpf_out_raw(destination, qi); /* qi happens to hold 100 million digits of PI */
        fflush(destination);
        fclose(destination);
    }
You can do them in the same run.
The 7,7,7,7,7,7,7,7,2 sequence was detected in a 32 million digit run
of SuperPi 1.5 XS. The 100 million digit sequence is too large
for SuperPI, and pgmp-chudnovsky.c (with OpenMP) was
used for that, with a little extra code thrown in so I could get
the floating point storage in raw format. It takes a bit more
than one minute, to generate 100 million digits (16 cores). The
compression attempt was done on a single core, to ensure the best
attempt at compression with 7ZIP.
The order of the PI method O() used is covered here.
https://en.wikipedia.org/wiki/Chudnovsky_algorithm
Paul
| From | Axel Reichert <mail@axel-reichert.de> |
|---|---|
| Date | 2026-02-24 07:08 +0100 |
| Message-ID | <87342qlll7.fsf@axel-reichert.de> |
| In reply to | #34107 |
David Brown <david.brown@hesbynett.no> writes: > Of course, confirming that the hexadecimal digits of pi are random > enough to pass such a test does not ensure that the decimal digits > would do so too. I was puzzled by the "Of course": To me, this is not intuitively clear. Is there any easy (not too technical) way to "see this"/make it plausible? My gut feeling (wrongly?) said that the base should not affect the randomness of a numerical pattern. "Of course" I am aware (and taught to dozens of numerical beginners) that, say, 0.5 in base 10 has a non terminating representation in base 2, but "random" is neither representation. Pointers or simple counter-examples highly welcome! Axel
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2026-02-24 10:24 +0100 |
| Message-ID | <10njqoi$3n9gd$1@dont-email.me> |
| In reply to | #34109 |
On 24/02/2026 07:08, Axel Reichert wrote:
> David Brown <david.brown@hesbynett.no> writes:
>
>> Of course, confirming that the hexadecimal digits of pi are random
>> enough to pass such a test does not ensure that the decimal digits
>> would do so too.
>
> I was puzzled by the "Of course": To me, this is not intuitively clear.
> Is there any easy (not too technical) way to "see this"/make it
> plausible? My gut feeling (wrongly?) said that the base should not
> affect the randomness of a numerical pattern. "Of course" I am aware
> (and taught to dozens of numerical beginners) that, say, 0.5 in base 10
> has a non terminating representation in base 2, but "random" is neither
> representation.
>
> Pointers or simple counter-examples highly welcome!
>
> Axel

Numbers can be normal in some bases and not in others. This is easy to see if we pick related bases, such as base 2 and base 16. For example, let x be 1/3. Then x is 0.0101010101... in base 2. That is clearly normal in base 2. But in base 16, it is 0.55555555..., which is clearly very far from normal.

I think (but I am not sure) that if a number is normal in base B, then it will be normal in any other bases co-prime to B. If the bases are not co-prime, then things are not as clear (as shown by my simple example).

Almost all (in the technical mathematical sense) real numbers are normal in all bases. And lots of numbers (including pi) are believed to be normal, and have been checked to various lengths in many bases. But it is extremely difficult to prove that any given number is normal, unless it can be seen from its construction.

Being normal in a base is not sufficient to have the digits form a good pseudo-random sequence, but it is a necessary condition for a uniform distribution random sequence.

(I know you set the follow-ups to exclude comp.lang.c, since it is off-topic in that group, but I added it again as I don't follow sci.math.num-analysis, so I would not see any replies.

People have different opinions on the pros and cons of limiting follow-up groups. My opinion is that it is better to leave all groups there as long as there are people from different groups in the discussion, even if it is moving off-topic for a group, because limiting follow-up groups can lead to fragmentation. It is better for comp.lang.c to have one off-topic thread than to have multiple threads that are part of the same discussion but appear separately as groups are added and removed. If the discussion goes on for long, and becomes dominated by s.m.n regulars and of no interest to c.l.c regulars, then it becomes time to move it over to just that one group. Others can have different opinions on such matters.)
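The 1/3 example earlier in this post can be checked with a few lines of Python. This is only a minimal sketch: the helper names `digits` and `digit_frequencies` are made up for illustration and do not come from the thread. It counts single-digit frequencies, which is the weaker property usually called simple normality. Full normality also requires every block of digits to occur with the expected frequency, and a repeating expansion such as 0.0101... cannot satisfy that (the block 00 never occurs), so the sketch is best read as showing that the evenness of digit frequencies depends on the base.

```python
from collections import Counter
from fractions import Fraction


def digits(x, base, count):
    """Return the first `count` digits of the fractional part of x in `base`."""
    x = x - (x.numerator // x.denominator)  # keep only the fractional part
    out = []
    for _ in range(count):
        x *= base
        d = x.numerator // x.denominator  # integer part is the next digit
        out.append(d)
        x -= d
    return out


def digit_frequencies(x, base, count):
    """Relative frequency of each digit value among the first `count` digits."""
    seen = Counter(digits(x, base, count))
    return {d: seen[d] / count for d in range(base)}


third = Fraction(1, 3)
base2 = digits(third, 2, 16)    # alternates 0, 1, 0, 1, ...
base16 = digits(third, 16, 4)   # every digit is 5

freq2 = digit_frequencies(third, 2, 1000)    # 0 and 1 each half of the time
freq16 = digit_frequencies(third, 16, 1000)  # digit 5 all of the time

# Adjacent pairs of binary digits: only (0, 1) and (1, 0) ever occur.
bits = digits(third, 2, 1000)
pairs = Counter(zip(bits, bits[1:]))
```

Exact `Fraction` arithmetic is used so that the digits are not disturbed by floating-point rounding. Looking at `pairs` shows why equal counts of 0 and 1 alone say little about randomness: the pairs (0, 0) and (1, 1) are absent.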
| From | Axel Reichert <mail@axel-reichert.de> |
|---|---|
| Date | 2026-02-26 19:13 +0100 |
| Message-ID | <87342njru2.fsf@axel-reichert.de> |
| In reply to | #34110 |
David Brown <david.brown@hesbynett.no> writes:

> Numbers can be normal in some bases and not in others. This is easy to
> see if we pick related bases, such as base 2 and base 16. For example,
> let x be 1/3. Then x is 0.0101010101... in base 2. That is clearly
> normal in base 2. But in base 16, it is 0.55555555..., which is
> clearly very far from normal.

I had to look up "normal" on Wikipedia and was in for a delightful read. Thanks for the pointer and the nice explanation!

Best regards

Axel
| From | Michael S <already5chosen@yahoo.com> |
|---|---|
| Date | 2026-02-24 18:36 +0200 |
| Message-ID | <20260224183648.00005218@yahoo.com> |
| In reply to | #34109 |
On Tue, 24 Feb 2026 07:08:52 +0100
Axel Reichert <mail@axel-reichert.de> wrote:

> "Of course" I am aware
> (and taught to dozens of numerical beginners) that, say, 0.5 in base
> 10 has a non terminating representation in base 2, but "random" is
> neither representation.

???
That's true for 0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8 and 0.9. But 0.5 is represented on base 2 just fine.
| From | Axel Reichert <mail@axel-reichert.de> |
|---|---|
| Date | 2026-02-24 20:00 +0100 |
| Message-ID | <87o6lej7bj.fsf@axel-reichert.de> |
| In reply to | #34111 |
Michael S <already5chosen@yahoo.com> writes:

> On Tue, 24 Feb 2026 07:08:52 +0100
> Axel Reichert <mail@axel-reichert.de> wrote:
>
>> "Of course" I am aware
>> (and taught to dozens of numerical beginners) that, say, 0.5 in base
>> 10 has a non terminating representation in base 2, but "random" is
>> neither representation.
>
> ???
> That's true for 0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8 and 0.9. But 0.5 is
> represented on base 2 just fine.

Sorry, mixed it up (now, not when teaching ...). I meant 0.1, not 0.5. Thanks for the heads up.

Axel
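For anyone wanting to see the 0.1 versus 0.5 distinction concretely, here is a minimal Python sketch. It relies on the standard fact that a fraction in lowest terms has a finite expansion in a base exactly when every prime factor of its denominator divides that base. The function names `terminates` and `expansion` are made up for illustration, and `expansion` assumes a base of at most 10 so that each digit is a single character.

```python
from fractions import Fraction
from math import gcd


def terminates(x, base):
    """True if the Fraction x has a finite expansion in `base`."""
    q = x.denominator  # Fraction always stores the value in lowest terms
    g = gcd(q, base)
    while g > 1:
        # Strip from q every factor it shares with the base.
        while q % g == 0:
            q //= g
        g = gcd(q, base)
    return q == 1  # nothing left over means the expansion is finite


def expansion(x, base, count):
    """First `count` fractional digits of x in `base` (base <= 10 assumed)."""
    frac = x - (x.numerator // x.denominator)
    out = []
    for _ in range(count):
        frac *= base
        d = frac.numerator // frac.denominator
        out.append(str(d))
        frac -= d
    return "0." + "".join(out)


tenth = Fraction(1, 10)
half = Fraction(1, 2)

tenth_is_finite_in_binary = terminates(tenth, 2)  # False: 10 has the factor 5
half_is_finite_in_binary = terminates(half, 2)    # True: 2 divides 2

tenth_binary = expansion(tenth, 2, 12)  # repeats the block 0011 after "0.0"
half_binary = expansion(half, 2, 4)     # "0.1000"
```

The same test explains the list given above: 0.2, 0.3, 0.4, 0.6, 0.7, 0.8 and 0.9 all reduce to fractions whose denominator is 5 or 10, so none of them terminates in base 2, while 0.5 reduces to 1/2 and does.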
| From | Tristan Wibberley <tristan.wibberley+netnews2@alumni.manchester.ac.uk> |
|---|---|
| Date | 2026-02-24 18:00 +0000 |
| Message-ID | <10nkovf$2ocp$1@dont-email.me> |
| In reply to | #34109 |
On 24/02/2026 06:08, Axel Reichert wrote:
> David Brown <david.brown@hesbynett.no> writes:
>
>> Of course, confirming that the hexadecimal digits of pi are random
>> enough to pass such a test does not ensure that the decimal digits
>> would do so too.
>
> I was puzzled by the "Of course": To me, this is not intuitively clear.

That the opposite is not intuitively clear is sufficient to say that the confirmation of the fact for hexadecimal is not a confirmation of the fact for decimal.

--
Tristan Wibberley

The message body is Copyright (C) 2026 Tristan Wibberley except citations and quotations noted. All Rights Reserved except that you may, of course, cite it academically giving credit to me, distribute it verbatim as part of a usenet system or its archives, and use it to promote my greatness and general superiority without misrepresentation of my opinions other than my opinion of my greatness and general superiority which you _may_ misrepresent. You definitely MAY NOT train any production AI system with it but you may train experimental AI that will only be used for evaluation of the AI methods it implements.