
Re: Testing random

References <87oaksowwg.fsf@Equus.decebal.nl> <1451048.pW9z17ilMA@PointedEars.de> <mailman.242.1433677915.13271.python-list@python.org> <3158703.Lr4HFMbMOd@PointedEars.de>
Date 2015-06-08 02:25 +1000
Subject Re: Testing random
From Chris Angelico <rosuav@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.257.1433694323.13271.python-list@python.org>



On Mon, Jun 8, 2015 at 1:51 AM, Thomas 'PointedEars' Lahn
<PointedEars@web.de> wrote:
> Chris Angelico wrote:
>
>> On Sun, Jun 7, 2015 at 8:40 PM, Thomas 'PointedEars' Lahn
>> <PointedEars@web.de> wrote:
>>> Cecil Westerhof wrote:
>>>> I wrote a very simple function to test random:
>>>>     def test_random(length, multiplier = 10000):
>>>>         number_list = length * [0]
>>>>         for i in range(length * multiplier):
>>>>             number_list[random.randint(0, length - 1)] += 1
>>>>         minimum = min(number_list)
>>>>         maximum = max(number_list)
>>>>         return (minimum, maximum, minimum / maximum)
>>>
>>> As there is no guarantee that every number will occur randomly, using a
>>> dictionary at first should be more efficient than a list:
>>
>> Hmm, I'm not sure that's actually so. His code is aiming to get
>> 'multiplier' values in each box; for any serious multiplier (he starts
>> with 10 in the main code), you can be fairly confident that every
>> number will come up at least once.
>
> The wording shows a common misconception: that random distribution would
> mean that it is guaranteed or more probable that every element of the set
> will occur at least once.  It is another common misconception that
> increasing the number of trials would increase the probability of that
> happening.  But that is not so.

The greater the multiplier, the lower the chance that any element will
have no hits. With uniform distribution, a length of 10, and a
multiplier of 10, there are 100 attempts with a 90% chance each that
any given number will be avoided - which works out to 0.9**100 ==
2.6561398887587544e-05 probability that (say) there'll be no 4s in the
list. Take the complement and raise it to the 10th power (treating the
buckets as independent, which is close enough here), and you get a
probability of 0.9997344177567317 that there'll be at least one in
every bucket. Raise the multiplier to 100, and the probability of a
single whiff becomes 0.9**1000 == 1.7478712517226947e-46; take the
complement and raise to the tenth power, and it becomes closer to
certainty than IEEE double precision can represent. Raise the length
to 100 and the chance of full coverage gets lower again; with
multiplier 10, probability 0.9956920878572284 of having one in every
bucket (that's a bit under half a percent chance of a zero anywhere),
and at multiplier 100, it still rounds to certainty.
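
The arithmetic above can be reproduced with a few lines of Python 3.
This is only a sketch: the function names are made up for illustration,
and it treats the buckets as independent, which is an approximation
rather than the exact coverage probability.

```python
def p_bucket_empty(length, multiplier):
    """Chance that one particular bucket gets no hits at all."""
    trials = length * multiplier
    # Each trial misses a given bucket with probability 1 - 1/length.
    return (1 - 1 / length) ** trials


def p_all_buckets_hit(length, multiplier):
    """Approximate chance that every bucket gets at least one hit.

    Assumes the buckets are independent (an approximation).
    """
    return (1 - p_bucket_empty(length, multiplier)) ** length


for length, multiplier in [(10, 10), (10, 100), (100, 10), (100, 100)]:
    print(length, multiplier,
          p_bucket_empty(length, multiplier),
          p_all_buckets_hit(length, multiplier))
```

For the cases where the single-bucket probability is around 1e-44 or
smaller, subtracting it from 1.0 gives exactly 1.0 in double precision,
which is why those rows print as certainty.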

But you'll notice that I wasn't actually talking about certainty here.
I was talking about confidence, at levels sufficient to make data-type
decisions on. Sure, there's no guarantee that every number will occur;
but if there's only about a 0.4% chance that any number at all will be
omitted, I think the list is going to work out more efficient.

You'll notice that some of the other posts have been concerned more
about correctness (for instance, using collections.Counter and then
making sure there's a zero slot for every element - otherwise empty
slots would be omitted), and they _do_ acknowledge the chance that
some bucket will end up with no hits. But with the numbers the OP
gave, I would be fully satisfied with optimizing for the case where
every bucket gets at least something.
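
For comparison, the Counter-based approach with a zero slot for every
element might look roughly like this. It is a sketch, not the code from
the other posts; count_hits and its rng parameter are hypothetical
names chosen for illustration.

```python
import random
from collections import Counter


def count_hits(length, multiplier=10000, rng=random):
    """Count how often each of `length` buckets is drawn."""
    counts = Counter(rng.randint(0, length - 1)
                     for _ in range(length * multiplier))
    # A bucket that was never drawn has no key in the Counter, but
    # looking it up returns 0, so every slot shows up in the result.
    return [counts[i] for i in range(length)]


hits = count_hits(10, 10)
print(min(hits), max(hits))
```

Reading the counts back by index is what keeps empty buckets from being
silently dropped; taking min() over the Counter's values alone would
skip them.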

ChrisA



