


Distribution

Started by: "prince.pangeni" <prince.ram85@gmail.com>
First post: 2012-03-19 21:31 -0700
Last post: 2012-03-20 20:19 +0000
Articles: 9 (7 participants)



Contents

  Distribution "prince.pangeni" <prince.ram85@gmail.com> - 2012-03-19 21:31 -0700
    Re: Distribution Peter Otten <__peter__@web.de> - 2012-03-20 10:15 +0100
    Re: Distribution Ben Finney <ben+python@benfinney.id.au> - 2012-03-20 22:21 +1100
      Re: Distribution Robert Kern <robert.kern@gmail.com> - 2012-03-20 12:00 +0000
      Re: Distribution Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2012-03-20 11:29 -0400
      Re: Distribution Laurent Claessens <moky.math@gmail.com> - 2012-03-20 18:00 +0100
      Re: Distribution Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2012-03-20 13:47 -0400
    Re: Distribution Robert Kern <robert.kern@gmail.com> - 2012-03-20 12:52 +0000
    Re: Distribution duncan smith <buzzard@urubu.freeserve.co.uk> - 2012-03-20 20:19 +0000

#21915 — Distribution

From: "prince.pangeni" <prince.ram85@gmail.com>
Date: 2012-03-19 21:31 -0700
Subject: Distribution
Message-ID: <33044d51-c481-4796-a722-406f412cf994@x7g2000pbi.googlegroups.com>
Hi all,
   I am doing a simulation project using Python. In my project, I want
to use some sort of distribution to generate requests to a server.
The requests should follow two distributions: one for the request arrival
rate (should be Poisson) and another for the request mix (i.e. out of the
total requests defined by the request arrival rate, how many requests are
of which type).
   Example: Suppose the request rate is 90 req/sec (generated using a
Poisson distribution) at time t and we have 3 types of requests (i.e.
r1, r2, r3). The request mix distribution output should be similar to:
{r1 : 50 , r2 : 30 , r3 : 10} (i.e. out of 90 requests, 50 are of r1
type, 30 are of r2 type and 10 are of r3 type).
   As I am new to the Python distribution module, I am not getting how to
code this situation. Please help me out with this.

  Thanks in advance

Prince



#21920

From: Peter Otten <__peter__@web.de>
Date: 2012-03-20 10:15 +0100
Message-ID: <mailman.823.1332234906.3037.python-list@python.org>
In reply to: #21915
prince.pangeni wrote:

> Hi all,
>    I am doing a simulation project using Python. In my project, I want
> to use some short of distribution to generate requests to a server.
> The request should have two distributions. One for request arrival
> rate (should be poisson) and another for request mix (i.e. out of the
> total requests defined in request arrival rate, how many requests are
> of which type).
>    Example: Suppose the request rate is - 90 req/sec (generated using
> poisson distribution) at time t and we have 3 types of requests (i.e.
> r1, r2, r2). The request mix distribution output should be similar to:
> {r1 : 50 , r2 : 30 , r3 : 10} (i.e. out of 90 requests - 50 are of r1
> type, 30 are of r2 type and 10 are of r3 type).
>    As I an new to python distribution module, I am not getting how to
> code this situation. Please help me out for the same.

You don't say what distribution module you're talking of, and I guess I'm 
not the only one who'd need to know that detail.

However, with sufficient resolution and duration the naive approach sketched 
below might be good enough.

# untested
import random

DURATION = 3600 # run for one hour
RATE = 90 # requests/sec
RESOLUTION = 1000 # one msec

# placeholders for the three request types
r1, r2, r3 = "r1", "r2", "r3"

requests = ([r1]*50 + [r2]*30 + [r3]*10)
time_slots = [0]*(RESOLUTION*DURATION)

# drop each request into a randomly chosen millisecond slot
for _ in range(DURATION*RATE):
    time_slots[random.randrange(len(time_slots))] += 1

# replay the slots in time order; issue_request_at() is yours to supply
for time, count in enumerate(time_slots):
    for _ in range(count):
        issue_request_at(random.choice(requests), time)



#21925

From: Ben Finney <ben+python@benfinney.id.au>
Date: 2012-03-20 22:21 +1100
Message-ID: <877gyfmsbl.fsf@benfinney.id.au>
In reply to: #21915
"prince.pangeni" <prince.ram85@gmail.com> writes:

>    I am doing a simulation project using Python. In my project, I want
> to use some short of distribution to generate requests to a server.

What is a distribution? That term already means something in Python
jargon, and it doesn't match the rest of your use case.

So what do you mean by “distribution”? Maybe we can find a less
confusing term.

-- 
 \     “I used to think that the brain was the most wonderful organ in |
  `\   my body. Then I realized who was telling me this.” —Emo Philips |
_o__)                                                                  |
Ben Finney



#21927

From: Robert Kern <robert.kern@gmail.com>
Date: 2012-03-20 12:00 +0000
Message-ID: <mailman.824.1332244868.3037.python-list@python.org>
In reply to: #21925
On 3/20/12 11:21 AM, Ben Finney wrote:
> "prince.pangeni"<prince.ram85@gmail.com>  writes:
>
>>     I am doing a simulation project using Python. In my project, I want
>> to use some short of distribution to generate requests to a server.
>
> What is a distribution? That term already means something in Python
> jargon, and it doesn't match the rest of your use case.
>
> So what do you mean by “distribution”? Maybe we can find a less
> confusing term.

Judging from the context, he means a probability distribution.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco



#21933

From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date: 2012-03-20 11:29 -0400
Message-ID: <mailman.832.1332257387.3037.python-list@python.org>
In reply to: #21925
On Tue, 20 Mar 2012 12:00:50 +0000, Robert Kern <robert.kern@gmail.com>
declaimed the following in gmane.comp.python.general:

> On 3/20/12 11:21 AM, Ben Finney wrote:
> > "prince.pangeni"<prince.ram85@gmail.com>  writes:
> >
> >>     I am doing a simulation project using Python. In my project, I want
> >> to use some short of distribution to generate requests to a server.
> >
> > What is a distribution? That term already means something in Python
> > jargon, and it doesn't match the rest of your use case.
> >
> > So what do you mean by “distribution”? Maybe we can find a less
> > confusing term.
> 
> Judging from the context, he means a probability distribution.
>
	Furthermore, he explicitly mentions Poisson later in the text.

	That should just be a case of finding the mathematical definition of
the distribution and coding something using random (since the random
module has gaussian/normal, but not Poisson -- maybe numpy/simpy?) to
obtain the time variant.
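
For reference, one way to get Poisson counts out of the standard random module alone is Knuth's multiplication method. The sketch below is only an illustration, not something posted in the thread: the name poisson_knuth and the seed are made up here, and the method only suits moderate means, since exp(-lam) underflows to zero once lam gets into the hundreds.

```python
import math
import random


def poisson_knuth(lam, rng):
    """Draw one Poisson(lam) count by multiplying uniforms (Knuth's method).

    Hypothetical helper for illustration; only sensible for moderate lam.
    """
    limit = math.exp(-lam)
    k = 0
    p = rng.random()
    # Keep multiplying uniforms until the running product drops to the limit.
    while p > limit:
        k += 1
        p *= rng.random()
    return k


rng = random.Random(42)  # seeded so the run is repeatable
samples = [poisson_knuth(90.0, rng) for _ in range(2000)]
mean = sum(samples) / len(samples)
print("mean requests per second:", mean)
```

Each call gives the number of requests to issue in one second at a mean rate of 90; the printed mean should land close to 90.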

	I suspect the real question is how to handle the weighted selection
of r1-r3. And for such a relatively short sample set (90 entries), the
chunked list shown previously, with randint to index into it is viable.

	The alternative, more generalized (in a way), would be to use a
sorted list of the probabilities (or sums)... (pseudo-code)

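import random

# weights from the original example; the strings stand in for r1..r3
numR1, numR2, numR3 = 50, 30, 10

probs = [ (numR1, "r1"),
          (numR1 + numR2, "r2"),
          (numR1 + numR2 + numR3, "r3") ]
#obviously one is unlikely to actually create constants for the numRs
#more likely to create in line
probs.sort()

# randint needs both bounds and includes them: 1..90
ran = random.randint(1, numR1 + numR2 + numR3)
for (p, r) in probs:
    if ran > p: continue
    theR = r
    break   # stop at the first bucket that contains ran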
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
        wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/



#21937

From: Laurent Claessens <moky.math@gmail.com>
Date: 2012-03-20 18:00 +0100
Message-ID: <jkadda$mmi$1@news.univ-fcomte.fr>
In reply to: #21925
On 20/03/2012 12:21, Ben Finney wrote:
> "prince.pangeni"<prince.ram85@gmail.com>  writes:
>
>>     I am doing a simulation project using Python. In my project, I want
>>  to use some short of distribution to generate requests to a server.

I guess scipy is also available in plain python (didn't check), but the 
following works with Sage :

----------------------------------------------------------------------
| Sage Version 4.8, Release Date: 2012-01-20                         |
| Type notebook() for the GUI, and license() for information.        |
----------------------------------------------------------------------
sage: from scipy import stats
sage: X=stats.poisson.rvs
sage: X(4)
5
sage: X(4)
2
sage: X(4)
3


Hope it helps
Laurent



#21938

From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date: 2012-03-20 13:47 -0400
Message-ID: <mailman.836.1332265683.3037.python-list@python.org>
In reply to: #21925
On Tue, 20 Mar 2012 11:29:37 -0400, Dennis Lee Bieber
<wlfraed@ix.netcom.com> declaimed the following in
gmane.comp.python.general:


> module has gaussian/normal, but not Poisson -- maybe numpy/simpy?) to
> obtain the time variant.
>
	Whoops -- scipy, not simpy


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
        wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/



#21929

From: Robert Kern <robert.kern@gmail.com>
Date: 2012-03-20 12:52 +0000
Message-ID: <mailman.827.1332247978.3037.python-list@python.org>
In reply to: #21915
On 3/20/12 4:31 AM, prince.pangeni wrote:
> Hi all,
>     I am doing a simulation project using Python. In my project, I want
> to use some short of distribution to generate requests to a server.
> The request should have two distributions. One for request arrival
> rate (should be poisson) and another for request mix (i.e. out of the
> total requests defined in request arrival rate, how many requests are
> of which type).
>     Example: Suppose the request rate is - 90 req/sec (generated using
> poisson distribution)

Just a note on terminology to be sure we're clear: a Poisson *distribution* 
models the number of arrivals in a given time period if the events are from a 
Poisson *process* with a given mean rate. To model the inter-event arrival 
times, you use an exponential distribution. If you want to handle events 
individually in your simulation, you will need to use the exponential 
distribution to figure out the exact times for each. If you are handling all of 
the events in each second "in bulk" without regard to the exact times or 
ordering within that second, then you can use a Poisson distribution.
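
The link between the two distributions can be checked numerically. The following is a rough sketch with made-up names and seed, not part of the original post: it generates exponential inter-arrival times, buckets the arrivals by whole second, and looks at the per-second counts. For a Poisson distribution the mean and the variance are both equal to the rate, so both figures should come out near 90.

```python
import random
from collections import Counter

rng = random.Random(2012)  # arbitrary seed, for repeatability only
rate = 90.0                # mean arrivals per second
duration = 200             # seconds simulated

# Walk along the time axis using exponential inter-arrival times
# and count how many arrivals land in each whole second.
counts = Counter()
t = rng.expovariate(rate)
while t < duration:
    counts[int(t)] += 1
    t += rng.expovariate(rate)

per_second = [counts[s] for s in range(duration)]
mean = sum(per_second) / duration
variance = sum((c - mean) ** 2 for c in per_second) / duration
print("mean:", mean, "variance:", variance)
```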

> at time t and we have 3 types of requests (i.e.
> r1, r2, r2). The request mix distribution output should be similar to:
> {r1 : 50 , r2 : 30 , r3 : 10} (i.e. out of 90 requests - 50 are of r1
> type, 30 are of r2 type and 10 are of r3 type).
>     As I an new to python distribution module, I am not getting how to
> code this situation. Please help me out for the same.

I am going to assume that you want to handle each event independently. A basic 
strategy is to keep a time variable starting at 0 and use a while loop until the 
time reaches the end of the simulation time. Increment it using a draw from the 
exponential distribution each loop. Each iteration of the loop is an event. To 
determine the kind of event, you will need to draw from a weighted discrete 
distribution. What you want to do here is to do a cumulative sum of the weights, 
draw a uniform number from 0 to the total sum, then use bisect to find the item 
that matches.

import bisect
import random


# Use a seeded PRNG for repeatability. Use the methods on the Random
# object rather than the functions in the random module.
prng = random.Random(1234567890)

avg_rate = 90.0  # reqs/sec

kind_weights = [50.0, 30.0, 10.0]
kind_cumsum = [sum(kind_weights[:i+1]) for i in range(len(kind_weights))]
kind_max = kind_cumsum[-1]

max_time = 10.0  # sec
t = 0.0  # sec
events = []  # (t, kind)
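while True:
    # Advance to the next arrival first, so no event is pinned to t = 0.
    t += prng.expovariate(avg_rate)
    if t >= max_time:
        break
    u = prng.uniform(0.0, kind_max)
    # Index of the first cumulative weight >= u, i.e. 0, 1 or 2.
    kind = bisect.bisect_left(kind_cumsum, u)
    events.append((t, kind))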
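
On Python 3.6 or later, the cumulative-sum-and-bisect step can probably be handed to random.choices, which accepts cumulative weights directly. The variant below is a sketch under that assumption; the kind labels, the tally dictionary and the loop layout are illustrative choices made here, not part of the code above.

```python
import itertools
import random

prng = random.Random(1234567890)

avg_rate = 90.0  # reqs/sec
kinds = ["r1", "r2", "r3"]          # labels are placeholders
kind_weights = [50.0, 30.0, 10.0]
cum_weights = list(itertools.accumulate(kind_weights))

max_time = 10.0  # sec
events = []      # (t, kind)
t = prng.expovariate(avg_rate)
while t < max_time:
    # choices() does the weighted lookup; k defaults to 1, so take item 0.
    kind = prng.choices(kinds, cum_weights=cum_weights)[0]
    events.append((t, kind))
    t += prng.expovariate(avg_rate)

# Count how many events of each kind were generated.
tally = {k: sum(1 for _, kind in events if kind == k) for k in kinds}
print(len(events), tally)
```

Over 10 seconds this should give roughly 900 events, split in about a 5:3:1 ratio.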


-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco



#21946

From: duncan smith <buzzard@urubu.freeserve.co.uk>
Date: 2012-03-20 20:19 +0000
Message-ID: <fD5ar.4711$K92.1917@newsfe10.ams2>
In reply to: #21915
On 20/03/12 04:31, prince.pangeni wrote:
> Hi all,
>     I am doing a simulation project using Python. In my project, I want
> to use some short of distribution to generate requests to a server.
> The request should have two distributions. One for request arrival
> rate (should be poisson) and another for request mix (i.e. out of the
> total requests defined in request arrival rate, how many requests are
> of which type).
>     Example: Suppose the request rate is - 90 req/sec (generated using
> poisson distribution) at time t and we have 3 types of requests (i.e.
> r1, r2, r2). The request mix distribution output should be similar to:
> {r1 : 50 , r2 : 30 , r3 : 10} (i.e. out of 90 requests - 50 are of r1
> type, 30 are of r2 type and 10 are of r3 type).
>     As I an new to python distribution module, I am not getting how to
> code this situation. Please help me out for the same.
>
>    Thanks in advance
>
> Prince

Robert has given you a very good answer. The easiest way is to generate 
interarrival times using an exponential distribution, then for each 
event select the type from a categorical probability mass function. 
Perhaps the easiest and most efficient approach for the latter using 
your 'mix distribution' above is to create a list containing 5 instances 
of r1, 3 of r2 and 1 of r3. Then select the type by generating a random 
index into the list. It is not an ideal solution generally, but good 
when the parameters do not change and the required list is small.
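
A minimal sketch of that list approach, assuming the 5:3:1 mix stays fixed (the names pool and pick_kind, the string labels and the seed are made up for illustration):

```python
import random

rng = random.Random(7)  # arbitrary seed, for repeatability only

# 5 instances of r1, 3 of r2 and 1 of r3, as described above.
pool = ["r1"] * 5 + ["r2"] * 3 + ["r3"] * 1


def pick_kind():
    """Select a request type by generating a random index into the list."""
    return pool[rng.randrange(len(pool))]


draws = [pick_kind() for _ in range(9000)]
share = {k: draws.count(k) / len(draws) for k in ("r1", "r2", "r3")}
print(share)
```

The shares should come out near 5/9, 3/9 and 1/9. The list grows with the precision of the weights, which is why this only suits small, fixed mixes.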

Duncan




