
[Matplotlib] Ploting an exponential distribution frequency curve

Started by: Mario Figueiredo <marfig@gmail.com>
First post: 2015-04-25 23:33 +0100
Last post: 2015-04-26 20:57 +0100
Articles: 11, 5 participants



Contents

  [Matplotlib] Ploting an exponential distribution frequency curve Mario Figueiredo <marfig@gmail.com> - 2015-04-25 23:33 +0100
    Re: [Matplotlib] Ploting an exponential distribution frequency curve Mario Figueiredo <marfig@gmail.com> - 2015-04-25 23:36 +0100
      Re: [Matplotlib] Ploting an exponential distribution frequency curve Mario Figueiredo <marfig@gmail.com> - 2015-04-26 00:15 +0100
        Re: [Matplotlib] Ploting an exponential distribution frequency curve John Ladasky <john_ladasky@sbcglobal.net> - 2015-04-25 22:08 -0700
        Re: [Matplotlib] Ploting an exponential distribution frequency curve Dave Farrance <DaveFarrance@OMiTTHiSyahooANDTHiS.co.uk> - 2015-04-26 11:17 +0100
          Re: [Matplotlib] Ploting an exponential distribution frequency curve Mario Figueiredo <marfig@gmail.com> - 2015-04-26 13:32 +0100
    Re: [Matplotlib] Ploting an exponential distribution frequency curve Denis McMahon <denismfmcmahon@gmail.com> - 2015-04-25 23:12 +0000
      Re: [Matplotlib] Ploting an exponential distribution frequency curve Mario Figueiredo <marfig@gmail.com> - 2015-04-26 00:35 +0100
    Re: [Matplotlib] Ploting an exponential distribution frequency curve Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-04-26 01:31 +0100
    Re: [Matplotlib] Ploting an exponential distribution frequency curve Denis McMahon <denismfmcmahon@gmail.com> - 2015-04-26 18:55 +0000
      Re: [Matplotlib] Ploting an exponential distribution frequency curve Mario Figueiredo <marfig@gmail.com> - 2015-04-26 20:57 +0100

#89399 — [Matplotlib] Ploting an exponential distribution frequency curve

From: Mario Figueiredo <marfig@gmail.com>
Date: 2015-04-25 23:33 +0100
Subject: [Matplotlib] Ploting an exponential distribution frequency curve
Message-ID: <bi4ojap79mlcjq05skiar1ftrfh42j3efh@4ax.com>
I'm trying to plot the curve of an exponential distribution without
much success. I feel I'm missing something very basic, but I just can't
figure it out after numerous tries, so I'm turning to you.

This is the function generating the frequency of individual outcomes:

    import decimal
    from random import expovariate
    from collections import defaultdict

    decimal.getcontext().prec = 4
    Dec = decimal.Decimal

    samples = 100000   # 100,000

    def generate(lambd):
        res = defaultdict(int)
        for _ in range(samples):
            res[Dec(expovariate(lambd)).quantize(Dec('0.01'))] += 1
        return res

Trying to plot this data into a frequency curve is proving too
challenging and I just can't understand why.

    plot(list(results.keys()), list(results.values()))

This results in a strange line graph where there is the outline of an
exponential curve but the line crisscrosses all over the place. I
can't understand why I am getting this graph and not just the
smooth line I can infer from looking at the hard data.



#89400

From: Mario Figueiredo <marfig@gmail.com>
Date: 2015-04-25 23:36 +0100
Message-ID: <sj5ojalp0ilnnip6914dp52ok44pj848ql@4ax.com>
In reply to: #89399
On Sat, 25 Apr 2015 23:33:10 +0100, Mario Figueiredo
<marfig@gmail.com> wrote:

>
>Trying to plot this data into a frequency curve is proving too
>challenging and I just can't understand why.
>
>    plot(list(results.keys()), list(results.values()))
>

The above should read:

    results = generate(1)
    plot(list(results.keys()), list(results.values()))



#89403

From: Mario Figueiredo <marfig@gmail.com>
Date: 2015-04-26 00:15 +0100
Message-ID: <4m7oja1sgl70l3vfiftb3r8kl74kcdf7b4@4ax.com>
In reply to: #89400
Ok. Ermm, it seems I needed to ask to finally have an epiphany. The
problem is that defaultdict is unordered, so plot() was joining the
points in whatever order the keys came out. Once I get the data
ordered, I can finally plot the curve. Although this presents another
problem...

import decimal
from random import expovariate
from collections import defaultdict

decimal.getcontext().prec = 4
Dec = decimal.Decimal

samples = 100000   # 100,000

def generate(lambd):
    res = defaultdict(int)
    for _ in range(samples):
        res[Dec(expovariate(lambd)).quantize(Dec('0.01'))] += 1
    return sorted(res.items())

results = generate(1)
x, y = zip(*results)
plot(x, y)

This works as intended, but it plots a jagged curve due to the small
discrepancies that are normal in random number generation.

Other than replacing the random module with the probability density
function for the exponential distribution, do you have a suggestion of
how I could smooth the curve?
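
Part of the jaggedness comes from the bin width: with 100,000 samples and 0.01-wide bins, even the busiest bin holds only about a thousand counts, and the tail bins hold a handful. A minimal sketch, assuming plain floats instead of Decimal and a hypothetical helper named binned_density (neither is from the posts above), shows how wider bins give a steadier curve that can be compared against the density lambd * exp(-lambd * x):

```python
import math
import random
from collections import Counter

def binned_density(lambd, samples=100_000, width=0.1, seed=None):
    """Return sorted (bin_centre, density) pairs for exponential samples.

    Hypothetical helper, not from the thread: floats replace Decimal and
    `width` replaces the fixed 0.01 quantisation step.
    """
    rng = random.Random(seed)
    # int() truncates, so a sample v falls in bin number int(v / width)
    counts = Counter(int(rng.expovariate(lambd) / width) for _ in range(samples))
    # Dividing by samples * width turns raw counts into a density estimate
    return [((k + 0.5) * width, n / (samples * width))
            for k, n in sorted(counts.items())]

def exp_pdf(x, lambd):
    """Exponential probability density function."""
    return lambd * math.exp(-lambd * x)

points = binned_density(1.0, seed=42)
x, y = zip(*points)   # already sorted by x, ready for plot(x, y)
```

The trade-off is resolution: wider bins average away noise but also blur any fine detail in the curve.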



#89412

From: John Ladasky <john_ladasky@sbcglobal.net>
Date: 2015-04-25 22:08 -0700
Message-ID: <5b6e36b5-e58c-474b-b619-2bbabbc6c33a@googlegroups.com>
In reply to: #89403
On Saturday, April 25, 2015 at 4:16:04 PM UTC-7, Mario Figueiredo wrote:
[snip]
> This works as intended. But plots a jagged curve due to the small
> discrepancies normal of a random number generation.
> 
> Other than replacing the random module with the probability density
> function for the exponential distribution, do you have a suggestion of
> how I could smooth the curve?

Since you have a finite data set which only approximates the exponential distribution, a visual representation which connects the dots with line segments will definitely emphasize the noise in your data.

matplotlib.pyplot has another useful function, scatter().  scatter() does not draw lines between adjacent values.  Start there.  See how your eye likes that.

If you really want to show a curve, and you do not want to calculate the limiting (infinite data samples) exponential variate on which your sample would eventually converge, you could construct moving averages using a sliding-window sample of the points (ordered by x value, of course) and then connect those averages.
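
The sliding-window idea can be sketched in plain Python. This is only an illustration under assumptions: the function name moving_average and the choice to shrink the window near the ends (rather than pad) are not from the post.

```python
def moving_average(values, window=10):
    """Centred moving average over a sequence already ordered by x.

    The window covers up to window // 2 points on each side of the
    current point and shrinks near the ends instead of padding.
    """
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        out.append(sum(chunk) / len(chunk))
    return out

# Usage with the sorted (x, y) data from the earlier post:
#     y_smooth = moving_average(list(y), 10)
#     plot(x, y_smooth)
```

Because each average uses only the points that actually exist, the first and last values are noisier than the middle ones, but they are not dragged toward zero.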



#89413

From: Dave Farrance <DaveFarrance@OMiTTHiSyahooANDTHiS.co.uk>
Date: 2015-04-26 11:17 +0100
Message-ID: <g8epjappuvj9e6n65fbv9apg74ta1mp7pb@4ax.com>
In reply to: #89403
Mario Figueiredo <marfig@gmail.com> wrote:

>Other than replacing the random module with the probability density
>function for the exponential distribution, do you have a suggestion of
>how I could smooth the curve?

Moving average. Try:

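import numpy

def movingaverage(interval, window_size):
    # Box filter: window_size equal weights that sum to 1
    window = numpy.ones(int(window_size)) / float(window_size)
    # 'same' keeps the output the same length as the input
    return numpy.convolve(interval, window, 'same')

y_av = movingaverage(y, 10)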

Note that you'd get problems at the start and end of the curve if it's
non-zero there, because 'same' mode treats the input as zero beyond its
ends. That is difficult to avoid with noise-reduction.
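
The edge problem can be seen on a flat input. The sketch below assumes NumPy is installed and adds a mode parameter that is not in the function above; 'valid' keeps only positions where the window fully overlaps the data, at the cost of a shorter result.

```python
import numpy as np

def moving_average(values, window_size, mode="same"):
    """Box-filter moving average; `mode` is passed straight to np.convolve."""
    window = np.ones(int(window_size)) / float(window_size)
    return np.convolve(values, window, mode)

flat = np.ones(20)

# Same length as the input, but the ends are averaged with implicit zeros
same = moving_average(flat, 10, "same")

# Only full-overlap positions: len(flat) - 10 + 1 values, no edge sag
valid = moving_average(flat, 10, "valid")
```

With 'valid' the x values need trimming to match, so whether that is preferable depends on how much the ends of the curve matter.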



#89414

From: Mario Figueiredo <marfig@gmail.com>
Date: 2015-04-26 13:32 +0100
Message-ID: <k6mpjat2678it1dcbjfnrhn485q4da3ftj@4ax.com>
In reply to: #89413
On Sun, 26 Apr 2015 11:17:07 +0100, Dave Farrance
<DaveFarrance@OMiTTHiSyahooANDTHiS.co.uk> wrote:

>
>Moving average. Try:
>
>def movingaverage(interval, window_size):
>    window= numpy.ones(int(window_size))/float(window_size)
>    return numpy.convolve(interval, window, 'same')
>
>y_av = movingaverage(y,10)
>
>Note that you'd get problems at the start and end of the curve if it's
>non-zero there, which is difficult to avoid with noise-reduction.

Thanks for the suggestion, Dave.

I'm also considering the interpolation sub-package in SciPy.
http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html

I was a bit surprised that matplotlib didn't offer me interpolation
out-of-the-box like gnuplot's `smooth csplines` option.
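
A spline that interpolates exactly passes through every noisy point, so for smoothing, a fitted spline with a non-zero smoothing factor may be closer to what is wanted. The following is a rough sketch, assuming NumPy and SciPy are installed; the synthetic data, the noise level, and the choice of s are illustrative assumptions, not values from this thread.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Synthetic stand-in for the sampled curve: exp(-x) plus Gaussian noise
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 200)
truth = np.exp(-x)
sigma = 0.05
noisy = truth + rng.normal(0.0, sigma, x.size)

# s is the smoothing factor (a bound on the sum of squared residuals);
# n * sigma**2 is used here only as a starting guess
spline = UnivariateSpline(x, noisy, s=x.size * sigma ** 2)
smooth = spline(x)   # evaluate the fitted spline at the original x values
```

x must be increasing, which the sorted (x, y) pairs from the earlier post already satisfy once converted to floats.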



#89402

From: Denis McMahon <denismfmcmahon@gmail.com>
Date: 2015-04-25 23:12 +0000
Message-ID: <mhh70j$e4v$1@dont-email.me>
In reply to: #89399
On Sat, 25 Apr 2015 23:33:10 +0100, Mario Figueiredo wrote:

>     plot(list(results.keys()), list(results.values()))

matplotlib supports at least (from searching the website) 5 plot methods.

Which one are you using?

My first guess would be that the data format that plot expects isn't the 
format it's getting, as you appear to be passing a list of x values and a 
list of y values, is it possible that it expects a list of value pairs?

Sorry, but given a choice of 5 plot methods in matplotlib and no hint as 
to which one you're calling, I'm not inclined to go and look at the 
arguments of all of them.

One suggestion I would make, though:

try plot([0,1,2,3,4,5,6,7,8,9,10],[0,1,2,3,4,5,6,7,8,9,10])

and see if you get a straight line running through the co-ord pairs:

0,0; 1,1; 2,2; 3,3; 4,4; 5,5; 6,6; 7,7; 8,8; 9,9 and 10,10

If not, then try:

plot(zip([0,1,2,3,4,5,6,7,8,9,10],[0,1,2,3,4,5,6,7,8,9,10]))

And see what that produces.

If the second plot produces the line I described, try:

plot(zip(list(results.keys()), list(results.values())))

in your code.

-- 
Denis McMahon, denismfmcmahon@gmail.com



#89404

From: Mario Figueiredo <marfig@gmail.com>
Date: 2015-04-26 00:35 +0100
Message-ID: <1l8ojat2f8ajclvbfh738l4k2rqn37igbl@4ax.com>
In reply to: #89402
On Sat, 25 Apr 2015 23:12:19 +0000 (UTC), Denis McMahon
<denismfmcmahon@gmail.com> wrote:

>Sorry, but given a choice of 5 plot methods in matplotlib and no hint as 
>to which one you're calling, I'm not inclined to go and look at the 
>arguments of all of them.

There's actually around 8 I think. The individual graph types are
provided by the matplotlib.pyplot module. The plot() function you see in
my code draws a line graph, bar() a bar graph, hist() a
histogram, and so forth.

That was my fault. I should have mentioned I was using ipython
notebook with the pylab inline option, or changed the code to show the
import mechanism for the plot() function.
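
For readers outside a pylab session, a self-contained version might look like the sketch below. It is an illustration under assumptions: round() on floats stands in for the Decimal quantisation, the helper names frequencies and draw are hypothetical, and the Agg backend and output filename are arbitrary choices for running without a display.

```python
from collections import Counter
from random import Random

def frequencies(lambd, samples=100_000, seed=None):
    """Count exponential samples rounded to 2 decimals, sorted by value."""
    rng = Random(seed)
    counts = Counter(round(rng.expovariate(lambd), 2) for _ in range(samples))
    return sorted(counts.items())

def draw(pairs, filename="expo.png"):
    """Plot (value, count) pairs as a line graph and save it to a file."""
    # Imported here so the counting code above runs without matplotlib
    import matplotlib
    matplotlib.use("Agg")            # assumption: no interactive display
    import matplotlib.pyplot as plt  # this is where plot() comes from

    x, y = zip(*pairs)
    fig, ax = plt.subplots()
    ax.plot(x, y)
    ax.set_xlabel("value")
    ax.set_ylabel("count")
    fig.savefig(filename)

# Usage:
#     draw(frequencies(1.0))
```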



#89407

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2015-04-26 01:31 +0100
Message-ID: <mailman.21.1430008291.3680.python-list@python.org>
In reply to: #89399
On 25/04/2015 23:33, Mario Figueiredo wrote:
> I'm trying to plot the curve of an exponential distribution without
> much success. I'm missing something very basic I feel, but just can't
> figure it out after numerous tries, so I'm turning out to you.
>
> This is the function generating the frequency of individual outcomes:
>
>      import decimal
>      from random import expovariate
>      from collections import defaultdict
>
>      decimal.getcontext().prec = 4
>      Dec = decimal.Decimal
>
>      samples = 100000   # 100,000
>
>      def generate(lambd):
>          res = defaultdict(int)
>          for _ in range(samples):
>              res[Dec(expovariate(lambd)).quantize(Dec('0.01'))] += 1
>          return res
>
> Trying to plot this data into a frequency curve is proving too
> challenging and I just can't understand why.
>
>      plot(list(results.keys()), list(results.values()))
>
> This results in strange line graph where there is the outline of an
> exponential curve but the line crisscrosses all over the place. I
> can't understand why I am getting this graph result and not just the
> smooth line I can infer from looking at the hard data.
>

Anything that can help you here 
http://matplotlib.org/gallery.html#statistics ?

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



#89427

From: Denis McMahon <denismfmcmahon@gmail.com>
Date: 2015-04-26 18:55 +0000
Message-ID: <mhjcav$4o4$3@dont-email.me>
In reply to: #89399
On Sat, 25 Apr 2015 23:33:10 +0100, Mario Figueiredo wrote:

>     plot(list(results.keys()), list(results.values()))

I found multiple plots in matplotlib. You need to specify which one 
you're using.

The first thing you need to do is create a small self contained example 
of your problem.

State the problem: Plot does not create the output you expect.

Give an example:

plot( [1,11], [5,5] )

Explain what you expect the output to be:

You expect a line to be plotted from (1,5) to (11,5)

Explain what you actually see: ???

Note that it may be possible to correlate the data values and the output 
of the simple case in a way that shows you that you have in some way 
fundamentally misunderstood how the arguments should be passed to the 
plot function. If this is the case, work out what you need to do to fix 
the simple case, and then apply the same solution to your more complex 
data set.

If, for example, you see a line from (1,11) to (5,5) instead of a line 
from (1,5) to (11,5), then it might be that you need to combine the two 
lists into a single list of co-ordinate tuples, using eg:

plot(zip(list(results.keys()), list(results.values())))

-- 
Denis McMahon, denismfmcmahon@gmail.com



#89429

From: Mario Figueiredo <marfig@gmail.com>
Date: 2015-04-26 20:57 +0100
Message-ID: <2cgqjadqlbdeiq3lm8qen1lsqb8r17sdo1@4ax.com>
In reply to: #89427
On Sun, 26 Apr 2015 18:55:27 +0000 (UTC), Denis McMahon
<denismfmcmahon@gmail.com> wrote:

>The first thing you need to do is create a small self contained example 
>of your problem.
>
>State the problem: Plot does not create the output you expect.
>
>Give an example:
>
>plot( [1,11], [5,5] )
>
>Explain what you expect the output to be:
>
>You expect a line to be plotted from (1,5) to (11,5)
>
>Explain what you actually see: ???
>

I think you missed several posts in this thread, including my reply to
your earlier post. This problem was never about anything of that
sort. It is also solved.

I suggest, since this is an NNTP group, that you use a desktop or web
newsreader that is capable of displaying threaded messages, or set your
email client to organize emails from comp.lang.python in a threaded
way. I suspect you are using a flat email list, and that may be
inadequate if you wish to participate in any discussion here.


