
Standard Deviation One-liner

Started by: Billy Mays <noway@nohow.com>
First post: 2011-06-03 13:55 -0400
Last post: 2011-06-05 12:17 -0700
Articles 6 — 5 participants



Contents

  Standard Deviation One-liner Billy Mays <noway@nohow.com> - 2011-06-03 13:55 -0400
    Re: Standard Deviation One-liner Alain Ketterlin <alain@dpt-info.u-strasbg.fr> - 2011-06-03 20:50 +0200
      Re: Standard Deviation One-liner Alain Ketterlin <alain@dpt-info.u-strasbg.fr> - 2011-06-03 21:10 +0200
    Re: Standard Deviation One-liner Raymond Hettinger <python@rcn.com> - 2011-06-03 13:09 -0700
      Re: Standard Deviation One-liner Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-06-05 17:26 +0000
        Re: Standard Deviation One-liner Ethan Furman <ethan@stoneleaf.us> - 2011-06-05 12:17 -0700

#6959 — Standard Deviation One-liner

From: Billy Mays <noway@nohow.com>
Date: 2011-06-03 13:55 -0400
Subject: Standard Deviation One-liner
Message-ID: <isb75r$80h$1@speranza.aioe.org>

I'm trying to shorten a one-liner I have for calculating the standard 
deviation of a list of numbers.  I have something so far, but I was 
wondering if it could be made any shorter (without imports).


Here's my function:

a=lambda d:(sum((x-1.*sum(d)/len(d))**2 for x in d)/(1.*(len(d)-1)))**.5


The function is invoked as follows:

 >>> a([1,2,3,4])
1.2909944487358056



#6960

From: Alain Ketterlin <alain@dpt-info.u-strasbg.fr>
Date: 2011-06-03 20:50 +0200
Message-ID: <87oc2erbpf.fsf@dpt-info.u-strasbg.fr>
In reply to: #6959

Billy Mays <noway@nohow.com> writes:

> I'm trying to shorten a one-liner I have for calculating the standard
> deviation of a list of numbers.  I have something so far, but I was
> wondering if it could be made any shorter (without imports).

> a=lambda d:(sum((x-1.*sum(d)/len(d))**2 for x in d)/(1.*(len(d)-1)))**.5

You should make it two half-liners, because this one repeatedly computes
sum(d). I would suggest:

aux = lambda s1,s2,n: (s2 - s1*s1/n)/(n-1)
sv = lambda d: aux(sum(d),sum(x*x for x in d),len(d))

(after some algebra). Completely untested, assumes data come in as
floats. You get the idea.

-- Alain.



#6962

From: Alain Ketterlin <alain@dpt-info.u-strasbg.fr>
Date: 2011-06-03 21:10 +0200
Message-ID: <87k4d2rasn.fsf@dpt-info.u-strasbg.fr>
In reply to: #6960

Alain Ketterlin <alain@dpt-info.u-strasbg.fr> writes:

> aux = lambda s1,s2,n: (s2 - s1*s1/n)/(n-1)
> sv = lambda d: aux(sum(d),sum(x*x for x in d),len(d))

Err, sorry, the final square root is missing.

-- Alain.
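
For reference, a minimal sketch of the two-lambda version with the square root restored. The names aux and sv come from the post above; the float(n) conversion is an addition, assumed here so that integer input does not trigger integer division on Python 2.

```python
# Two half-liners: sum(d) is computed once instead of once per element.
# aux takes the sum, the sum of squares, and the count.
aux = lambda s1, s2, n: ((s2 - s1 * s1 / float(n)) / (n - 1)) ** 0.5
sv = lambda d: aux(sum(d), sum(x * x for x in d), len(d))

# Same sample data as the original post; the result should be about 1.29099
print(sv([1, 2, 3, 4]))
```

This relies on the identity sum((x - mean)**2) == sum(x*x) - sum(x)**2/n, which holds exactly in real arithmetic but can lose precision in floating point when the mean is large compared with the spread.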



#6964

From: Raymond Hettinger <python@rcn.com>
Date: 2011-06-03 13:09 -0700
Message-ID: <58f86ea2-b654-4240-a5f5-c5c7e503bcf1@s41g2000prb.googlegroups.com>
In reply to: #6959

On Jun 3, 10:55 am, Billy Mays <no...@nohow.com> wrote:
> I'm trying to shorten a one-liner I have for calculating the standard
> deviation of a list of numbers.  I have something so far, but I was
> wondering if it could be made any shorter (without imports).
>
> Here's my function:
>
> a=lambda d:(sum((x-1.*sum(d)/len(d))**2 for x in d)/(1.*(len(d)-1)))**.5
>
> The function is invoked as follows:
>
>  >>> a([1,2,3,4])
> 1.2909944487358056

Besides trying to do it in one line, it is also interesting to write a
one-pass version with incremental results:

  http://mathcentral.uregina.ca/QQ/database/QQ.09.06/h/murtaza2.html

Another interesting avenue is to aim for the highest possible accuracy.
Consider using math.fsum() to avoid rounding errors in the summation
of large numbers of nearly equal values.
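
As a rough illustration of the math.fsum() suggestion, here is a sketch of a two-pass sample standard deviation that uses fsum for both summations. The function name stdev_fsum and the error handling are assumptions made for this example, and it does need an import, which the original one-liner set out to avoid.

```python
import math

def stdev_fsum(data):
    """Sample standard deviation with math.fsum used for both sums."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data points")
    mean = math.fsum(data) / n
    # Sum of squared deviations, accumulated without intermediate rounding
    ss = math.fsum((x - mean) ** 2 for x in data)
    return math.sqrt(ss / (n - 1))

print(stdev_fsum([1, 2, 3, 4]))  # about 1.29099

# The builtin sum accumulates rounding error; fsum does not:
print(sum([0.1] * 10) == 1.0, math.fsum([0.1] * 10) == 1.0)  # False True
```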


Raymond

-------------
follow my python tips on twitter: @raymondh



#7052

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2011-06-05 17:26 +0000
Message-ID: <4debbc27$0$29996$c3e8da3$5496439d@news.astraweb.com>
In reply to: #6964

On Fri, 03 Jun 2011 13:09:43 -0700, Raymond Hettinger wrote:

> On Jun 3, 10:55 am, Billy Mays <no...@nohow.com> wrote:
>> I'm trying to shorten a one-liner I have for calculating the standard
>> deviation of a list of numbers.  I have something so far, but I was
>> wondering if it could be made any shorter (without imports).
>>
>> Here's my function:
>>
>> a=lambda d:(sum((x-1.*sum(d)/len(d))**2 for x in d)/(1.*(len(d)-1)))**.5
>>
>> The function is invoked as follows:
>>
>>  >>> a([1,2,3,4])
>> 1.2909944487358056
> 
> Besides trying to do it in one line, it is also interesting to write a
> one-pass version with incremental results:
> 
>   http://mathcentral.uregina.ca/QQ/database/QQ.09.06/h/murtaza2.html

I'm not convinced that's a good approach, although I haven't tried it. In 
general, the so-called "computational formula" for variance is optimized 
for pencil and paper calculations of small amounts of data, but is 
numerically unstable.
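
The instability is easy to reproduce. The sketch below compares the "computational formula" with a plain two-pass calculation on data whose mean is large relative to its spread. The function names and the test data are chosen for illustration only.

```python
def var_computational(data):
    """Sample variance as (sum(x^2) - sum(x)^2/n) / (n - 1)."""
    n = len(data)
    s1 = sum(data)
    s2 = sum(x * x for x in data)
    return (s2 - s1 * s1 / float(n)) / (n - 1)

def var_two_pass(data):
    """Sample variance from deviations about a precomputed mean."""
    n = len(data)
    mean = sum(data) / float(n)
    return sum((x - mean) ** 2 for x in data) / (n - 1)

small = [4.0, 7.0, 13.0, 16.0]       # true sample variance is 30
shifted = [x + 1e9 for x in small]   # same spread, much larger mean

print(var_computational(small), var_two_pass(small))
# Shifting the data should not change the variance, but the squares are
# now around 1e18, beyond the range where floats represent every integer,
# so the subtraction in the computational formula cancels badly.
print(var_computational(shifted), var_two_pass(shifted))
```

On the shifted data the two-pass result is still 30.0, while the computational formula returns something far from it.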

See 

http://www.johndcook.com/blog/2008/09/26/comparing-three-methods-of-computing-standard-deviation/

http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance



I'll also take this opportunity to plug my experimental stats package, 
which includes coroutine-based running statistics, including standard 
deviation:

>>> s = stats.co.stdev()
>>> s.send(3)
nan
>>> s.send(2)
0.7071067811865476
>>> s.send(5)
1.5275252316519465
>>> s.send(5)
1.4999999999999998

The non-running calculation of stdev gives this:

>>> stats.stdev([3, 2, 5, 5])
1.5
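
For readers without the package installed, a coroutine with the same send-a-value, get-the-running-stdev behaviour can be sketched with a generator and Welford's update. This is not the stats package's implementation; the name running_stdev and the choice of Welford's method are assumptions made for this example.

```python
import math

def running_stdev():
    """Generator-based coroutine: send a value, receive the sample
    standard deviation of everything sent so far (nan for one value)."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    result = None
    while True:
        x = yield result
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)  # Welford's update, one pass over the data
        result = math.sqrt(m2 / (n - 1)) if n > 1 else float("nan")

s = running_stdev()
next(s)  # prime the coroutine so it is ready to receive values
for value in [3, 2, 5, 5]:
    print(s.send(value))
```

The printed values should track the session above: nan, then roughly 0.7071, 1.5275 and 1.5, possibly differing in the last digit.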


http://pypi.python.org/pypi/stats/
http://code.google.com/p/pycalcstats/

Be warned that the version on Google Code is unstable, and currently 
broken.

Feedback is welcome!


-- 
Steven



#7055

From: Ethan Furman <ethan@stoneleaf.us>
Date: 2011-06-05 12:17 -0700
Message-ID: <mailman.2472.1307301527.9059.python-list@python.org>
In reply to: #7052

Steven D'Aprano wrote:
> On Fri, 03 Jun 2011 13:09:43 -0700, Raymond Hettinger wrote:
> 
>> On Jun 3, 10:55 am, Billy Mays <no...@nohow.com> wrote:
>>> I'm trying to shorten a one-liner I have for calculating the standard
>>> deviation of a list of numbers.  I have something so far, but I was
>>> wondering if it could be made any shorter (without imports).
>>>
>>> Here's my function:
>>>
>>> a=lambda d:(sum((x-1.*sum(d)/len(d))**2 for x in d)/(1.*(len(d)-1)))**.5
>>>
>>> The function is invoked as follows:
>>>
>>>  >>> a([1,2,3,4])
>>> 1.2909944487358056
>> Besides trying to do it in one line, it is also interesting to write a
>> one-pass version with incremental results:
>>
>>   http://mathcentral.uregina.ca/QQ/database/QQ.09.06/h/murtaza2.html
> 
> I'm not convinced that's a good approach, although I haven't tried it. In 
> general, the so-called "computational formula" for variance is optimized 
> for pencil and paper calculations of small amounts of data, but is 
> numerically unstable.
> 
> See 
> 
> http://www.johndcook.com/blog/2008/09/26/comparing-three-methods-of-computing-standard-deviation/
> 
> http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
> 
> 
> 
> I'll also take this opportunity to plug my experimental stats package, 
> which includes coroutine-based running statistics, including standard 
> deviation:
> 
>--> s = stats.co.stdev()
>--> s.send(3)
> nan

Look!  A NaN in the wild!  :)

~Ethan~
