| Started by | Billy Mays <noway@nohow.com> |
|---|---|
| First post | 2011-06-03 13:55 -0400 |
| Last post | 2011-06-05 12:17 -0700 |
| Articles | 6 — 5 participants |
Standard Deviation One-liner Billy Mays <noway@nohow.com> - 2011-06-03 13:55 -0400
Re: Standard Deviation One-liner Alain Ketterlin <alain@dpt-info.u-strasbg.fr> - 2011-06-03 20:50 +0200
Re: Standard Deviation One-liner Alain Ketterlin <alain@dpt-info.u-strasbg.fr> - 2011-06-03 21:10 +0200
Re: Standard Deviation One-liner Raymond Hettinger <python@rcn.com> - 2011-06-03 13:09 -0700
Re: Standard Deviation One-liner Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-06-05 17:26 +0000
Re: Standard Deviation One-liner Ethan Furman <ethan@stoneleaf.us> - 2011-06-05 12:17 -0700
| From | Billy Mays <noway@nohow.com> |
|---|---|
| Date | 2011-06-03 13:55 -0400 |
| Subject | Standard Deviation One-liner |
| Message-ID | <isb75r$80h$1@speranza.aioe.org> |
I'm trying to shorten a one-liner I have for calculating the standard deviation of a list of numbers. I have something so far, but I was wondering if it could be made any shorter (without imports).

Here's my function:

    a=lambda d:(sum((x-1.*sum(d)/len(d))**2 for x in d)/(1.*(len(d)-1)))**.5

The function is invoked as follows:

    >>> a([1,2,3,4])
    1.2909944487358056
| From | Alain Ketterlin <alain@dpt-info.u-strasbg.fr> |
|---|---|
| Date | 2011-06-03 20:50 +0200 |
| Message-ID | <87oc2erbpf.fsf@dpt-info.u-strasbg.fr> |
| In reply to | #6959 |
Billy Mays <noway@nohow.com> writes:

> I'm trying to shorten a one-liner I have for calculating the standard
> deviation of a list of numbers. I have something so far, but I was
> wondering if it could be made any shorter (without imports).

> a=lambda d:(sum((x-1.*sum(d)/len(d))**2 for x in d)/(1.*(len(d)-1)))**.5

You should make it two half-liners, because this one repeatedly computes sum(d). I would suggest:

    aux = lambda s1,s2,n: (s2 - s1*s1/n)/(n-1)
    sv = lambda d: aux(sum(d),sum(x*x for x in d),len(d))

(after some algebra). Completely untested, assumes data come in as floats. You get the idea.

-- Alain.
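The "some algebra" mentioned above is presumably the standard expansion of the sum of squared deviations. A sketch, where $s_1$, $s_2$ and $n$ match the arguments of aux and $\bar{x} = s_1/n$ is the mean:

```latex
\begin{aligned}
\sum_{i=1}^{n} (x_i - \bar{x})^2
  &= \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + n\bar{x}^2 \\
  &= s_2 - 2\,\frac{s_1}{n}\,s_1 + n\,\frac{s_1^2}{n^2} \\
  &= s_2 - \frac{s_1^2}{n},
\qquad\text{so}\qquad
\operatorname{var} = \frac{s_2 - s_1^2/n}{n-1}.
\end{aligned}
```

The standard deviation is the square root of this variance.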
| From | Alain Ketterlin <alain@dpt-info.u-strasbg.fr> |
|---|---|
| Date | 2011-06-03 21:10 +0200 |
| Message-ID | <87k4d2rasn.fsf@dpt-info.u-strasbg.fr> |
| In reply to | #6960 |
Alain Ketterlin <alain@dpt-info.u-strasbg.fr> writes:

> aux = lambda s1,s2,n: (s2 - s1*s1/n)/(n-1)
> sv = lambda d: aux(sum(d),sum(x*x for x in d),len(d))

Err, sorry, the final square root is missing.

-- Alain.
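With the square root restored, the two half-liners might look like the sketch below. This is an illustration, not code Alain posted or tested: the names aux and sv come from his message, while the float(n) conversion is an added assumption so that integer input also divides correctly on Python 2.

```python
# aux: sample variance from the sum (s1), the sum of squares (s2) and the count (n).
# float(n) is an assumption added here so integer data does not trigger
# integer division on Python 2.
aux = lambda s1, s2, n: (s2 - s1 * s1 / float(n)) / (n - 1)

# sv: the standard deviation, i.e. aux(...) with the final square root applied.
sv = lambda d: aux(sum(d), sum(x * x for x in d), len(d)) ** 0.5

print(sv([1, 2, 3, 4]))  # should be close to the 1.2909944487358056 in the original post
```

Each of sum(d), the sum of squares and len(d) is evaluated once per call, which is the point of the split. A later reply in this thread notes that this form of the formula can lose precision.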
| From | Raymond Hettinger <python@rcn.com> |
|---|---|
| Date | 2011-06-03 13:09 -0700 |
| Message-ID | <58f86ea2-b654-4240-a5f5-c5c7e503bcf1@s41g2000prb.googlegroups.com> |
| In reply to | #6959 |
On Jun 3, 10:55 am, Billy Mays <no...@nohow.com> wrote:

> I'm trying to shorten a one-liner I have for calculating the standard
> deviation of a list of numbers. I have something so far, but I was
> wondering if it could be made any shorter (without imports).
>
> Here's my function:
>
> a=lambda d:(sum((x-1.*sum(d)/len(d))**2 for x in d)/(1.*(len(d)-1)))**.5
>
> The function is invoked as follows:
>
> >>> a([1,2,3,4])
> 1.2909944487358056

Besides trying to do it in one line, it is also interesting to write a one-pass version with incremental results:

http://mathcentral.uregina.ca/QQ/database/QQ.09.06/h/murtaza2.html

Another interesting avenue is to aim for the highest possible accuracy. Consider using math.fsum() to avoid rounding errors in the summation of large numbers of nearly equal values.

Raymond

-------------
follow my python tips on twitter: @raymondh
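As a rough illustration of the math.fsum() suggestion, here is a two-pass version that uses it for both summations. The function name stdev_fsum is made up for this sketch, and it does need an import, so it sets aside the original "without imports" constraint.

```python
import math

def stdev_fsum(data):
    """Sample standard deviation using math.fsum for both sums (illustrative sketch)."""
    n = len(data)
    mean = math.fsum(data) / n
    # fsum tracks intermediate partial sums, so the summation step itself
    # does not accumulate rounding error.
    squared_deviations = math.fsum((x - mean) ** 2 for x in data)
    return math.sqrt(squared_deviations / (n - 1))

print(stdev_fsum([1, 2, 3, 4]))

# The difference between the two summation functions, as in the math module docs:
print(sum([0.1] * 10))        # 0.9999999999999999
print(math.fsum([0.1] * 10))  # 1.0
```

The individual terms (x - mean) ** 2 are still rounded before they are summed, so this improves the summation step rather than making the whole computation exact.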
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2011-06-05 17:26 +0000 |
| Message-ID | <4debbc27$0$29996$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #6964 |
On Fri, 03 Jun 2011 13:09:43 -0700, Raymond Hettinger wrote:

> On Jun 3, 10:55 am, Billy Mays <no...@nohow.com> wrote:
>> I'm trying to shorten a one-liner I have for calculating the standard
>> deviation of a list of numbers. I have something so far, but I was
>> wondering if it could be made any shorter (without imports).
>>
>> Here's my function:
>>
>> a=lambda d:(sum((x-1.*sum(d)/len(d))**2 for x in d)/(1.*(len(d)-1)))**.5
>>
>> The function is invoked as follows:
>>
>> >>> a([1,2,3,4])
>> 1.2909944487358056
>
> Besides trying to do it in one line, it is also interesting to write a
> one-pass version with incremental results:
>
> http://mathcentral.uregina.ca/QQ/database/QQ.09.06/h/murtaza2.html

I'm not convinced that's a good approach, although I haven't tried it. In general, the so-called "computational formula" for variance is optimized for pencil and paper calculations of small amounts of data, but is numerically unstable.

See

http://www.johndcook.com/blog/2008/09/26/comparing-three-methods-of-computing-standard-deviation/

http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance

I'll also take this opportunity to plug my experimental stats package, which includes coroutine-based running statistics, including standard deviation:

    >>> s = stats.co.stdev()
    >>> s.send(3)
    nan
    >>> s.send(2)
    0.7071067811865476
    >>> s.send(5)
    1.5275252316519465
    >>> s.send(5)
    1.4999999999999998

The non-running calculation of stdev gives this:

    >>> stats.stdev([3, 2, 5, 5])
    1.5

http://pypi.python.org/pypi/stats/
http://code.google.com/p/pycalcstats/

Be warned that the version on Google Code is unstable, and currently broken.

Feedback is welcome!

-- Steven
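To make both points concrete, the sketch below compares the "computational formula" with a running calculation on data that has a large offset. It is not taken from the stats package: running_stdev and naive_variance are hypothetical names, and the running version uses Welford's online algorithm as a stand-in for whatever stats.co.stdev() does internally.

```python
import math

def naive_variance(data):
    """Sample variance via the 'computational formula' (sum of squares shortcut)."""
    n = len(data)
    s1 = sum(data)
    s2 = sum(x * x for x in data)
    return (s2 - s1 * s1 / float(n)) / (n - 1)

def running_stdev():
    """Generator-based coroutine: send() a value, receive the sample stdev so far.

    Uses Welford's online algorithm (an assumption of this sketch, not
    necessarily the method used by the stats package).
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    result = None
    while True:
        x = yield result
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        result = math.sqrt(m2 / (n - 1)) if n > 1 else float("nan")

# Same inputs as the session above; the results should be close to those shown there.
s = running_stdev()
next(s)  # prime the coroutine before the first send()
for value in (3, 2, 5, 5):
    print(s.send(value))

# Data with a large offset: the true sample variance of 4, 7, 13, 16 is 30.
shifted = [1e9 + x for x in (4, 7, 13, 16)]
print(naive_variance(shifted))  # far from 30 because of cancellation

t = running_stdev()
next(t)
for value in shifted:
    last = t.send(value)
print(last ** 2)  # close to 30
```

With values near 1e9 the squares are near 1e18, where adjacent doubles are more than 100 apart, so subtracting two nearly equal sums of that size cannot recover a difference of 90. The running version only ever works with deviations from the current mean, which stay small.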
| From | Ethan Furman <ethan@stoneleaf.us> |
|---|---|
| Date | 2011-06-05 12:17 -0700 |
| Message-ID | <mailman.2472.1307301527.9059.python-list@python.org> |
| In reply to | #7052 |
Steven D'Aprano wrote:

> On Fri, 03 Jun 2011 13:09:43 -0700, Raymond Hettinger wrote:
>
>> On Jun 3, 10:55 am, Billy Mays <no...@nohow.com> wrote:
>>> I'm trying to shorten a one-liner I have for calculating the standard
>>> deviation of a list of numbers. I have something so far, but I was
>>> wondering if it could be made any shorter (without imports).
>>>
>>> Here's my function:
>>>
>>> a=lambda d:(sum((x-1.*sum(d)/len(d))**2 for x in d)/(1.*(len(d)-1)))**.5
>>>
>>> The function is invoked as follows:
>>>
>>> >>> a([1,2,3,4])
>>> 1.2909944487358056
>>
>> Besides trying to do it in one line, it is also interesting to write a
>> one-pass version with incremental results:
>>
>> http://mathcentral.uregina.ca/QQ/database/QQ.09.06/h/murtaza2.html
>
> I'm not convinced that's a good approach, although I haven't tried it. In
> general, the so-called "computational formula" for variance is optimized
> for pencil and paper calculations of small amounts of data, but is
> numerically unstable.
>
> See
>
> http://www.johndcook.com/blog/2008/09/26/comparing-three-methods-of-computing-standard-deviation/
>
> http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
>
> I'll also take this opportunity to plug my experimental stats package,
> which includes coroutine-based running statistics, including standard
> deviation:
>
>--> s = stats.co.stdev()
>--> s.send(3)
> nan

Look! A NaN in the wild! :)

~Ethan~