linregress and polyfit

Started by: Krishnan <chitturk@uah.edu>
First post: 2013-09-17 19:34 -0700
Last post: 2013-09-19 04:32 -0700
Articles 6 — 5 participants


Contents

  linregress and polyfit Krishnan <chitturk@uah.edu> - 2013-09-17 19:34 -0700
    Re: linregress and polyfit Josef Pktd <josef.pktd@gmail.com> - 2013-09-17 21:25 -0700
    Re: linregress and polyfit chitturk@uah.edu - 2013-09-18 06:38 -0700
      Re: linregress and polyfit Dave Angel <davea@davea.name> - 2013-09-18 19:57 +0000
      Re: linregress and polyfit Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2013-09-19 15:37 +0100
    Re: linregress and polyfit chitturk@uah.edu - 2013-09-19 04:32 -0700

#54349 — linregress and polyfit

From: Krishnan <chitturk@uah.edu>
Date: 2013-09-17 19:34 -0700
Subject: linregress and polyfit
Message-ID: <73ee6fa3-baa6-4178-8596-5f88bbd0bfa2@googlegroups.com>
I created an xy pair

y = slope*x + intercept

then I added some noise to "y" using

numpy.random.normal - call it z 

I could recover the slope, intercept from (x,y) using linregress
BUT cannot calculate the slope, intercept from (x, z)

What is puzzling is that for both pairs (x,y) and (x,z) the
polyfit (order 1) works just fine (gives me the slope, intercept)
----------------------------------------------------------------------
import numpy as np
import scipy
# create a straight line, y= slope*x + intercept 
x = np.linspace(0.0,10.0,21)  
slope = 1.0  
intercept = 3.0   
y = []  
for i in range(len(x)):  
    y.append(slope*x[i] + intercept)  
# now create a data file with noise
z= []  
for i in range(len(x)):  
    z.append(y[i] + 0.1*np.random.normal(0.0,1.0,1))  
# now calculate the slope, intercept using linregress
from scipy import stats  
# No error here this is OK, works for x, y
cslope, cintercept, r_value, p_value, std_err = stats.linregress(x,y)
print cslope, cintercept
# I get an error here
#ValueError: array dimensions must agree except for d_0
nslope, nintercept, nr_value, np_value, nstd_err = stats.linregress(x,z)  
print nslope, nintercept
# BUT polyfit works fine, polynomial or order 1 with both data sets
ncoeffs = scipy.polyfit(x,z,1)  
print ncoeffs
coeffs = scipy.polyfit(x,y,1)  
print coeffs



#54353

From: Josef Pktd <josef.pktd@gmail.com>
Date: 2013-09-17 21:25 -0700
Message-ID: <d57f881d-3166-4ecc-adce-41d124728aac@googlegroups.com>
In reply to: #54349
On Tuesday, September 17, 2013 10:34:39 PM UTC-4, Krishnan wrote:
> I created an xy pair
>
> y = slope*x + intercept
>
> then I added some noise to "y" using
>
> numpy.random.normal - call it z
>
> I could recover the slope, intercept from (x,y) using linregress
> BUT cannot calculate the slope, intercept from (x, z)
>
> What is puzzling is that for both pairs (x,y) and (x,z) the
> polyfit (order 1) works just fine (gives me the slope, intercept)
> ----------------------------------------------------------------------
> import numpy as np
> import scipy
> # create a straight line, y= slope*x + intercept
> x = np.linspace(0.0,10.0,21)
> slope = 1.0
> intercept = 3.0
> y = []
> for i in range(len(x)):
>     y.append(slope*x[i] + intercept)
> # now create a data file with noise
> z= []
> for i in range(len(x)):
>     z.append(y[i] + 0.1*np.random.normal(0.0,1.0,1))

When z is converted to a numpy array it has an extra dimension that linregress cannot handle, because np.random.normal(0.0, 1.0, 1) returns a one-element array and not a scalar. The list z therefore holds 21 arrays of shape (1,), which become an array of shape (21, 1).

It is much easier to use vectorized numpy instead of a loop:

z = y + 0.1*np.random.normal(0.0,1.0, len(y))

This is a one-dimensional array of shape (21,) and works with linregress.
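A minimal sketch of the two approaches side by side, assuming the same x, slope and intercept as in the original post. The seeded RandomState and the names z_loop, z_vec, fit_slope and fit_intercept are only there for illustration and are not from the post:

```python
import numpy as np
from scipy import stats

rng = np.random.RandomState(0)  # fixed seed so the run is repeatable
x = np.linspace(0.0, 10.0, 21)
y = 1.0 * x + 3.0               # slope 1.0, intercept 3.0

# Loop version from the original post: every element is a length-1 array,
# so converting the result to an array gives two dimensions.
z_loop = np.array([yi + 0.1 * rng.normal(0.0, 1.0, 1) for yi in y])

# Vectorized version: draw all the noise in one call.
z_vec = y + 0.1 * rng.normal(0.0, 1.0, len(y))

print(z_loop.shape)  # two-dimensional
print(z_vec.shape)   # one-dimensional

# linregress is given two one-dimensional inputs of equal length here.
result = stats.linregress(x, z_vec)
fit_slope, fit_intercept = result[0], result[1]
print(fit_slope, fit_intercept)
```

The fitted values should land close to the true slope and intercept, with the small difference coming from the added noise.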

Josef



#54381

From: chitturk@uah.edu
Date: 2013-09-18 06:38 -0700
Message-ID: <f1b29ea6-b4db-4636-8042-7c531ed8e99a@googlegroups.com>
In reply to: #54349
Thanks - that helps ... but it is puzzling because

np.random.normal(0.0,1.0,1) returns exactly one value,
and when I checked the length of "z", I get 21 (as before) ...




#54398

From: Dave Angel <davea@davea.name>
Date: 2013-09-18 19:57 +0000
Message-ID: <mailman.133.1379534267.18130.python-list@python.org>
In reply to: #54381
On 18/9/2013 09:38, chitturk@uah.edu wrote:

> Thanks - that helps ... but it is puzzling because
>
> np.random.normal(0.0,1.0,1) returns exactly one 
> and when I checked the length of "z", I get 21 (as before) ... 
>
>

I don't use Numpy, so this is just a guess, plus reading one web page.

According to:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.normal.html

the 3rd argument to normal should be a tuple.  So if you want a single
element, you should have made it (1,)

As for checking the 'length of "z"' did you just use the len() function?
That just tells you the first dimension.  Have you tried simply printing
out "z" ?
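A small sketch of the difference between len() and the full shape, assuming "z" is a list of one-element arrays like the one the original loop builds. The values and the name z_arr are placeholders, not the poster's data:

```python
import numpy as np

# A list of 21 length-1 arrays, shaped like the "z" from the original loop.
z = [np.array([float(i)]) for i in range(21)]

z_arr = np.asarray(z)

print(len(z))       # number of items along the first dimension only
print(z_arr.shape)  # every dimension
print(z_arr.ndim)   # how many dimensions there are
```

len() reports 21 either way, so it cannot tell a flat sequence of 21 numbers from a 21-by-1 array.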

-- 
DaveA



#54423

From: Oscar Benjamin <oscar.j.benjamin@gmail.com>
Date: 2013-09-19 15:37 +0100
Message-ID: <mailman.149.1379601457.18130.python-list@python.org>
In reply to: #54381
On 18 September 2013 20:57, Dave Angel <davea@davea.name> wrote:
> On 18/9/2013 09:38, chitturk@uah.edu wrote:
>
>> Thanks - that helps ... but it is puzzling because
>>
>> np.random.normal(0.0,1.0,1) returns exactly one
>> and when I checked the length of "z", I get 21 (as before) ...
>>
>>
>
> I don't use Numpy, so this is just a guess, plus reading one web page.
>
> According to:
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.normal.html
>
> the 3rd argument to normal should be a tuple.  So if you want a single
> element, you should have made it (1,)

Numpy accepts ints in place of shape tuples so this makes no difference.

> As for checking the 'length of "z"' did you just use the len() function?
> That just tells you the first dimension.  Have you tried simply printing
> out "z" ?

Exactly. What you need to check is the shape attribute (converting to
numpy array first if necessary):

>>> import numpy as np
>>> a = np.random.normal(0, 1, 1)
>>> a
array([-0.90292348])
>>> a.shape
(1,)
>>> a[0]
-0.90292348393433797
>>> np.array(a[0])
array(-0.902923483934338)
>>> np.array(a[0]).shape
()
>>> [a, a]
[array([-0.90292348]), array([-0.90292348])]
>>> np.array([a, a])
array([[-0.90292348],
       [-0.90292348]])
>>> np.array([a, a]).shape
(2, 1)
>>> np.random.normal(0, 1, 2).shape
(2,)

The square brackets in 'array([-0.90292348])' indicate that numpy
considers this to be a 1-dimensional array of length 1 rather than a
scalar value (which would have an empty shape tuple).
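If the loop has to stay, here is a rough sketch of two ways to end up with a one-dimensional "z", assuming the same line as in the original post. The names z_flat and z_scalar and the seeded RandomState are made up for this illustration:

```python
import numpy as np

rng = np.random.RandomState(1)  # fixed seed so the run is repeatable
y = 1.0 * np.linspace(0.0, 10.0, 21) + 3.0

# Reproduce the original loop: a list of 21 one-element arrays.
z = [yi + 0.1 * rng.normal(0.0, 1.0, 1) for yi in y]

# Option 1: flatten after the fact.
z_flat = np.ravel(z)

# Option 2: leave out the size argument so each draw is a scalar.
z_scalar = np.array([yi + 0.1 * rng.normal(0.0, 1.0) for yi in y])

print(np.shape(z), z_flat.shape, z_scalar.shape)
```

Either result has an empty second dimension removed, which is the form linregress expects for its inputs.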


Oscar



#54419

From: chitturk@uah.edu
Date: 2013-09-19 04:32 -0700
Message-ID: <3d938bae-bd60-4039-9571-2eaead5b8dad@googlegroups.com>
In reply to: #54349
tried (1,) - still same error ... 
printed "z" and looks right, len(z) OK 
(puzzling)
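A short sketch that may explain both observations, under the assumption that "z" ends up with shape (21, 1). It uses np.polyfit directly rather than the scipy.polyfit name from the original post, and noise-free placeholder data so the coefficients are easy to check:

```python
import numpy as np

rng = np.random.RandomState(2)

# An int size and a one-element tuple size give the same shape,
# so switching from 1 to (1,) changes nothing.
print(rng.normal(0.0, 1.0, 1).shape)
print(rng.normal(0.0, 1.0, (1,)).shape)

# Leaving size out returns a scalar (zero dimensions).
print(np.ndim(rng.normal(0.0, 1.0)))

# np.polyfit accepts a two-dimensional y of shape (M, K) and fits each
# column separately, so a (21, 1) "z" gives one column of coefficients.
x = np.linspace(0.0, 10.0, 21)
z = (1.0 * x + 3.0).reshape(21, 1)  # noise-free placeholder data
coeffs = np.polyfit(x, z, 1)
print(coeffs.shape)
print(coeffs[0, 0], coeffs[1, 0])   # slope, intercept
```

That would be why polyfit never complained: a single column counts as one data set to fit, and the extra dimension just carries through to the returned coefficients.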
