From: Ingo Thies
Newsgroups: comp.graphics.apps.gnuplot
Subject: Re: Fitting: How does gnuplot calculate the covariance matrix?
Date: Fri, 08 Apr 2011 17:18:17 +0200
Message-ID: <908n9pF2jaU1@mid.individual.net>
References: <9088euFi3iU1@mid.individual.net>

On 08.04.2011 13:05, I wrote:

> The background: I am still mistrusting the error ellipses one gets from
> the eigenvalues. As already mentioned last year (around May or so), I
> suspect that the resulting error contours are underestimated since the
> best-fit chi^2 is assumed to be zero there, but it isn't.

Another point seems to be even more problematic: the use of the
"asymptotic standard error". For example, one can use the 1-sigma error
contour as a min-max estimator for the errors of a and b in a
two-parameter fit. I have done this for a linear fit, f(x)=a+b*x, and
compared the results from gnuplot with those from an independent
approach. While the best-fit values, the chi^2 and the a,b correlation
agree within the visible digits (gnuplot does not print very many,
though), the error estimate from the 1-sig contour is typically an order
of magnitude larger than the asymptotic standard error.

One could argue that the outline of the error contour overestimates the
error (since error ellipses are often highly elongated but slim along
their minor axis, especially for |cor_ab| \approx 1), but in the test
case this would reduce the a,b errors only by a factor of about 1/2.

The gnuplot.pdf states that the asymptotic error underestimates the true
error, but it wasn't clear to me until now that it could do so by such a
vast amount.

Here are the sample data I have been using for the test:

#x        y         dy
0.000000  1.078039  0.100000
0.200000  1.012137  0.100000
0.400000  0.994650  0.100000
0.600000  1.210933  0.100000
0.800000  1.228788  0.100000
1.000000  1.279018  0.100000
1.200000  1.510823  0.100000
1.400000  1.466704  0.100000
1.600000  1.523061  0.100000
1.800000  1.768180  0.100000
2.000000  1.894179  0.100000
2.200000  2.006980  0.100000
2.400000  1.994247  0.100000
2.600000  2.212952  0.100000
2.800000  2.123002  0.100000
3.000000  2.341036  0.100000
3.200000  2.321149  0.100000
3.400000  2.469557  0.100000
3.600000  2.310980  0.100000
3.800000  2.451646  0.100000
4.000000  2.652370  0.100000

Gnuplot finds:

Final set of parameters            Asymptotic Standard Error
=======================            ==========================

a               = 0.948209         +/- 0.04083      (4.306%)
b               = 0.427096         +/- 0.01746      (4.089%)

correlation matrix of the fit parameters:

               a      b
a               1.000
b              -0.855  1.000

while an independent approach finds

a = 9.482095E-01 -/+ -1.947044E-01, 1.947043E-01
b = 4.270960E-01 -/+ -8.327496E-02, 8.327495E-02
chi^2_red = 0.939417
cor_ab = -0.855398

i.e. the a error is +/- 0.195 instead of +/- 0.04, while the b error is
+/- 0.083 instead of +/- 0.017.

Taking the extent of the contour in each parameter with the other
parameter held at its best-fit value, the error bars are about +/- 0.1
and +/- 0.05, respectively.

I did not do a large study of fitting functions, but the given results
suggest, in my opinion, that the asymptotic standard error should not be
used for an error discussion at all.
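
For anyone who wants to check the gnuplot side of the comparison
independently, here is a minimal sketch in Python, assuming NumPy is
available. It does a weighted linear least-squares fit of f(x)=a+b*x to
the data above, forms the covariance matrix as the inverse of the
weighted normal matrix, and rescales the parameter errors by
sqrt(chi^2_red). All variable names and the helper ellipse_semi_axes are
my own choices for illustration. The Delta chi^2 level passed to the
helper is an assumption to be set by the user; this is not a
reconstruction of the independent approach mentioned above.

```python
import numpy as np

# Sample data from the post: x = 0.0, 0.2, ..., 4.0, all dy = 0.1
x = np.arange(21) * 0.2
y = np.array([
    1.078039, 1.012137, 0.994650, 1.210933, 1.228788, 1.279018, 1.510823,
    1.466704, 1.523061, 1.768180, 1.894179, 2.006980, 1.994247, 2.212952,
    2.123002, 2.341036, 2.321149, 2.469557, 2.310980, 2.451646, 2.652370,
])
dy = np.full_like(x, 0.1)

# Design matrix for f(x) = a + b*x, each row weighted by 1/dy
A = np.column_stack([np.ones_like(x), x]) / dy[:, None]
rhs = y / dy

# Covariance matrix for the stated dy: inverse of the normal matrix
cov = np.linalg.inv(A.T @ A)
a, b = cov @ (A.T @ rhs)

# chi^2 at the minimum and reduced chi^2 (two fitted parameters)
chi2_min = np.sum((rhs - A @ np.array([a, b])) ** 2)
ndf = len(x) - 2
chi2_red = chi2_min / ndf

# Parameter errors from the covariance diagonal, then rescaled by
# sqrt(chi2_red), which reproduces the values gnuplot prints here
sigma = np.sqrt(np.diag(cov))
ase = sigma * np.sqrt(chi2_red)
cor_ab = cov[0, 1] / (sigma[0] * sigma[1])


def ellipse_semi_axes(cov, delta_chi2=1.0):
    """Semi-axes and axis directions of the contour
    chi^2 = chi2_min + delta_chi2, from the eigen-decomposition of cov."""
    evals, evecs = np.linalg.eigh(cov)
    return np.sqrt(delta_chi2 * evals), evecs


semi_axes, directions = ellipse_semi_axes(cov)

print(f"a = {a:.6f} +/- {ase[0]:.5f}")
print(f"b = {b:.6f} +/- {ase[1]:.5f}")
print(f"chi^2_red = {chi2_red:.6f}, cor_ab = {cor_ab:.6f}")
print("ellipse semi-axes (Delta chi^2 = 1):", semi_axes)
```

Since chi^2 is exactly quadratic in a and b for a linear model, the
ellipse from the eigen-decomposition is the exact contour for the chosen
Delta chi^2, and a larger level just scales the semi-axes by its square
root.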

Instead of the asymptotic standard error, I would recommend calculating
the error ellipse from the (eigenvalues of the) covariance matrix,
corrected for the non-zero minimum chi^2, and using it as an
(additional) error estimate.

Any opinions?

-- 
Gruß,
Ingo