From: Hans-Bernhard Bröker
Newsgroups: comp.graphics.apps.gnuplot
Subject: Re: Fitting: How does gnuplot calculate the covariance matrix?
Date: Sun, 10 Apr 2011 19:31:56 +0200

On 08.04.2011 17:18, Ingo Thies wrote:

> Another point seems to be even more problematic, that is the usage of
> the "asymptotic standard error".

That's actually not another point, it's exactly the same.

> Gnuplot finds:
>
> Final set of parameters            Asymptotic Standard Error
> =======================            ==========================
>
> a               = 0.948209         +/- 0.04083      (4.306%)
> b               = 0.427096         +/- 0.01746      (4.089%)
>
>
> correlation matrix of the fit parameters:
>
>                a      b
> a               1.000
> b              -0.855  1.000
>
>
> while an independent approach finds
>
> a = 9.482095E-01 -/+-1.947044E-01, 1.947043E-01
> b = 4.270960E-01 -/+-8.327496E-02, 8.327495E-02

Those errors are _way_ too big.
Do yourself a favour and plot your data along with the model, using
parameters modified by those errors:

[assuming 'set fit errorvar' active, and fit done:]

gnuplot> p 'fit.dat' u 1:2:3 w err, a*x+b w l
gnuplot> rep (a+a_err)*x+b+b_err, (a-a_err)*x+b-b_err
gnuplot> a_thies=0.195
gnuplot> b_thies=.083
gnuplot> rep (a+a_thies)*x+b+b_thies, (a-a_thies)*x+b-b_thies

You'll see that gnuplot's a_err and b_err yield a corridor somewhat
tightly containing almost all data points and their errors, just like it
should be. Your independent approach defines a corridor that is way
larger than it needs to be. Particularly the error on 'a' leads to a
massive overestimation of the data errors at large 'x'. E.g. at x==4,
the model with those parameter errors yields an interval of 2.6 +/- 0.8,
while your data claimed 2.6 +/- 0.1.
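
For anyone who wants the numbers without plotting, here is a minimal Python sketch of the same corridor comparison. It is only an illustration: the function name corridor_half_width is made up for this example, the parameter values are the ones quoted in this thread, and the corridor is the same naive one as in the 'rep' commands (both parameters shifted together, ignoring the -0.855 correlation).

```python
# Fit results quoted in the thread (gnuplot's asymptotic standard errors).
a, b = 0.948209, 0.427096
a_err, b_err = 0.04083, 0.01746

# Rounded errors from the independent approach, as typed into gnuplot above.
a_thies, b_thies = 0.195, 0.083


def corridor_half_width(x, da, db):
    """Half the vertical gap between (a+da)*x+b+db and (a-da)*x+b-db.

    Hypothetical helper for illustration; it mirrors the two curves drawn
    by the 'rep' commands and does not account for parameter correlation.
    """
    upper = (a + da) * x + b + db
    lower = (a - da) * x + b - db
    return (upper - lower) / 2.0


x = 4.0
hw_gnuplot = corridor_half_width(x, a_err, b_err)
hw_thies = corridor_half_width(x, a_thies, b_thies)
print(f"x = {x}: gnuplot +/- {hw_gnuplot:.3f}, independent +/- {hw_thies:.3f}")
```

At x = 4 this gives a half-width of about 0.18 for gnuplot's errors and about 0.86 for the independent ones, in line with the roughly 0.8 mentioned above and far wider than the 0.1 claimed for the data.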