Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.graphics.apps.gnuplot > #4539

MSE of a linear regression line + prediction intervals

From Alex van der Spek <amvds@xs4all.nl>
Subject MSE of a linear regression line + prediction intervals
Newsgroups comp.graphics.apps.gnuplot
Message-ID <nnd$0cc4cb6a$6d5d137d@a08940df5950d1eb> (permalink)
Organization KPN B.V.
Date 2023-08-29 14:54 +0200

Show all headers | View raw


I worked out a way to compute and plot the wide and narrow prediction 
intervals of a linear regression line. A self contained working plot 
script goes below.

I have two questions.

1. It seems as if the MSE could have been computed by the stats command. 
It don't think it is so I put in a work around which is rather convoluted. 
Am I overlooking something?

2. The shading between the prediction intervals doesn't clip correctly at 
the plot boundaries. I believe this is a known issue for which no work 
around exists? It would be a nice to have only, currently I just plot the 
lines instead of a fill between curves (which is in the example code 
below).

Thanks in advance,
Alex

+++++++++++++++++++++
#!/usr/bin/env gnuplot
#Make a Weibull plot of DC motor data taken from Philips databook C18 
(1986)

#Clean slate
unset multiplot
unset out
reset
set term qt size 640,480
set encoding utf8

#Inline data N=20
$MDAT << EOF
#Benard Hours Exact
0.034   1750  0.0341
0.083   2000  0.0825
0.132   2525  0.1315
0.181   2600  0.1805
0.230   2815  0.2297
EOF

#Linear fit
stats $MDAT using (log10(column(2))):(log(-log(1-column(1))))
n = STATS_records
x_mean = STATS_mean_x; y_mean = STATS_mean_y
x_std = STATS_stddev_x; y_std = STATS_stddev_y
a = STATS_slope; b = STATS_intercept
a_err = STATS_slope_err; b_err = STATS_intercept_err

#Mean Square Error
set table $MSED
plot $MDAT using (($1)-(1-exp(-exp(a*log10($2)+b))))**2
unset table
stats $MSED using 2
MSE = STATS_sum / (n-2)

#Wide prediction intervals
set table $WPINT
set xrange [200:20000]
set samples 1000
t = 3.18
plot '+' using ($1):(1-exp(-exp(a*log10($1)+b))) + t * \
     sqrt(MSE * (1+1/n+(log10($1)-x_mean)**2/(n*x_std**2))): \
     (1-exp(-exp(a*log10($1)+b))) - t * \
     sqrt(MSE * (1+1/n+(log10($1)-x_mean)**2/(n*x_std**2))) \
     with filledcurves
unset table

#Narrow prediction intervals
set table $NPINT
set xrange [200:20000]
set samples 1000
t = 3.18
plot '+' using ($1):(1-exp(-exp(a*log10($1)+b))) + t * \
     sqrt(MSE * (1/n+(log10($1)-x_mean)**2/(n*x_std**2))):\
     (1-exp(-exp(a*log10($1)+b))) - t * \
     sqrt(MSE * (1/n+(log10($1)-x_mean)**2/(n*x_std**2))) \
     with filledcurves
unset table

#Compute B10, MTBF and FR via shape (k) and scale (l) parameters
log10exp1 = log10(exp(1))
k = a * log10(exp(1))
l = exp(-b / k)

#Quantile function for Weibull distribution
Q(p, k, l) = l * (-log(1-p))**(1.0/k)

#B10 hours, 90% lives longer
B10 = Q(0.1, k, l)

#Characteristic time and failure rate
CT  = Q(1-exp(-1), k, l)
FR  = 1 / CT

#Mean Time Between Failures
MTBF = l * gamma(1 + 1.0/k)

#Plot specifics
set samples 100
set grid x y mx
set tics out
set xlabel 'Operating hours'
set ylabel 'Median rank'
set key top left opaque
set style fill transparent solid 0.25

#Non linear, Weibull distributed y axis, x axis log
set yrange [0.01:0.99]
set xrange [1000:7000]
set nonlinear y via log(-log(1-y)) inverse 1-exp(-exp(y))
set logscale x
set ytics (0.01, 0.02, 0.05, 0.1, 0.2, 0.5, \
           "1-e^{-1}" 0.632, 0.9, 0.99)
set xtics (1000, '' 1500, 2000, 3000, 4000, 5000, 6000, 7000)

set title sprintf("B_{10}=%.fh; MTBF=%.fh; FR = %.2f/1000h", B10, MTBF, 
FR*1000)
plot $WPINT using 1:2:3 with filledcurves lc 2 notitle, \
     $NPINT using 1:2:3 with filledcurves lc 4 notitle, \
     $MDAT using 2:1 with points pt 7 ps 2 lc 8 title 'Data', \
     $MDAT using 2:3 with points pt 6 ps 2 lc 8 title 'Data', \
     1-exp(-1) with lines dt 2 title '{/Symbol t}', \
     0.1 with lines dt 3 title 'B_{10}', \
     (1-exp(-exp(a*log10(x)+b))) with lines lc 7 lw 2 title 'Fit Weibull', 
\
     (1-exp(-exp(log(x/l)))) with lines lc 7 dt 3 title 'Fit Exponential'

Back to comp.graphics.apps.gnuplot | Previous | Next | Find similar


Thread

MSE of a linear regression line + prediction intervals Alex van der Spek <amvds@xs4all.nl> - 2023-08-29 14:54 +0200

csiph-web