Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.graphics.apps.gnuplot > #1496
| From | Ivan Shmakov <oneingray@gmail.com> |
|---|---|
| Newsgroups | alt.sources, comp.graphics.apps.gnuplot, comp.lang.awk |
| Subject | linear fit in GNU Awk |
| Followup-To | alt.sources.d, comp.lang.awk |
| Date | 2012-11-23 12:30 +0700 |
| Organization | Aioe.org NNTP Server |
| Message-ID | <86a9u87uoh.fsf@gray.siamics.net> (permalink) |
Cross-posted to 3 groups.
Followups directed to: alt.sources.d, comp.lang.awk
Archive-name: awk-linear-fit-is
Submitted-by: oneingray@gmail.com
Last-modified: 2012-11-23 +00:00
Copyright-Notice: Both the README and the code are under the
CC0 Public Domain Dedication.
Once, I had to "verify" the Gnuplot (iterative) non-linear least
squares algorithm against a linear least squares for a linear
function. Thus, I've written the code below.
The code calculates the fit parameters (offset, coefficient) for
each data "block" in the file, with blocks being separated with
at least two empty lines. By default, first input column is
used as the independent variable, and the second as dependent,
subject to change with -vxcol=, -vycol= GNU Awk options.
All data points are considered of equal weight.
Examples
Using flat-distribution random data:
$ (for i in a b c d e f ; do
for ((x = 0; x < 128; x++)) ; do
printf %s\\n "${x} ${RANDOM}"
done
printf \\n\\n
done) | gawk -f linear-fit.awk
18001.6 -11.2342
18399.9 -50.597
16128.5 10.2013
16592.5 -8.59861
17240.7 -10.3474
16284.8 -4.78138
$
Superimposing linear dependency on the random data:
$ (for i in 1 3 3 7 ; do
for ((x = 0; x < 128; x++)) ; do
printf %s\\n "${x} $((10000 * ${i} * ${x} + ${RANDOM}))"
done
printf \\n\\n
done) | gawk -f linear-fit.awk
17379 9975.24
14182 30027.4
15867.6 29993.9
15284 70019.7
$
Increasing the number of the random data points, so that the
dependency becomes apparent:
$ (for i in 1 3 3 7 ; do
for ((x = 0; x < 1024; x++)) ; do
printf %s\\n "${x} $((10000 * ${i} * ${x} + ${RANDOM} - 16384))"
done
printf \\n\\n
done) | gawk -f linear-fit.awk
-48.5561 9999.55
-81.2604 30000.2
-696.383 30000
-332.214 70000.7
$
### linear-fit.awk --- Linear fit -*- Awk -*-
### Ivan Shmakov, 2012
## To the extent possible under law, the author(s) have dedicated all
## copyright and related and neighboring rights to this software to the
## public domain worldwide. This software is distributed without any
## warranty.
## You should have received a copy of the CC0 Public Domain Dedication
## along with this software. If not, see
## <http://creativecommons.org/publicdomain/zero/1.0/>.
### Code:
BEGIN {
## the independent and dependent variables' columns
if (xcol == "") { xcol = 1; }
if (ycol == "") { ycol = 2; }
## we assume that the state variables are "", thus 0
}
## skip over the comments
/^#/ { print $0; next; }
## count empty lines, so that we can split on blocks
! /./ { el++; next; }
## NB: ignore y == NaN
$ycol == 0 + $ycol {
x = $xcol; y = $ycol;
if (el >= 2 && n != "") {
q = 1 / (n * s2 - s1 ^2);
## print the result for the previous block
print q * (s2 * u1 - s1 * u2), q * (n * u2 - s1 * u1);
## initialize the state
n = 1;
s1 = x; s2 = x ^2;
u1 = y; u2 = x * y;
} else {
n++;
s1 += x; s2 += x ^2;
u1 += y; u2 += x * y;
}
el = 0;
}
END {
if (n != "") {
q = 1 / (n * s2 - s1 ^2);
## print the result for the last block
print q * (s2 * u1 - s1 * u2), q * (n * u2 - s1 * u1);
}
}
### linear-fit.awk ends here
--
FSF associate member #7257
Back to comp.graphics.apps.gnuplot | Previous | Next | Find similar
linear fit in GNU Awk Ivan Shmakov <oneingray@gmail.com> - 2012-11-23 12:30 +0700
csiph-web