Groups | Search | Server Info | Login | Register


Groups > comp.soft-sys.sas > #971

Re: interpolating missing values

From Ya <huang8012@gmail.com>
Newsgroups comp.soft-sys.sas
Subject Re: interpolating missing values
Date 2011-06-21 08:13 -0700
Organization http://groups.google.com
Message-ID <e9aa09d9-e88a-49d0-9c17-e71f0c721eab@j20g2000yqi.googlegroups.com> (permalink)
References <9262d880-4ace-4634-bfd4-8691d2abc49c@d26g2000prn.googlegroups.com>

Show all headers | View raw


On Jun 20, 1:25 pm, wyldsoul <wylds...@gmail.com> wrote:
> Hi all,
>  I want to interpolate missing values from gage data.  This is easy to
> do manually in excel, however, I have over 60 years of water level
> data from dozens of stage gages, and would like to find a way to batch
> interpolate in SAS.   In order to interpolate, for each set of missing
> values I need to  calculate the difference between the last
> observation before the first blank value and the first observation
> after the last blank value.  Then divide the result of this by the
> total number of missing values+1.  Simply put:
> (first obs± last obs)/(# missing obs+1)
> or using the sample data from below: (6.05-6.01)/4=0.01
>
> Then I need to populate each missing observation by cumulatively
> adding or subtracting the result of the previous calculation to each
> successive blank until you reach the next non-blank observation.
> The sample dataset is below.  Stage is the original value, int_factor
> is the result of the first calculation, and interp_stage is the
> interpolated stage based on cumulatively adding the int_factor to each
> observation .
>
> Station date    stage   int_factor      interp_stage
> NP-P33  1-Jan-53        6.04            6.04
> NP-P33  2-Jan-53        6.03            6.03
> NP-P33  3-Jan-53        6.02            6.02
> NP-P33  4-Jan-53        6.01            6.01
> NP-P33  5-Jan-53                0.0100  6.02
> NP-P33  6-Jan-53                        6.03
> NP-P33  7-Jan-53                        6.04
> NP-P33  8-Jan-53        6.05            6.05
> NP-P33  9-Jan-53        6.15            6.15
> NP-P33  10-Jan-53       6.20            6.2
> NP-P33  11-Jan-53       6.20            6.2
> NP-P33  12-Jan-53       6.19            6.19
> NP-P33  13-Jan-53       6.18            6.18
> NP-P33  14-Jan-53       6.17            6.17
> NP-P33  15-Jan-53       6.16            6.16
> NP-P33  16-Jan-53       6.15            6.15
> NP-P33  17-Jan-53       6.14            6.14
> NP-P33  18-Jan-53       6.13            6.13
> NP-P33  19-Jan-53       6.12            6.12
> NP-P33  20-Jan-53               0.0091  6.11
> NP-P33  21-Jan-53                       6.10
> NP-P33  22-Jan-53                       6.09
> NP-P33  23-Jan-53                       6.08
> NP-P33  24-Jan-53                       6.07
> NP-P33  25-Jan-53                       6.07
> NP-P33  26-Jan-53                       6.06
> NP-P33  27-Jan-53                       6.05
> NP-P33  28-Jan-53                       6.04
> NP-P33  29-Jan-53                       6.03
> NP-P33  30-Jan-53       6.02            6.02
> NP-P33  31-Jan-53       6.02            6.02

This should work:

data have;
input Station $ stage;
cards;
NP-P33  6.04
NP-P33  6.03
NP-P33  6.02
NP-P33  6.01
NP-P33  .
NP-P33  .
NP-P33  .
NP-P33  6.05
NP-P33  6.15
NP-P33  6.20
NP-P33  6.20
NP-P33  6.19
NP-P33  6.18
NP-P33  6.17
NP-P33  6.16
NP-P33  6.15
NP-P33  6.14
NP-P33  6.13
NP-P33  6.12
NP-P33  .
NP-P33  .
NP-P33  .
NP-P33  .
NP-P33  .
NP-P33  .
NP-P33  .
NP-P33  .
NP-P33  .
NP-P33  .
NP-P33  6.02
NP-P33  6.02
;

data have1;
 set have;
n_=_n_;
run;

data have1;
 set have1 (where=(^missing(stage)));
lagn=lag(n_);
lagstage=lag(stage);
if _n_=1 then lagn=0;
if n_ ^= lagn + 1 then do;
 do i=1 to (n_-lagn);
 stage1=lagstage + (stage - lagstage)/(n_-lagn)*i;
 output;
 end;
end;
else do;
 stage1=stage;
 output;
end;
drop i n_ lagn lagstage;
format stage1 6.3;
run;


proc print;
run;

 Obs    Station    stage    stage1

   1    NP-P33      6.04     6.040
   2    NP-P33      6.03     6.030
   3    NP-P33      6.02     6.020
   4    NP-P33      6.01     6.010
   5    NP-P33      6.05     6.020
   6    NP-P33      6.05     6.030
   7    NP-P33      6.05     6.040
   8    NP-P33      6.05     6.050
   9    NP-P33      6.15     6.150
  10    NP-P33      6.20     6.200
  11    NP-P33      6.20     6.200
  12    NP-P33      6.19     6.190
  13    NP-P33      6.18     6.180
  14    NP-P33      6.17     6.170
  15    NP-P33      6.16     6.160
  16    NP-P33      6.15     6.150
  17    NP-P33      6.14     6.140
  18    NP-P33      6.13     6.130
  19    NP-P33      6.12     6.120
  20    NP-P33      6.02     6.111
  21    NP-P33      6.02     6.102
  22    NP-P33      6.02     6.093
  23    NP-P33      6.02     6.084
  24    NP-P33      6.02     6.075
  25    NP-P33      6.02     6.065
  26    NP-P33      6.02     6.056
  27    NP-P33      6.02     6.047
  28    NP-P33      6.02     6.038
  29    NP-P33      6.02     6.029
  30    NP-P33      6.02     6.020
  31    NP-P33      6.02     6.020

stage1 is the interpolated based on your rule.
I dropped date, since it is irrelavent here.

HTH

Ya

Back to comp.soft-sys.sas | Previous | NextPrevious in thread | Next in thread | Find similar


Thread

interpolating missing values wyldsoul <wyldsoul@gmail.com> - 2011-06-20 13:25 -0700
  Re: interpolating missing values Ya <huang8012@gmail.com> - 2011-06-21 08:13 -0700
    Re: interpolating missing values wyldsoul <wyldsoul@gmail.com> - 2011-06-22 08:16 -0700
  Re: interpolating missing values "ajs2004@bigfoot.com" <ajs2004@bigfoot.com> - 2011-06-22 12:01 -0700

csiph-web