Modeling AR 1 Term with Missing Observations


I am trying to fit a simple AR 1 model, as shown below. I read over the missing values section in the Stan manual, but it is not clear to me how to fit this model if I have missing observations in my data. Would it be as straight forward as setting \tau = 0 if there is a missing observation at time i-1? Or draw another value of \tau_0 at the time point before each missing observation? In some sense, treating it again as requiring an initial value.

Y_{i} = f(\theta)_{i} + \tau_{i}
\tau_i = \rho \tau_{i-1} + \epsilon_i
\epsilon_i \sim N(0, \sigma^2)
\tau_0 \sim N(0, a)

Thank you,

The easiest way to do do it is to declare in the parameters block a vector, say y_miss whose size is equal to the number of missing observations. Then, weave together a “complete data” vector in temporal order that consists of the observed observations and y_miss. The “complete data” vector then has an AR1 “complete data likelihood”.


Hi Ben,

Thanks for you reply. As long as my y’s are in the right temporal order, it looks like I can add indexing to identify the responses that are missing, and then combine them into one big vector in the transformed parameters block.

int<lower = 1, upper = N_obs + N_mis> ii_obs[N_obs]; 
int<lower = 1, upper = N_obs + N_mis> ii_mis[N_mis];

transformed parameters{
real y[N]; 
y[ii_obs] = y_obs; 
y[ii_mis] = y_mis; 

One more quick follow up, I am looping over the all observations rather than doing nested loops for each time series. I have a time vector that that repeats the time indices–e.g., 1 2 3 1 2 3…1 2 3 (if I had 3 time points for each series). Without the AR term it looks basically like the following (assume x[i] indicates the time point).

for(i in 1:N){
y[i] ~ normal(f[x[i]], sigma)

Is there simple way to let the model know that at each time point i = 1, we need to reset the AR process for the next series? I was thinking something like if x[i]==1 let \eta \sim N(0, a); otherwise, \eta = \rho\eta_{i-1}. Does this make sense?

Thank you,

You can put in if {} else {} statements but for observations where i = 1, the unconditional variance is not sigma but rather sigma / sqrt(1 - square(rho)).