Multivariate time series with missing data

I’m trying to write a multivariate-outcome, AR(1) time-series model with missing data. An additional challenge is that in my data only one component of the outcome is observable at a time. The User’s Guide section “Missing Data & Partially Known Parameters” offers some useful ideas on how to do this, but I don’t know how to deal with the only-one-component-at-a-time issue. The best I could come up with is the following (bivariate only), expecting it not to work:


data {
  int<lower=0> N;                  // number of observations
  real y_obs[N];                   // observed data, missing one component
  real<lower=1, upper=2> obs[N];   // indicates which component is observed
}
parameters {
  real alpha;
  real beta;
  cov_matrix[2] SIG;
}
transformed parameters {
  real y_mis[N];
  vector[2] y[N];
  for (i in 1:N) {
    if (obs[i] == 1)
      y[i] = [y_obs[i], y_mis[i]]';
    else
      y[i] = [y_mis[i], y_obs[i]]';
  }
}
model {
  y[1] ~ normal(0, 100);
  for (n in 2:N)
    y[n] ~ multi_normal(rep_vector(alpha, 2)
                        + rep_vector(beta, 2) .* y[n-1], SIG);
}

And indeed it doesn’t work; I get “Exception: normal_lpdf: Random variable[2] is nan, but must not be nan!”, presumably because y_mis is declared in transformed parameters but never assigned, so Stan reaches the first line of the model block and finds that one component of y[1] is nan.

The manual section “Missing Multivariate Data” has the following code snippet:

for (n in 1:N) {
  if (y_observed[n, 1] && y_observed[n, 2])
    y[n] ~ multi_normal(mu, Sigma);
  else if (y_observed[n, 1])
    y[n, 1] ~ normal(mu[1], sqrt(Sigma[1, 1]));
  else if (y_observed[n, 2])
    y[n, 2] ~ normal(mu[2], sqrt(Sigma[2, 2]));
}

I’ve tried to think of a way to adapt that to my model, but since in my data y[n,1] and y[n,2] are never observed simultaneously, I’m guessing the multi_normal branch would never be reached; the two components would in effect be estimated independently (i.e. with no correlation).

Perhaps this means it can’t be done, but intuitively I’m thinking that the time-series structure makes it possible, since the data are correlated across time.
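To make that intuition concrete, here is a small NumPy sketch (with made-up parameter values, not fitted ones) of the assumed data-generating process: a bivariate AR(1) whose innovations are correlated across components, with only one component recorded at each time point. It is that cross-component correlation, propagated through time, that the model would need to recover:

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up "true" parameters for a bivariate AR(1):
alpha, beta = 0.5, 0.8
SD = np.array([1.0, 2.0])
R = np.array([[1.0, 0.7],
              [0.7, 1.0]])           # cross-component correlation
SIG = np.diag(SD) @ R @ np.diag(SD)  # innovation covariance

N = 200
y = np.zeros((N, 2))
for n in range(1, N):
    y[n] = rng.multivariate_normal(alpha + beta * y[n - 1], SIG)

# Only one component is observed at each time point (alternating here):
obs = np.tile([1, 2], N // 2)     # which component is observed (1-based)
y_obs = y[np.arange(N), obs - 1]  # the single observed value per step
```

Because the innovations are correlated, each observed component carries information about the unobserved one at the same (and neighboring) time points.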

Anyway, here’s some sample data:
temp.data.R (3.4 KB)

Any hints, comments, insights, etc. are appreciated.

Thanks

1 Like

Almost there, you just need to move real y_mis[N]; into the parameters block (not transformed).

I recommend adding a prior for SIG, maybe diagonal(SIG) ~ normal(0, 100);.
Also, you don’t need those rep_vector calls; alpha + beta * y[n-1] works too.

2 Likes

In addition to Niko’s suggestion, especially if you’re planning on expanding the model to more covariates, you probably want to re-parameterize so that you have a prior on the SD of each covariate that is independent of the prior on the correlation matrix. I find that parameterization easier to think about and to set sensible priors on (e.g. half-normal or Weibull for the SDs, LKJ for the correlations) than a full covariance matrix.

Also, unless your data really does have an SD on the order of 100, I suspect that normal(0,100) is far too diffuse. The general recommendation for a weakly informative prior is to use an SD within an order of magnitude of the observed data; otherwise you’re putting a lot of credibility in regions of the parameter space that are very extreme. A prior predictive check is certainly recommended to make sure that your prior generates reasonable data.
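A minimal prior predictive check can be sketched outside Stan as well. The following Python sketch assumes the bivariate case, with half-normal draws standing in for the SD priors and a uniform correlation standing in for a 2-dimensional LKJ draw (these are illustrative stand-ins, not the exact priors):

```python
import numpy as np

rng = np.random.default_rng(0)

def prior_predictive_draw(N=100):
    """Simulate one bivariate AR(1) series from parameters
    drawn from (illustrative) weakly informative priors."""
    alpha = rng.normal(0, 2)
    beta = rng.normal(1, 1)
    SD = np.abs(rng.normal(0, 20, size=2))  # half-normal SDs
    rho = rng.uniform(-1, 1)                # stand-in for a 2-D LKJ draw
    R = np.array([[1.0, rho], [rho, 1.0]])
    SIG = np.diag(SD) @ R @ np.diag(SD)
    y = np.zeros((N, 2))
    y[0] = rng.normal(0, 20, size=2)
    for n in range(1, N):
        y[n] = rng.multivariate_normal(alpha + beta * y[n - 1], SIG)
    return y

# Compare the scale of the simulated series against the real data:
sims = np.stack([prior_predictive_draw() for _ in range(20)])
```

If most simulated series are wildly larger in magnitude than the real data (e.g. because beta draws above 1 produce explosive series), that is a sign the priors are too loose.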

1 Like

Yup, that did the trick. And indeed, the rep_vector stuff appears to be unnecessary.

Thanks

Now that I have the basic model working, I’ll work on specifying more sensible priors per your suggestions.

Thank you

Please have a look at my updated model, which incorporates your suggestions. I also expanded it to accommodate any number of covariates. It appears to run properly, but a bit slowly; perhaps there are opportunities for improved efficiency.

Thanks

data {
  int<lower=0> N;          // number of observations
  int<lower=0> K;          // number of covariates
  real y_obs[N];           // one component of observed data
  int<lower=1> obs[N];     // indicates which component is observed
}
parameters {
  real alpha;              // AR(1) model alpha
  real beta;               // AR(1) model beta
  vector[K-1] y_mis[N];    // missing data
  vector<lower=0>[K] SD;   // standard deviations
  corr_matrix[K] R;        // correlation matrix
}
transformed parameters {
  vector[K] y[N];
  cov_matrix[K] SIG;
  SIG = diag_matrix(SD) * R * diag_matrix(SD);

  // Construct array combining data and missing data:
  for (i in 1:N) {
    int k = 1;
    for (j in 1:K) {
      if (obs[i] == j)
        y[i, j] = y_obs[i];
      else {
        y[i, j] = y_mis[i, k];
        k += 1;
      }
    }
  }
}
model {
  alpha ~ normal(0, 2);
  beta ~ normal(1, 1);
  y[1] ~ normal(0, 20);
  SD ~ normal(0, 20);
  R ~ lkj_corr(1.5);
  for (n in 2:N)
    y[n] ~ multi_normal(alpha + beta * y[n-1], SIG);
}

I think this could be SIG = quad_form_diag(R, SD);
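For reference, Stan’s quad_form_diag(R, SD) computes diag_matrix(SD) * R * diag_matrix(SD), so the two forms are interchangeable. A quick NumPy check of that identity, with made-up values:

```python
import numpy as np

SD = np.array([1.5, 0.5, 2.0])
R = np.array([[ 1.0, 0.3, -0.2],
              [ 0.3, 1.0,  0.5],
              [-0.2, 0.5,  1.0]])

# quad_form_diag(R, SD) in Stan is diag_matrix(SD) * R * diag_matrix(SD):
SIG = np.diag(SD) @ R @ np.diag(SD)

assert np.allclose(SIG, SIG.T)                 # still symmetric
assert np.allclose(np.sqrt(np.diag(SIG)), SD)  # diagonal recovers the SDs
```

Beyond being a one-liner, quad_form_diag avoids building two dense diagonal matrices, which saves a little work in the autodiff graph.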

1 Like

OK, thanks.

You can also use the Cholesky form of the MVN to gain a bit of speed:

data {
  int<lower=0> N;          // number of observations
  int<lower=0> K;          // number of covariates
  real y_obs[N];           // one component of observed data
  int<lower=1> obs[N];     // indicates which component is observed
}
parameters {
  real alpha;              // AR(1) model alpha
  real beta;               // AR(1) model beta
  vector[K-1] y_mis[N];    // missing data
  vector<lower=0>[K] SD;   // standard deviations
  cholesky_factor_corr[K] L_R;  // Cholesky factor of correlation matrix
}
transformed parameters {
  vector[K] y[N];
  cholesky_factor_cov[K] L_SIG = diag_pre_multiply(SD, L_R);  // Cholesky factor of covariance matrix

  // Construct array combining data and missing data:
  for (i in 1:N) {
    int k = 1;
    for (j in 1:K) {
      if (obs[i] == j)
        y[i, j] = y_obs[i];
      else {
        y[i, j] = y_mis[i, k];
        k += 1;
      }
    }
  }
}
model {
  alpha ~ normal(0, 2);
  beta ~ normal(1, 1);
  y[1] ~ normal(0, 20);
  SD ~ normal(0, 20);
  L_R ~ lkj_corr_cholesky(1.5);  // prior on Cholesky factor of correlation matrix
  for (n in 2:N)
    y[n] ~ multi_normal_cholesky(alpha + beta * y[n-1], L_SIG);  // Cholesky form of the MVN
}

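As a quick sanity check on this reparameterization: diag_pre_multiply(SD, L_R) yields a lower-triangular factor L_SIG with L_SIG * L_SIG' equal to the covariance matrix from the previous version of the model. In NumPy, with made-up values:

```python
import numpy as np

SD = np.array([1.0, 2.0])
R = np.array([[1.0, 0.6],
              [0.6, 1.0]])

L_R = np.linalg.cholesky(R)  # Cholesky factor of the correlation matrix
L_SIG = np.diag(SD) @ L_R    # diag_pre_multiply(SD, L_R) in Stan

SIG = np.diag(SD) @ R @ np.diag(SD)
assert np.allclose(L_SIG @ L_SIG.T, SIG)  # same covariance, factored form
```

The speedup comes from multi_normal_cholesky skipping the matrix factorization that multi_normal would otherwise redo at every leapfrog step.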
5 Likes

Yes, that appears to speed things up a bit.

Thank you

1 Like