Estimating a covariance matrix from first difference MRP estimates

The title is a bit confuse, but what this presumably boils down to is estimating a covariance matrix from data that are itself estimates (here to estimate how the units change together over time). Hence, the model first does its multilevel model, then post stratifies, and then uses those estimates at the units level after first-differencing to estimate a covariance matrix of these differences (to eventually plug them into a random walk model).

I built this based on the SUR description in the manual with phi_diff being my outcome. Yet something is wrong with that part and I can’t quite figure it out. Upon investigation the covariance matrix resulting from L_Sigma = diag_pre_multiply(L_sigma, L_Omega); has several elements in the last columns that are equal to 0 (e.g. -0.0, 0.0, etc) which I think is the issue but I don’t know where this arises from. Thoughts?

Stan code

data {
  int T;
  int<lower = 0> G;
  int<lower = 1, upper = G> g[T]; // end points
  int<lower = 1, upper = 4>    age[G, T];
  int<lower = 1, upper = 4>   race[G, T];
  int<lower = 1, upper = 51>  state[G, T];
  int<lower = 1, upper = 2> sex[G, T];
  int<lower = 0> sm_N[G, T];
  int<lower = 0> bi_N[G, T];
  int P[4, 4, 51, 2];
}
transformed data{
  vector[51] zeros = rep_vector(0.0, 51);
}
parameters {
  real alpha[T];
  real<lower = 0> sigma_beta[T];
  matrix<multiplier = sigma_beta[T]>[4, T] beta;
  real<lower = 0> sigma_gamma[T];
  matrix<multiplier = sigma_gamma[T]>[4, T] gamma;
  real<lower = 0> sigma_delta[T];
  matrix<multiplier = sigma_delta[T]>[51, T] delta;
  real epsilon[T];
  //corr_matrix[51] L_Omega;        // prior correlation
  //vector<lower=0>[51] tau;      // prior scale
  cholesky_factor_corr[51] L_Omega;
  vector<lower=0>[51] L_sigma;
  vector[51] mu;
}
transformed parameters {
  matrix[51, T] phi;
  vector[51] phi_diff[T - 1];
  matrix[51, T] expect_pos = rep_matrix(0, 51, T);
  matrix[51,T] total = rep_matrix(0, 51, T);
  matrix[51, 51] L_Sigma = diag_pre_multiply(L_sigma, L_Omega);
  //matrix[51, 51] Sigma_beta;
  for (t in 1:T){
    for (b in 1:4){
      for (c in 1:4){
        for (e in 1:2){
          total[,t] += to_vector(P[b, c, :, e]);
          expect_pos[,t]
            += to_vector(P[b, c, :, e])
              .* to_vector(inv_logit(alpha[t] + beta[b,t] +
                                     gamma[c,t] + delta[:,t] +
                                     {epsilon[t], -epsilon[t]}[e]));
        }
      }
    }
    phi[:,t] = expect_pos[:,t] ./ total[:,t];
  }
  for (t in 2:T) phi_diff[t - 1] = phi[:, t] - phi[:, t - 1];
  // covmat
}
model {
  // covmat
  target += multi_normal_cholesky_lpdf(phi_diff | zeros, L_Sigma);
  // mrp model
  for (t in 1:T){
    sm_N[1:g[t], t] ~
      binomial_logit(bi_N[1:g[t], t],
        alpha[t] +
        beta[age[1:g[t],t], t] +
        gamma[race[1:g[t],t], t] +
        delta[state[1:g[t],t], t] +
        to_vector({epsilon[t], -epsilon[t]}[sex[1:g[t],t]]));
    beta[:,t] ~ normal(0, sigma_beta[t]);
    gamma[:,t] ~ normal(0, sigma_gamma[t]);
    delta[:,t] ~ normal(0, sigma_delta[t]);
  }
  // mrp priors
  alpha ~ normal(0, 2);
  epsilon ~ normal(0, 2);
  sigma_beta  ~ normal(0, 1);
  sigma_gamma ~ normal(0, 1);
  sigma_delta ~ normal(0, 1);
  // covmat priors
  // tau ~ normal(0, 2.5);
  // L_Omega ~ lkj_corr_cholesky(2);
  L_Omega ~ lkj_corr_cholesky(2);
  L_sigma ~ normal(0, 2.5);
}

Hey there! It looks like L_Sigma is the Cholesky factor of the covariance matrix. (This is also what the L_* naming convention in these kinds of models signify: it is the Lower triangular Cholesky factor. Sidenote: this is also why I would not name the vector of marginal standard deviations L_sigma… The lower doesn’t really have a meaning here, if anything the Cholesky can be thought of as a matrix square root, but the standard deviations \sigma are already square roots of the variances \sigma^2… Sorry for that little rant, haha 😅) You can get Sigma by \Sigma=LL^T, for which there is a specialized function in Stan: multiply_lower_tri_self_transpose (matrix). You can simply compute this in the generated quantities block.

I hope this is what you are looking for!
Cheers,
Max

Hi Max,

thanks. Getting the covariance matrix wouldn’t really be the issue. The problem is more so that the model itself isn’t running due to an error Rejecting initial value: Log probability evaluates to log(0), i.e. negative infinity. which I feel I understand where it is coming from (the zeros in the cholesky factor) but I don’t know how to solve this as I am unaware how the zeroes get there in the first place. Does this make sense?

Hey,

sorry I didn’t catch that that is the issue. Rejecting initial values can be due to a number of reasons, but I’m pretty sure that it’s not because of the zeroes in L_Sigma since the upper triangle is supposed to be zero and from what I see in your model you specified the data types correctly and also used the Cholesky version of the LKJ prior.
I’d suggest you try more informative priors, but really with a model like this it’s hard to tell looking just at the code. Did you try the whole build up the model step by step (in terms of complexity) and see when/where it stops working thing?

Cheers,
Max

No worries. The MRP model is working fine. The output is also reasonable. It fell apart when I added the part that is supposed to estimate the covariance matrix. It’s also not due to the way phi_diff is created as it doesn’t work with a data input for phi_diff from a MVN either.

So this works, but it is certainly inefficient. Hence, the problem is probably with the implementation of the cholesky factor.

data {
  int T;
  int<lower = 0> G;
  int<lower = 1, upper = G> g[T]; // end points
  int<lower = 1, upper = 4>    age[G, T];
  int<lower = 1, upper = 4>   race[G, T];
  int<lower = 1, upper = 51>  state[G, T];
  int<lower = 1, upper = 2> sex[G, T];
  int<lower = 0> sm_N[G, T];
  int<lower = 0> bi_N[G, T];
  int P[4, 4, 51, 2];
  //vector[51] phi_diff [T - 1];
}
transformed data{
  vector[51] zeros = rep_vector(0.0, 51);
}
parameters {
  real alpha[T];
  real<lower = 0> sigma_beta[T];
  matrix<multiplier = sigma_beta[T]>[4, T] beta;
  real<lower = 0> sigma_gamma[T];
  matrix<multiplier = sigma_gamma[T]>[4, T] gamma;
  real<lower = 0> sigma_delta[T];
  matrix<multiplier = sigma_delta[T]>[51, T] delta;
  real epsilon[T];
  //corr_matrix[51] L_Omega;        // prior correlation
  //vector<lower=0>[51] tau;      // prior scale
  //cholesky_factor_corr[51] L_Omega;
  //vector<lower=0>[51] L_sigma;
  cov_matrix[51] Sigma;
}
transformed parameters {
  matrix[51, T] phi;
  vector[51] phi_diff[T - 1];
  matrix[51, T] expect_pos = rep_matrix(0, 51, T);
  matrix[51,T] total = rep_matrix(0, 51, T);
  //matrix[51, 51] Sigma_beta;
  for (t in 1:T){
    for (b in 1:4){
      for (c in 1:4){
        for (e in 1:2){
          total[,t] += to_vector(P[b, c, :, e]);
          expect_pos[,t]
            += to_vector(P[b, c, :, e])
              .* to_vector(inv_logit(alpha[t] + beta[b,t] +
                                     gamma[c,t] + delta[:,t] +
                                     {epsilon[t], -epsilon[t]}[e]));
        }
      }
    }
    phi[:,t] = expect_pos[:,t] ./ total[:,t];
  }
  for (t in 2:T) phi_diff[t - 1] = phi[:, t] - phi[:, t - 1];
  // covmat
}
model {
  // mrp model
  for (t in 1:T){
    sm_N[1:g[t], t] ~
      binomial_logit(bi_N[1:g[t], t],
        alpha[t] +
        beta[age[1:g[t],t], t] +
        gamma[race[1:g[t],t], t] +
        delta[state[1:g[t],t], t] +
        to_vector({epsilon[t], -epsilon[t]}[sex[1:g[t],t]]));
    beta[:,t] ~ normal(0, sigma_beta[t]);
    gamma[:,t] ~ normal(0, sigma_gamma[t]);
    delta[:,t] ~ normal(0, sigma_delta[t]);
  }
  // mrp priors
  alpha ~ normal(0, 2);
  epsilon ~ normal(0, 2);
  sigma_beta  ~ normal(0, 1);
  sigma_gamma ~ normal(0, 1);
  sigma_delta ~ normal(0, 1);
  // covmat priors
  // tau ~ normal(0, 2.5);
  // L_Omega ~ lkj_corr_cholesky(2);
  //L_Omega ~ lkj_corr_cholesky(2);
  //L_sigma ~ normal(0, 0.5);
  target += multi_normal_lpdf(phi_diff | zeros, Sigma);
}

On the other hand, this has initialization problems still. Specifically, LDLT_Factor of covariance parameter is not positive definite. last conditional variance is -8.88178e-16.