Estimating a covariance matrix from first difference MRP estimates

Merlin_Heidemanns · December 6, 2020, 4:32pm

The title is a bit confuse, but what this presumably boils down to is estimating a covariance matrix from data that are itself estimates (here to estimate how the units change together over time). Hence, the model first does its multilevel model, then post stratifies, and then uses those estimates at the units level after first-differencing to estimate a covariance matrix of these differences (to eventually plug them into a random walk model).

I built this based on the SUR description in the manual with phi_diff being my outcome. Yet something is wrong with that part and I can’t quite figure it out. Upon investigation the covariance matrix resulting from L_Sigma = diag_pre_multiply(L_sigma, L_Omega); has several elements in the last columns that are equal to 0 (e.g. -0.0, 0.0, etc) which I think is the issue but I don’t know where this arises from. Thoughts?

Stan code

data {
  int T;
  int<lower = 0> G;
  int<lower = 1, upper = G> g[T]; // end points
  int<lower = 1, upper = 4>    age[G, T];
  int<lower = 1, upper = 4>   race[G, T];
  int<lower = 1, upper = 51>  state[G, T];
  int<lower = 1, upper = 2> sex[G, T];
  int<lower = 0> sm_N[G, T];
  int<lower = 0> bi_N[G, T];
  int P[4, 4, 51, 2];
}
transformed data{
  vector[51] zeros = rep_vector(0.0, 51);
}
parameters {
  real alpha[T];
  real<lower = 0> sigma_beta[T];
  matrix<multiplier = sigma_beta[T]>[4, T] beta;
  real<lower = 0> sigma_gamma[T];
  matrix<multiplier = sigma_gamma[T]>[4, T] gamma;
  real<lower = 0> sigma_delta[T];
  matrix<multiplier = sigma_delta[T]>[51, T] delta;
  real epsilon[T];
  //corr_matrix[51] L_Omega;        // prior correlation
  //vector<lower=0>[51] tau;      // prior scale
  cholesky_factor_corr[51] L_Omega;
  vector<lower=0>[51] L_sigma;
  vector[51] mu;
}
transformed parameters {
  matrix[51, T] phi;
  vector[51] phi_diff[T - 1];
  matrix[51, T] expect_pos = rep_matrix(0, 51, T);
  matrix[51,T] total = rep_matrix(0, 51, T);
  matrix[51, 51] L_Sigma = diag_pre_multiply(L_sigma, L_Omega);
  //matrix[51, 51] Sigma_beta;
  for (t in 1:T){
    for (b in 1:4){
      for (c in 1:4){
        for (e in 1:2){
          total[,t] += to_vector(P[b, c, :, e]);
          expect_pos[,t]
            += to_vector(P[b, c, :, e])
              .* to_vector(inv_logit(alpha[t] + beta[b,t] +
                                     gamma[c,t] + delta[:,t] +
                                     {epsilon[t], -epsilon[t]}[e]));
        }
      }
    }
    phi[:,t] = expect_pos[:,t] ./ total[:,t];
  }
  for (t in 2:T) phi_diff[t - 1] = phi[:, t] - phi[:, t - 1];
  // covmat
}
model {
  // covmat
  target += multi_normal_cholesky_lpdf(phi_diff | zeros, L_Sigma);
  // mrp model
  for (t in 1:T){
    sm_N[1:g[t], t] ~
      binomial_logit(bi_N[1:g[t], t],
        alpha[t] +
        beta[age[1:g[t],t], t] +
        gamma[race[1:g[t],t], t] +
        delta[state[1:g[t],t], t] +
        to_vector({epsilon[t], -epsilon[t]}[sex[1:g[t],t]]));
    beta[:,t] ~ normal(0, sigma_beta[t]);
    gamma[:,t] ~ normal(0, sigma_gamma[t]);
    delta[:,t] ~ normal(0, sigma_delta[t]);
  }
  // mrp priors
  alpha ~ normal(0, 2);
  epsilon ~ normal(0, 2);
  sigma_beta  ~ normal(0, 1);
  sigma_gamma ~ normal(0, 1);
  sigma_delta ~ normal(0, 1);
  // covmat priors
  // tau ~ normal(0, 2.5);
  // L_Omega ~ lkj_corr_cholesky(2);
  L_Omega ~ lkj_corr_cholesky(2);
  L_sigma ~ normal(0, 2.5);
}

Max_Mantei · December 6, 2020, 5:45pm

Hey there! It looks like L_Sigma is the Cholesky factor of the covariance matrix. (This is also what the L_* naming convention in these kinds of models signify: it is the Lower triangular Cholesky factor. Sidenote: this is also why I would not name the vector of marginal standard deviations L_sigma… The lower doesn’t really have a meaning here, if anything the Cholesky can be thought of as a matrix square root, but the standard deviations \sigma are already square roots of the variances \sigma^2… Sorry for that little rant, haha 😅) You can get Sigma by \Sigma=LL^T, for which there is a specialized function in Stan: multiply_lower_tri_self_transpose (matrix). You can simply compute this in the generated quantities block.

I hope this is what you are looking for!
Cheers,
Max

Merlin_Heidemanns · December 6, 2020, 5:48pm

Hi Max,

thanks. Getting the covariance matrix wouldn’t really be the issue. The problem is more so that the model itself isn’t running due to an error Rejecting initial value: Log probability evaluates to log(0), i.e. negative infinity. which I feel I understand where it is coming from (the zeros in the cholesky factor) but I don’t know how to solve this as I am unaware how the zeroes get there in the first place. Does this make sense?

Max_Mantei · December 6, 2020, 6:01pm

Hey,

sorry I didn’t catch that that is the issue. Rejecting initial values can be due to a number of reasons, but I’m pretty sure that it’s not because of the zeroes in L_Sigma since the upper triangle is supposed to be zero and from what I see in your model you specified the data types correctly and also used the Cholesky version of the LKJ prior.
I’d suggest you try more informative priors, but really with a model like this it’s hard to tell looking just at the code. Did you try the whole build up the model step by step (in terms of complexity) and see when/where it stops working thing?

Cheers,
Max

Merlin_Heidemanns · December 6, 2020, 6:21pm

No worries. The MRP model is working fine. The output is also reasonable. It fell apart when I added the part that is supposed to estimate the covariance matrix. It’s also not due to the way phi_diff is created as it doesn’t work with a data input for phi_diff from a MVN either.

Merlin_Heidemanns · December 6, 2020, 6:40pm

So this works, but it is certainly inefficient. Hence, the problem is probably with the implementation of the cholesky factor.

data {
  int T;
  int<lower = 0> G;
  int<lower = 1, upper = G> g[T]; // end points
  int<lower = 1, upper = 4>    age[G, T];
  int<lower = 1, upper = 4>   race[G, T];
  int<lower = 1, upper = 51>  state[G, T];
  int<lower = 1, upper = 2> sex[G, T];
  int<lower = 0> sm_N[G, T];
  int<lower = 0> bi_N[G, T];
  int P[4, 4, 51, 2];
  //vector[51] phi_diff [T - 1];
}
transformed data{
  vector[51] zeros = rep_vector(0.0, 51);
}
parameters {
  real alpha[T];
  real<lower = 0> sigma_beta[T];
  matrix<multiplier = sigma_beta[T]>[4, T] beta;
  real<lower = 0> sigma_gamma[T];
  matrix<multiplier = sigma_gamma[T]>[4, T] gamma;
  real<lower = 0> sigma_delta[T];
  matrix<multiplier = sigma_delta[T]>[51, T] delta;
  real epsilon[T];
  //corr_matrix[51] L_Omega;        // prior correlation
  //vector<lower=0>[51] tau;      // prior scale
  //cholesky_factor_corr[51] L_Omega;
  //vector<lower=0>[51] L_sigma;
  cov_matrix[51] Sigma;
}
transformed parameters {
  matrix[51, T] phi;
  vector[51] phi_diff[T - 1];
  matrix[51, T] expect_pos = rep_matrix(0, 51, T);
  matrix[51,T] total = rep_matrix(0, 51, T);
  //matrix[51, 51] Sigma_beta;
  for (t in 1:T){
    for (b in 1:4){
      for (c in 1:4){
        for (e in 1:2){
          total[,t] += to_vector(P[b, c, :, e]);
          expect_pos[,t]
            += to_vector(P[b, c, :, e])
              .* to_vector(inv_logit(alpha[t] + beta[b,t] +
                                     gamma[c,t] + delta[:,t] +
                                     {epsilon[t], -epsilon[t]}[e]));
        }
      }
    }
    phi[:,t] = expect_pos[:,t] ./ total[:,t];
  }
  for (t in 2:T) phi_diff[t - 1] = phi[:, t] - phi[:, t - 1];
  // covmat
}
model {
  // mrp model
  for (t in 1:T){
    sm_N[1:g[t], t] ~
      binomial_logit(bi_N[1:g[t], t],
        alpha[t] +
        beta[age[1:g[t],t], t] +
        gamma[race[1:g[t],t], t] +
        delta[state[1:g[t],t], t] +
        to_vector({epsilon[t], -epsilon[t]}[sex[1:g[t],t]]));
    beta[:,t] ~ normal(0, sigma_beta[t]);
    gamma[:,t] ~ normal(0, sigma_gamma[t]);
    delta[:,t] ~ normal(0, sigma_delta[t]);
  }
  // mrp priors
  alpha ~ normal(0, 2);
  epsilon ~ normal(0, 2);
  sigma_beta  ~ normal(0, 1);
  sigma_gamma ~ normal(0, 1);
  sigma_delta ~ normal(0, 1);
  // covmat priors
  // tau ~ normal(0, 2.5);
  // L_Omega ~ lkj_corr_cholesky(2);
  //L_Omega ~ lkj_corr_cholesky(2);
  //L_sigma ~ normal(0, 0.5);
  target += multi_normal_lpdf(phi_diff | zeros, Sigma);
}

Merlin_Heidemanns · December 6, 2020, 6:45pm

On the other hand, this has initialization problems still. Specifically, LDLT_Factor of covariance parameter is not positive definite. last conditional variance is -8.88178e-16.

Topic		Replies	Views
Issues with covariance estimation for matrix derived product Modeling techniques , fitting-issues	4	59	August 14, 2024
Including data in covariance matrix Modeling specification	1	226	May 28, 2024
Variance Component Estimation Modeling fitting-issues	0	376	July 30, 2019
Estimating covariance of random effects Modeling	5	1188	October 12, 2017
Multivariate normal data covariance estimation Modeling performance , covariance , multivariate-normal	3	1449	May 10, 2020

Estimating a covariance matrix from first difference MRP estimates

Related topics