Including data in covariance matrix

levis · May 28, 2024, 11:00am

one of the objectives of my STAN model is to see how age covaries with the parameters of the model. However, in simulations, the model recovers a covariance matrix that is much lower than the actual covariance between parameters. Does anyone have any insight regarding why that could be with an approach like this?

TIA.

data {
  int<lower=1> nS; // number of subjects
  vector[nS] age_zscored; // Z-scored age
}

parameters {
  vector[2] theta_mu; 
  matrix[2, nS] theta_pr_minus2;
  vector<lower=0>[2] sigma; // parameter variance  
  cholesky_factor_corr[3] L; // include age 
}

transformed parameters {
  matrix[2, nS] theta_pr;
  vector<lower=0>[3] sigma_pr;
  matrix[nS, 3] theta;
  //parameters that may covary with age 
  vector<lower=0,upper=1>[nS] alpha;
  vector[nS] beta_1;

  // Assign fixed standard deviation for age (first element)
  sigma_pr[1] = 1;
  sigma_pr[2:3] = sigma;

  // Assign age as the first row of theta_pr
  theta_pr[1, ] = to_row_vector(age_zscored);
  theta_pr[2:3, ] = theta_pr_minus2;

  // Transform theta_pr to get theta
  theta = transpose(diag_pre_multiply(sigma_pr, L) * theta_pr);
  beta_1 = (theta_mu[1] + theta[,2]) * 5;
  alpha = Phi_approx(theta_mu[2] + theta[,3]);
  
}


model {
  // Priors
  target += normal_lpdf(theta_mu | 0, 2);
  target += std_normal_lpdf(to_vector(theta_pr_minus2));
  target += student_t_lpdf(sigma | 3, 0, 1);
  target += lkj_corr_cholesky_lpdf(L | 1);
  
  // model specification goes here 
}

generated quantities {
  matrix[3,3] Omega = multiply_lower_tri_self_transpose(L);
}

Bob_Carpenter · May 28, 2024, 7:12pm

Do you mean a covariance matrix with smaller absolute correlations than you used to simulate? Are you sure you’re simulating from the model’s priors and data-generating process? The reason I ask is that if you simulate data from a Bayesian model, then fit it using the same model, it should be well calibrated if the sampler is functioning correctly. So my question is really whether you’ve tried something like simulation-based calibration and seen it fail?

P.S. This statement is redundant because the 1 parameter makes the distribution over L uniform and thus a no-op.

  target += lkj_corr_cholesky_lpdf(L | 1);

P.P.S. “Stan” is not an acronym—it’s named after Stanislaw Ulam.

Topic		Replies	Views
Covariance matrix priors Modeling rstan , techniques	5	405	December 4, 2023
Non-static covariance matrices? Modeling specification	8	1633	January 22, 2018
Issues with covariance estimation for matrix derived product Modeling techniques , fitting-issues	4	62	August 14, 2024
Issue with covariance matrix in Stan Modeling fitting-issues , specification	13	699	June 27, 2023
Specifying Correlation in Stan Modeling	10	1064	October 9, 2021

Including data in covariance matrix

Related topics