I’m trying to better understand the shape of the prior distribution in my model with a multivariate normal distribution. In this model I am attempting to calculate a covariance matrix.
To make setting priors more straightforward I standardize each variable for which I’m calculating the covariance.
My current problem: I can’t find any set of priors that will yield a stable distribution when sampling from the prior.
I’m trying a test model based closely on that described in the Stan Manual 2.17, p155. I currently have:
data{
int VARCOUNT;
int LENGTH;
}
transformed data{
vector[VARCOUNT] zeros = rep_vector(0,VARCOUNT);
}
parameters {
cholesky_factor_corr[VARCOUNT] L_Omega;
vector<lower=0>[VARCOUNT] L_sigma;
//would normally be data,
//but since I'm sampling from the prior I've made this a parameter instead.
vector[VARCOUNT] covar[LENGTH];
}
transformed parameters {
matrix[VARCOUNT, VARCOUNT] L_Sigma = diag_pre_multiply(L_sigma, L_Omega);
matrix[VARCOUNT, VARCOUNT] Sigma = L_Sigma * L_Sigma';
}
model{
L_Omega ~ lkj_corr_cholesky(1); //parameter was 4 in manual but using 1 because our variables are individually normally distributed
L_sigma ~ cauchy(0,1); //gamma was 2 in manual but using 1 because our variables are normally distributed
covar ~ multi_normal_cholesky(zeros,L_Sigma);
}
When estimating this model with VARCOUNT=7, LENGTH=178, 500 warmup iterations and 100 further samples per chain, I get stable estimates for everything…except L_sigma ~ cauchy(0.1)
.
The Rhat values are high because the efficiency for L_sigma is extremely low, sometimes equal to the number of chains, indicating that the model wasn’t able to sample values for L_sigma
at all.
I’ve repeated this using priors for the cauchy and lkj_corr_cholesky samplers that are identical to those in the manual p155, but it doesn’t make any difference. I have also tried replacing the cauchy distribution with a truncated normal distribution, or an exponential transform normal distribution, as in:
data{
int<lower=1> LENGTH;
int<lower=2> VARCOUNT;
}
transformed data{
vector[VARCOUNT] zeros = rep_vector(0,VARCOUNT);
}
parameters {
cholesky_factor_corr[VARCOUNT] L_Omega;
vector[VARCOUNT] L_sigma_norm;
vector[VARCOUNT] covar[LENGTH];
}
transformed parameters {
vector<lower=0>[VARCOUNT] L_sigma = exp(L_sigma_norm);
matrix[VARCOUNT, VARCOUNT] L_Sigma = diag_pre_multiply(L_sigma, L_Omega);
matrix[VARCOUNT, VARCOUNT] Sigma = L_Sigma * L_Sigma';
}
model {
L_Omega ~ lkj_corr_cholesky(1);
L_sigma_norm ~ normal(0,1);
covar ~ multi_normal_cholesky(zeros,L_Sigma);
}
…to no avail - in each case all parameters except L_sigma
are stably estimated, and there are hardly any good samples for L_sigma
at all, let alone a stable estimate.
I understand perhaps this model could be insufficiently constrained and thus not be able to get a stable estimate for all parameters.
But surely there is a way to get a stable estimate of the prior of this covariance distribution?