Hi all,
I am trying to run a kind of measurement error model below using rstan
, but am still getting around 5-10% divergences. I have read elsewhere on this site that any divergences should trigger alarm bells.
The model setup is below.
Where \bar{\theta}_i, s(\theta_i), \bar{\psi}_i, s(\psi_i) are known quantities. The distributions for \theta_i and \psi_i are approximated by normal distributions from a previous model run. Unfortunately, I cannot get the full model to run, but this setup does work. In particular, convergence based on trace plots looks good and I get all Rhat < 1.02. However, as mentioned above I am still getting some divergent transitions.
data {
int<lower=0> m; // number of sampled areas
int<lower=0> M; // total number of areas
int<lower=0> q_a; // number of parameters in NORMAL MODEL
matrix[M, q_a] x_a; // covariates for NORMAL MODEL
vector[m] direct_mean;
vector<lower=0>[m] direct_sd;
vector<lower=0>[m] psi_mean;
vector<lower=0>[m] psi_sd;
}
transformed data{
int q_a_C = q_a - 1; // number of parameters minus 1
matrix[M, q_a_C] x_a_C; // centered version of X without an intercept
vector[q_a_C] bar_x_a; // columns means of x before centering
for(i in 2:q_a){
bar_x_a[i - 1] = mean(x_a[,i]);
x_a_C[, i - 1] = x_a[,i] - bar_x_a[i - 1];
}
}
parameters {
// NORMAL MODEL
vector[q_a_C] beta_a; // fixed parameters
real intercept_a; // temporary intercept
vector[M] Z_v; // standard normal for error
real<lower=0> sigma_v; // standard deviation of error
// FROM LOGISTIC
vector[m] theta_o;
vector<lower=0>[m] psi_o;
}
transformed parameters{
vector[M] nu = intercept_a + x_a_C * beta_a + Z_v * sigma_v;
}
model{
// PRIORS
target += normal_lpdf(beta_a | 0, 2);
target += student_t_lpdf(intercept_a | 3, 0, 0.5);
target += std_normal_lpdf(Z_v);
target += std_normal_lpdf(sigma_v);
// FROM LOGSITIC
target += normal_lpdf(theta_o | direct_mean, direct_sd);
target += normal_lpdf(psi_o | psi_mean, psi_sd);
// MODEL
for(i in 1:m){
target += normal_lpdf( theta_o[i] | nu[i], sqrt(psi_o[i]) );
}
}
generated quantities{
real r_intercept_a = intercept_a - dot_product(bar_x_a, beta_a);
vector[m] comp_est_weight = sigma_v^2 ./ ( psi_o + sigma_v^2 ); // composite estimator weight
vector[M] mu = inv_logit(nu);
vector[m] il_theta_o = inv_logit(theta_o);
}
Here are the trace plots for the fixed parameters.
And here is a pairs plot showing the divergence transitions. The parameter posteriors for obs 47 gives Rhat \approx 1.02 and insufficient bulk ESS.
I cannot see any discernable patterns with the area of the parameter space where the divergences are appearing, except of course the high correlation between nu
and Z_v
.
In this example I only have about 100 observations so I am sure that the problems have something to do with this. You’ll see that I have used the non-mean centred parameterization where I can.
I would be confident that the problem with divergences could be solved if I could reparameterize the model somehow. I wonder if anyone would have any further tricks or ideas. Thank you in advance!
Jamie