Hi everyone,
I am encountering several warnings (divergences, low BFMI, and low ESS) when running a Bessel regression model that includes spatial random effects (delta and tau).
Crucially, the model works perfectly well and recovers the parameters correctly if I remove the spatial random effects. The issues only arise when I introduce the spatial component.
Here is the Stan model I am using:
functions{
real zeta(real z, real mu) {
return sqrt(1 + (((z-mu)^2) / (z*(1-z))));
}
real bessel_lpdf(real z, real mu, real phi){
return log(mu)+log(1-mu)+log(phi)+phi-log(pi())-(3.0/2.0)*(log(z)+log(1-z))+
log(modified_bessel_second_kind(1, (phi*zeta(z,mu)))) - log(zeta(z,mu));
}
}
data {
int<lower=0> N;
int<lower=0> M1;
int<lower=0> M2;
vector[N] y;
matrix[N,M1] X;
matrix[N,M2] V;
matrix[N,N] Minv;
real a;
real b;
real<lower = 0> sigma1;
real<lower = 0> sigma2;
matrix[3,3] MKap;
vector[M1] VKap;
}
transformed data {
matrix[N,N] L_Minv = cholesky_decompose(Minv);
}
parameters {
vector[M1] kap;
vector[M2] lam;
real<lower = 0> tau;
vector[N] delta_raw;
}
transformed parameters{
vector<lower = 0, upper = 1>[N] mu;
vector<lower = 0>[N] phi;
vector[N] delta = (sqrt(tau) * L_Minv) * delta_raw;
for (i in 1:N){
mu[i] = exp((X[i]*kap)+delta[i])/(1+exp((X[i]*kap)+delta[i]));
phi[i] = exp(V[i]*lam);
}
}
model {
tau ~ gamma(a,b);
delta_raw ~ std_normal();
kap ~ multi_normal(VKap, (MKap));
lam ~ multi_normal(rep_vector(0, M2), (sigma2*identity_matrix(M2)));
for (i in 1:N){
y[i] ~ bessel(mu[i],phi[i]);
}
}
I generated synthetic data Y in R using the Bessel distribution defined above (where 0 < Y < 1). Each observation Y_i has its own \mu_i and \phi_i.
For the covariates, X[,1] is a vector with only 1, X[,2] is generated from a Bernoulli(0.5) distribution, and X[,3] from a Uniform(-1, 1). The matrix V follows the same structure but with different values.
Priors:
-
Kappa and Lambda:
MultiNormal(rep(0,3), 10*I[3x3]) -
\mu_i = \kappa_0 * X[i,1] + \kappa_1 * X[i,2] + \kappa_2 * X[i,3] + \delta[i]
*Same with phi.
Spatial Component:
I am trying to model delta as a CAR process: CAR(0, \tau \times \Sigma), where \Sigma is the covariance matrix derived from the neighbors. I am using a non-centered parameterization for delta because the performance was significantly worse without it.
Warnings in R:
Warning messages:
1: There were 20 divergent transitions after warmup. See
https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
2: There were 2 chains where the estimated Bayesian Fraction of Missing Information was low. See
https://mc-stan.org/misc/warnings.html#bfmi-low
3: Examine the pairs() plot to diagnose sampling problems
4: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#bulk-ess
5: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
https://mc-stan.org/misc/warnings.html#tail-ess
I have attached the pairs plot below. Any advice on how to address these divergences and the low BFMI would be greatly appreciated.
