Dear Stan forum members,
I’m trying to set up a simple intercept-only model with a lognormal likelihood and an adaptive prior for the varying a_an intercepts. The latter is done by adding a prior for sigma_an:
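
    data {
      int<lower=1> N;          // number of observations
      int<lower=1> N_an;       // number of levels for the a_an intercepts
      int<lower=1> N_st;       // number of levels for the a_st intercepts
      int<lower=1> N_gr;       // number of levels for the a_gr intercepts
      int idx_an[N];           // a_an level of each observation
      int idx_st[N];           // a_st level of each observation
      int idx_gr[N];           // a_gr level of each observation
      real<lower=0> Area_s[N]; // outcome
    }
    parameters {
      real a;                  // global intercept
      vector[N_an] a_an;       // varying intercepts with adaptive prior
      vector[N_st] a_st;
      vector[N_gr] a_gr;
      real<lower=0> sigma_an;  // sd of the a_an intercepts
      real<lower=0> sigma;     // sd of the likelihood on the log scale
    }
    model {
      vector[N] mu;
      sigma ~ cauchy(0, 1);
      // half-Cauchy, half-normal, exponential priors give bfmi-low warning
      sigma_an ~ lognormal(0, 1);
      a_an ~ normal(0, sigma_an);
      a_st ~ normal(0, 1);
      a_gr ~ normal(0, 1);
      a ~ normal(1, 1);
      for (i in 1:N) {
        mu[i] = a + a_an[idx_an[i]] + a_st[idx_st[i]] + a_gr[idx_gr[i]];
      }
      Area_s ~ lognormal(mu, sigma);
    }
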
If, for sigma_an, I use one of the recommended priors (half-Cauchy, half-normal, half-t or exponential, with a variety of scale parameters), I always get an “estimated Bayesian Fraction of Missing Information was low” warning (iter = 5000 and warmup = 1000). As I understand it, this means that the Markov chains could not sufficiently explore the posterior, so it is probably not a warning to ignore. The recommendations are 1) reparameterization and 2) more warm-up samples. I can’t think of a way to reparameterize my model, and more warm-up samples did not solve the problem either (I tried up to warmup = 3000).
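For concreteness, the alternatives for the sigma_an line were of the following form. This is only a sketch: the scale values and the 3 degrees of freedom are example values (I tried a variety), and the program is reduced to the prior alone rather than the full model.

```stan
// Prior-only sketch: exactly one sigma_an prior is active at a time.
// Scales and degrees of freedom are example values, not a record of every run.
parameters {
  // the lower bound turns each symmetric prior into its "half" version
  real<lower=0> sigma_an;
}
model {
  sigma_an ~ cauchy(0, 1);           // half-Cauchy
  // sigma_an ~ normal(0, 1);        // half-normal
  // sigma_an ~ student_t(3, 0, 1);  // half-t
  // sigma_an ~ exponential(1);      // exponential
  // sigma_an ~ lognormal(0, 1);     // the prior used in the model above
}
```

In the full model, only the sigma_an line in the model block changes between runs; everything else stays as posted.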
The pairs plot for the model with a half-Cauchy prior looks like this (energy__ and sigma_an are clearly correlated):
When I instead use a lognormal(0,1) prior for sigma_an, the model runs smoothly and the a_an parameters do get regularized. The number of effective parameters according to WAIC is much lower than for the model without an adaptive prior. I chose the lognormal because the problems seemed to arise at close-to-zero values of sigma_an, and I don’t expect sigma_an to be close to zero anyway: lognormal(0,1) has less density close to zero but still a fairly heavy tail. The pairs plot looks like this:
However, the lognormal prior is not a recommended one according to the prior choice recommendations, and there is still some correlation between energy__ and sigma_an (although check_energy() returns “no pathological behavior”). Since I’m fairly new to Stan and Bayesian stats (mainly self-taught, yikes), I am afraid of doing something that I should not be doing (besides meddling with Stan).
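To make the “less density close to zero” reasoning behind the lognormal choice concrete, here are the two prior densities written out. This assumes a unit scale for the half-Cauchy (only one of the scales I tried), and x stands for sigma_an:

$$
p_{\text{lognormal}(0,1)}(x) = \frac{1}{x\sqrt{2\pi}} \exp\!\left(-\frac{(\ln x)^2}{2}\right),
\qquad
p_{\text{half-Cauchy}(0,1)}(x) = \frac{2}{\pi\,(1 + x^2)},
\qquad x > 0 .
$$

As x goes to zero the lognormal density goes to zero, while the half-Cauchy density approaches 2/π ≈ 0.64. At x = 0.1 the values are approximately 0.28 versus 0.63, and at x = 0.01 approximately 0.001 versus 0.64. So the lognormal puts far less prior mass on the near-zero region of sigma_an, which is where the sampling problems seemed to occur.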
My questions are 1) is it “allowed” to use a lognormal prior for the standard deviation? 2) Is there a way I could reparameterize this model so a half-Cauchy/exponential prior might be applied without the bfmi-low warning? 3) Could I ignore the bfmi-low warning?
Thanks for reading,
Seb