Sensitivity of Stan to hierarchical (hyper)priors

I am trying to fit a simple hierarchical linear regression example (originally implemented in BUGS) in RStan, but I get a few divergent transitions and the results differ somewhat from the BUGS results. I think the problem concerns the prior specification, especially for the scale parameters (\sigma_{a}, \sigma_{y}). Apart from the half-normal distribution (used here) for the scale parameters, I also tried the half-Cauchy distribution, but that did not solve the problem.

So, the following questions arise:

  1. Why is the prior specification so complicated for hierarchical models in RStan? In BUGS, a very weakly informative prior on the precision parameter (precision = inverse of variance) gives credible results, whereas in RStan the prior specification for these parameters seems more complicated and sensitive.

  2. Is this issue due to the different algorithm used in BUGS (Gibbs sampler) compared with Stan (Hamiltonian Monte Carlo)?

  3. If I want to specify very weakly informative priors in cases like this, what are appropriate prior distributions for the scale parameters? For this I checked https://github.com/stan-dev/stan/wiki/Prior-Choice-Recommendations and tried the proposed prior choices, but the divergent transitions and the discrepant results remained.
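(For reference, in Stan a half-normal prior on a scale parameter falls out automatically from combining a `<lower=0>` declaration with a normal density; Stan samples on the unconstrained scale and accounts for the truncation itself. A minimal sketch:

```stan
parameters {
  real<lower=0> sigma;   // lower bound makes the normal prior half-normal
}
model {
  sigma ~ normal(0, 1);  // half-normal(0, 1) given the constraint
}
```

No explicit `T[0,]` truncation is needed as it would be in BUGS.)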

The model I fit in Stan is:

data {
  int<lower=0> K;
  int<lower=0> nsim;
  matrix[nsim, K] y;
}

parameters {
  real m;
  real a[nsim];
  real<lower=0> sigma_alpha;
  real<lower=0> sigma_y;
}

transformed parameters {
  matrix[nsim, K] mu;
  for (i in 1:nsim) {
    for (j in 1:K) {
      mu[i, j] = m + a[i];
    }
  }
}

model {
  target += normal_lpdf(m | 0, 10);
  target += normal_lpdf(sigma_y | 0, 3);
  target += normal_lpdf(sigma_alpha | 0, 1);

  target += normal_lpdf(a | 0, sigma_alpha);

  for (i in 1:nsim) {
    for (j in 1:K) {
      target += normal_lpdf(y[i, j] | mu[i, j], sigma_y);
    }
  }
}

while the corresponding BUGS code is here: bugscode

The data are given by the following R code:

####---------Data Specification--------####
y<-matrix(c(108, 98, 91, 94, 93, 96, 104, 99, 99, 97, 95, 98, 93, 97, 99, 96, 90, 100, 92, 95, 101, 89, 97, 97, 97,
100, 96, 95, 106, 100, 100, 98, 90, 99, 88, 98, 92, 92, 100, 101),nrow=20,ncol=2,byrow=T)

model_data <- list(y)

model_data$K <- 2
model_data$nsim <- 20
names(model_data)[1] <- "y"
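(Before reparameterizing, it can be worth checking whether a stricter sampler setting alone removes the few divergences; `adapt_delta` and `max_treedepth` are real `control` options of `rstan::stan`. A sketch, where the model file name is an assumption:

```r
library(rstan)

fit <- stan(
  file = "hier_model.stan",  # hypothetical file name for the model above
  data = model_data,
  control = list(adapt_delta = 0.99, max_treedepth = 12)
)
print(fit, pars = c("m", "sigma_alpha", "sigma_y"))
```

Raising `adapt_delta` forces smaller leapfrog steps; if divergences persist even at 0.99, that usually points to a funnel-shaped posterior and the non-centered parameterization below.)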

Try rewriting this with a non-centered parameterization: https://mc-stan.org/users/documentation/case-studies/divergences_and_bias.html

That’ll mean adding a new parameter:

real az[nsim];

with prior:

az ~ normal(0, 1);

And moving a to the transformed parameters block:

for(i in 1:nsim) {
  a[i] = az[i] * sigma_alpha;
}

(Edit: a_z -> az)
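Putting those pieces together, the full non-centered model might look like the following sketch. One assumption here: `az` and `a` are declared as `vector`s rather than arrays so the likelihood can be vectorized directly.

```stan
data {
  int<lower=0> K;
  int<lower=0> nsim;
  matrix[nsim, K] y;
}
parameters {
  real m;
  vector[nsim] az;             // standardized group effects
  real<lower=0> sigma_alpha;
  real<lower=0> sigma_y;
}
transformed parameters {
  // non-centered parameterization: a = az * sigma_alpha
  vector[nsim] a = az * sigma_alpha;
}
model {
  m ~ normal(0, 10);
  sigma_y ~ normal(0, 3);      // half-normal via the <lower=0> constraint
  sigma_alpha ~ normal(0, 1);  // half-normal via the <lower=0> constraint
  az ~ normal(0, 1);           // implies a ~ normal(0, sigma_alpha)
  for (j in 1:K)
    y[, j] ~ normal(m + a, sigma_y);
}
```

The point of the reparameterization is that the sampler explores `az` and `sigma_alpha`, which are a priori independent, instead of the funnel-shaped joint geometry of `a` and `sigma_alpha`.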


Also I don’t think you need mu.

I think you can just do:

for (j in 1:K) {
  // to_vector() is needed because a is declared as an array, not a vector
  target += normal_lpdf(y[, j] | to_vector(a) + m, sigma_y);
}

(Edit: I screwed this up the first time and included an unnecessary loop, fixed now)
