Issues when rewriting WinBUGS code in Stan

Hi all, I am trying to rewrite some WinBUGS code in Stan. The WinBUGS code looks like this:

model;
{
for(j in 1:9){             # j indexes trial 
# (1) fixed effect based on normal approximation
  y1[j]<- y[j]
  y1[j]    ~ dnorm(theta[1,j],sigma2.inv[j])
  theta[1,j]  ~ dunif(-10,10)

# (2) pooled effect based on normal approximation
  y2[j]<- y[j]
  y2[j]    ~ dnorm(theta[2,j],sigma2.inv[j])
  theta[2,j]  <- mu[2]

# (3) random effect based on normal approximation
  y3[j]<- y[j]
  y3[j]    ~ dnorm(theta[3,j],sigma2.inv[j])
  theta[3,j]  ~ dnorm(mu[3],tau2.inv[3])

# (4) random, exact binomial, indep uniform
  rt4[j] <- rt[j]; rc4[j] <- rc[j]
  rt4[j]  ~ dbin(pt4[j],nt[j])
  rc4[j]  ~ dbin(pc4[j],nc[j])
  logit(pt4[j]) <- theta[4,j] + logit(pc4[j])
  pc4[j]      ~ dunif(0,1)
  theta[4,j]  ~ dnorm(mu[4],tau2.inv[4])

# (5) random, exact binomial, indep on logodds
  rt5[j] <- rt[j]; rc5[j] <- rc[j]
  rt5[j]  ~ dbin(pt5[j],nt[j])
  rc5[j]  ~ dbin(pc5[j],nc[j])
  logit(pt5[j]) <- theta[5,j] + phi5[j]
  logit(pc5[j]) <- phi5[j]
  phi5[j]     ~ dunif(-10,10)
  theta[5,j]  ~ dnorm(mu[5],tau2.inv[5])
 
}
phi.mean ~ dunif(-10,10)       # uniform on control group mean
phi.sd   ~ dunif(0,50)         # Prior: Uniform(0,50) on phi.sd
phi.prec <- 1/(phi.sd*phi.sd)

for(i in 1:5){
  mu[i]  ~ dunif(-10,10)       # uniform on overall mean
  tau[i] ~ dunif(0,50)         # Prior: Uniform(0,50) on tau
  tau2.inv[i] <- 1/(tau[i] * tau[i])
}
 
# j again indexes study number

for(j in 1:9)
{
  dummy[j] <- id[j] + date[j]  # just to mention all data
# log-odds ratios
  y[j] <- log(((rt[j] + 0.5)/(nt[j] - rt[j] + 0.5))/((rc[j] + 0.5)/(nc[j] - rc[j] + 0.5)))
# variances & precisions
  sigma2[j] <- 1/(rt[j] + 0.5) + 1/(nt[j] - rt[j] + 0.5) + 1/(rc[j] + 0.5) + 1/(nc[j] - rc[j] + 0.5)
  sigma2.inv[j] <- 1/sigma2[j]
}
}

Here is the Stan code I have:

dat<-read.table(header = TRUE, text = "
id[]	date[]	nt[]	rt[]	nc[]	rc[]
1	 1976	 175      1      175      1	  
2 	1976 	242 	2 	241 	1
3 	1978 	253 	0 	251 	1
4 	1979 	463 	3 	232 	0
5 	1981 	445 	1 	482 	0
6 	1985 	485 	0 	493 	1
7 	1985    6530    14    6554    14
8 	1987 	122    17 	124    18
9 	1993 	746 	2 	682 	9
")
names(dat) <- gsub(".", "", names(dat), fixed = TRUE)
dataList = list(
  Ntotal = 9,
  nt=dat$nt,
  rt=dat$rt,
  nc=dat$nc,
  rc=dat$rc,
  rt4=dat$rt,
  rc4=dat$rc,
  rt5=dat$rt,
  rc5=dat$rc,
  y=log(((dat$rt + 0.5)/(dat$nt - dat$rt + 0.5))/((dat$rc + 0.5)/(dat$nc - dat$rc + 0.5))),
  y1=log(((dat$rt + 0.5)/(dat$nt - dat$rt + 0.5))/((dat$rc + 0.5)/(dat$nc - dat$rc + 0.5))),
  y2=log(((dat$rt + 0.5)/(dat$nt - dat$rt + 0.5))/((dat$rc + 0.5)/(dat$nc - dat$rc + 0.5))),
  y3=log(((dat$rt + 0.5)/(dat$nt - dat$rt + 0.5))/((dat$rc + 0.5)/(dat$nc - dat$rc + 0.5)))
)

################Stan###############
modelString1 = "
data{
  int<lower=0>    Ntotal;
  int<lower=0>    nt[Ntotal];
  int<lower=0>    rt[Ntotal];
  int<lower=0>    nc[Ntotal];
  int<lower=0>    rc[Ntotal];
  int<lower=0>  rt4[Ntotal];
  int<lower=0>  rc4[Ntotal];
  int<lower=0>  rt5[Ntotal]; 
  int<lower=0>  rc5[Ntotal];
  real y[Ntotal];
  real y1[Ntotal];
  real y2[Ntotal];
  real y3[Ntotal];

}
transformed data {
  vector<lower=0>[Ntotal] sigma2;
  for (j in 1:Ntotal) {
    sigma2[j] = 1/(rt[j] + 0.5) + 1/(nt[j] - rt[j] + 0.5) + 1/(rc[j] + 0.5) + 1/(nc[j] - rc[j] + 0.5);
  }
}
parameters {
  real<lower=-10,upper=10> phimean;
  real<lower=0,upper=50>   phisd;

  real<lower=-10,upper=10> theta1[Ntotal];
  real<lower=-10,upper=10> theta2[Ntotal];

  real<lower=0,upper=1>    pc4[Ntotal];
  real<lower=-10,upper=10> phi5[Ntotal];

  real<lower=0,upper=50>   tau3;
  real<lower=0,upper=50>   tau4;
  real<lower=0,upper=50>   tau5;
  real<lower=-10,upper=10> mu3;
  real<lower=-10,upper=10> mu4;
  real<lower=-10,upper=10> mu5;

  vector[Ntotal] theta3;
  vector[Ntotal] theta4;
  vector[Ntotal] theta5;
}

  

model {
  vector[Ntotal] pt4;
  vector[Ntotal] pt5;
  vector[Ntotal] pc5;

  for (j in 1:Ntotal) {
    // (1) fixed effect based on normal approximation
    target += normal_lpdf(y1[j] | theta1[j], sqrt(sigma2[j]));

    // (2) pooled effect based on normal approximation
    target += normal_lpdf(y2[j] | theta2[j], sqrt(sigma2[j]));

    // (3) random effect based on normal approximation
    target += normal_lpdf(y3[j] | theta3[j], sqrt(sigma2[j]));
    target += normal_lpdf(theta3[j] | mu3, tau3);

    // (4) random, exact binomial, indep uniform
    pt4[j] = theta4[j] + logit(pc4[j]);
    rt4[j] ~ binomial_logit(nt[j], pt4[j]);
    rc4[j] ~ binomial(nc[j], pc4[j]);
    target += normal_lpdf(theta4[j] | mu4, tau4);

    // (5) random, exact binomial, indep on logodds
    pt5[j] = theta5[j] + phi5[j];
    pc5[j] = phi5[j];
    rt5[j] ~ binomial_logit(nt[j], pt5[j]);
    rc5[j] ~ binomial_logit(nc[j], pc5[j]);
    target += normal_lpdf(theta5[j] | mu5, tau5);
  }
}


"

writeLines( modelString1 , con="TEMPmodel.stan" )
stanModel <- stan(
  data = dataList ,
  file = "TEMPmodel.stan",  
  chains = 4,            
  warmup = 5000,
  iter = 25000,
  control = list(adapt_delta = 0.999)
  
)

stanModel


I am not sure why I get so many warnings:
2: There were 1078 divergent transitions after warmup. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
to find out why this is a problem and how to eliminate them.
3: There were 19942 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
4: Examine the pairs() plot to diagnose sampling problems

Any help is appreciated!!!

When you have poor diagnostics with complex models, it’s always recommended to start with the simplest model possible (e.g., only one outcome) and slowly build up the model to see where the issues are introduced.

As I’ve mentioned in your previous threads, hierarchical parameters like:

    target += normal_lpdf(theta4[j] | mu4,tau4);

can result in divergences, so the non-centered parameterisation is a recommended alternative.
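For example, a non-centered version of outcome (3) could look something like this (a sketch using your variable names; `theta3_raw` is a new helper parameter I'm introducing for illustration):

```stan
parameters {
  real<lower=-10,upper=10> mu3;
  real<lower=0,upper=50>   tau3;
  vector[Ntotal] theta3_raw;           // standard-normal "raw" effects
}
transformed parameters {
  // non-centered: theta3[j] = mu3 + tau3 * theta3_raw[j]
  vector[Ntotal] theta3 = mu3 + tau3 * theta3_raw;
}
model {
  theta3_raw ~ std_normal();           // implies theta3 ~ normal(mu3, tau3)
  for (j in 1:Ntotal)
    target += normal_lpdf(y3[j] | theta3[j], sqrt(sigma2[j]));
}
```

The sampler then explores the independent standard-normal `theta3_raw` rather than `theta3` directly, which avoids the funnel geometry between `tau3` and the `theta3[j]` that typically causes the divergences.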

Thanks for looking into this. There are many strange things:

  1. I am struggling with why I have to give each mu and tau a different name (mu2, mu3, mu4 …, tau3, tau4, tau5 …). Can I use mu[j] and tau[j] instead? Maybe not, because after I tried that there were even more issues. Not sure why?

  2. The results from Stan are different from the WinBUGS results. Do you know why? Is what I did incorrect?

                      Stan results:        WinBUGS results:
    tau3 (95% CI)     0.39 (0.02, 1.59)    0.32 (0.01, 1.5)
    tau4 (95% CI)     0.57 (0.6, 2.21)     0.54 (0.01, 2.25)
    tau5 (95% CI)     0.71 (0.8, 2.91)     0.75 (0.09, 2.82)

  3. If I follow your suggestion from the other post:

     vector<offset=mu3,multiplier=tau3>[Ntotal] theta3;
     vector<offset=mu4,multiplier=tau4>[Ntotal] theta4;
     vector<offset=mu5,multiplier=tau5>[Ntotal] theta5;

there are many more issues if I only use adapt_delta = 0.99, so that is what concerns me most.

But as you know, after I add control = list(adapt_delta = 0.999, max_treedepth = 15) there are no issues. Does this mean we should always use these options? I personally don't think we should always rely on options like these, should we?

Try to start with a much simpler model. Pick a single outcome and build a simple model for it that runs without divergences, then iteratively add complexity (multiple outcomes, random effects, etc.).

It’s much easier to get a stable, replicable model when you build it from the ground up.
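For example, a first iteration might keep only outcome (3) (a sketch; the data names follow your full model, and `sigma2` is passed in as data here rather than recomputed in transformed data):

```stan
data {
  int<lower=0> Ntotal;
  real y3[Ntotal];                  // approximate log-odds ratios
  vector<lower=0>[Ntotal] sigma2;   // normal-approximation variances
}
parameters {
  real<lower=-10,upper=10> mu3;
  real<lower=0,upper=50>   tau3;
  vector[Ntotal] theta3;
}
model {
  theta3 ~ normal(mu3, tau3);       // random study effects
  y3 ~ normal(theta3, sqrt(sigma2));
}
```

If this samples cleanly, add outcomes (1) and (2), then the exact-binomial outcomes, checking the diagnostics at each step.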

Yes, I should try starting with something simple.

But do you have any clue why there are differences between the results of the different tools (I mean Stan and WinBUGS)? From my point of view the results are different, but the models are actually the same. Maybe there are still issues with the code?

The two programs are algorithmically different (HMC vs MH/Gibbs), so they won’t produce identical results. The results you gave:

                      Stan results:        WinBUGS results:
    tau3 (95% CI)     0.39 (0.02, 1.59)    0.32 (0.01, 1.5)
    tau4 (95% CI)     0.57 (0.6, 2.21)     0.54 (0.01, 2.25)
    tau5 (95% CI)     0.71 (0.8, 2.91)     0.75 (0.09, 2.82)

are very similar between the two programs (and the credible intervals are both very wide), which looks fine from a brief glance (I haven’t verified that the models are the same).

Additionally, divergent transitions indicate that the sampler has not been able to fully explore the posterior of your model, and so the estimates may be biased or unrepresentative.

If with control = list(adapt_delta = 0.999, max_treedepth = 15) there are no issues, and without these options there are issues, do we still need to consider divergent transitions when interpreting results? That is, if the options make the warnings go away, might the estimates still be biased or unrepresentative?

There’s no definitive answer to that unfortunately. Sometimes a high adapt_delta is needed for a given model, and sometimes it might just be masking deeper issues.

A good resource for the process of Bayesian model building and checking is the recent Bayesian Workflow paper: [2011.01808] Bayesian Workflow


Thanks!!!

Btw, can I use mu[j] and tau[j] like WinBUGS does in this example? I think assigning mu2, mu3, mu4 …, tau3, tau4, tau5 … is not very concise. Any good suggestions for this? ^^

vector[4] mu;

Then you can use mu[1], mu[2], etc
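Applied to this model, that could look like the following sketch (here I keep the WinBUGS index numbering with length 5, so the unused indices simply get their uniform priors from the bounds):

```stan
parameters {
  vector<lower=-10,upper=10>[5] mu;   // replaces mu3, mu4, mu5, ...
  vector<lower=0,upper=50>[5]   tau;  // replaces tau3, tau4, tau5, ...
}
model {
  for (j in 1:Ntotal) {
    target += normal_lpdf(theta3[j] | mu[3], tau[3]);
    target += normal_lpdf(theta4[j] | mu[4], tau[4]);
    target += normal_lpdf(theta5[j] | mu[5], tau[5]);
  }
}
```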
