BYM model : translating from WinBUGS to STAN

Hi All,
It’s my first time using STAN and trying to reproduce someone else’s older WinBUGS analysis (attached).

My STAN sampler (attached) is at the moment very slow, giving also warnings about maximum treedepth, so I would like to make sure that the model defined in STAN is correct and equivalent to that in WinBUGS before setting it to run for more iterations. In summary, I have a Bernoulli GLM where the target variable indicates individual’s disease status and the linear predictor includes an individual’s age, 2 area specific effects as in BYM and a random effect with different prior variance according to whether an individual is a dizygotic or monozygotic twin. So the linear predictor of the i-th individual reads:

logit( p[i] )= b0+b1*age[i] + phi[ area[i] ] + theta[ area[i] ] + twin_re[i]

Also, would it be possible to know what changes should I make in the above code and the function “mungeCARdata4stan” in case of islands? Any help would be much appreciated!

winBUGS_Model.txt (1.4 KB)

stan_Model.txt (4.2 KB)

hi Joan,

by islands do you mean areas which have 0 neighbors, or do you mean disconnected components of the spatial graph?

regarding the parameterization with components phi and theta, have you looked at the discussion around the BYM2 model which uses a different parameterization of the spatial + non-spatial variance and different priors?


the WinBUGS priors don’t work very well in Stan, in my experience.

Hi,
Thank you for you reply. By islands I mean areas with 0 neighbours.

With respect to BYM2 model, I did have look at it but started from trying to implement the simple BYM as it appeared simpler and closer to what already implemented in WinBUGS so that I am able to make comparisons easier between the two. Will definitely give BYM2 a try as soon as I have some reassurance that my coding of BYM is correct.

A big concern of mine was that I may be comparing quantities that are not actually the same between the WinBUGS and STAN code (e.g., what I have defined as phi in one being different in the other due to the scalings etc). Many thanks.

I suggest you consult this paper, section 4: [1705.04854] A note on intrinsic Conditional Autoregressive models for disconnected graphs

“A note on intrinsic Conditional Autoregressive models for disconnected graphs” Sterrantino,Ventrucci, Rue

the constant prior for the singleton makes it difficult for the singleton random effect to shrink to the global mean. Also, this goes against the purpose of using in the first place, which is to do smoothing and borrowing strength.