Difference in variance Gibbs (R,C++) vs HMC (Stan) for simple normal gamma model

Hi all!

I am new to stan so apologies if this is a stupid question. In order to learn more about bayesian computing I implemented a simple independent normal gamma model (as described in Koop’s Bayesian Econometrics, which is used in my school) in R, C++ and stan. I get the same results in R and C++ (unsurprisingly since I just ported the code) but the variance of the prediction is much lower in Stan. Could this be due to the difference of HMC vs Gibbs samplers or am I doing something wrong? Please find attached the stan model (“iNG.stan”), the Gibbs sampler in C++ (“gibbsOnly.cpp”) and the runfile in R (“iNG.r”)

Thank you very much for your help!

Daniel

EDIT: Corrected files (sd instead of var)

gibbsOnly.cpp (1.6 KB)
iNG.r (2.6 KB)
iNG.stan (663 Bytes)

Hi Daniel,

I’ve had this similar question myself: Is it fair to say that Stan consists of a ‘library of validated inference algorithms’?

Following the recommendation of @bgoodri you may want to start here - [Validation of Software for Bayesian Models Using Posterior Quantiles, Cook et al.]: https://amstat.tandfonline.com/doi/abs/10.1198/106186006X136976

1 Like

Cook et al has severe limitations. We currently recommend using the more robust method introduced in https://arxiv.org/abs/1804.06788.

It’s difficult to comment without knowing much more detail. In particular, did any of Stan’s diagnostics indicate problems as described in https://betanalpha.github.io/assets/case_studies/rstan_workflow.html?

The important thing to keep in mind is that MCMC is not guaranteed to give you a reasonable answer in any finite amount of time, and many algorithms return estimates that are biased with respect to the true posterior without any indication. The algorithm used by Stan is more informative, but we can’t say anything without knowing what those informative messages are.

2 Likes

Thank you both for your quick responses! I looked at the shinystan diagnostics and nothing looks out of the ordinary. In addition I ran @betanalpha’s utility script which returned

 check_all_diagnostics(model_stan)
[1] "n_eff / iter looks reasonable for all parameters"
[1] "Rhat looks reasonable for all parameters"
[1] "0 of 20000 iterations ended with a divergence (0%)"
[1] "0 of 20000 iterations saturated the maximum tree depth of 10 (0%)"
[1] "E-BFMI indicated no pathological behavior"

which I seems to provide further indication of the mcmc having worked. May I reasonably assume that stan has run the model successfully and this is a difference between Gibbs and HMC?

I have one more theory: If I do not add a random error to my prediction I get roughly the same result in stan and R/C++. Is it simply not correct to add an error?

EDIT:
Appologies I am just an idiot. The difference is that I need to take the square root in the R version.

Thanks for running this down, good to know it wasn’t us!

1 Like