I’ve been comparing outputs between brms and rstanarm’s stan_gamm4. What I’ve found is that when I specify the same priors for both (since they use different default priors), I get different intercept estimates…
The stan_gamm4 estimate appears to match the one from gamm4, but the brms estimate is much lower (~11.82, vs ~15.5 from the other two; see below for a MWE).
# SIMULATE EXAMPLE
## simulate 4 term additive truth
library(mgcv)   # for gamSim()
library(gamm4)
library(rstanarm)
library(brms)
dat <- gamSim(6, n = 400, scale = 2)
mod1 <- gamm4(y ~ s(x0) + s(x1), random = ~(1|fac), data = dat)
mod2 <- rstanarm::stan_gamm4(y ~ s(x0) + s(x1), random = ~(1|fac), data = dat,
                             prior_intercept = normal(0, 5),
                             prior_smooth = normal(0, 1),
                             prior_aux = normal(0, 5))
mod3 <- brms::brm(y ~ s(x0) + s(x1) + (1|fac), data = dat,
                  prior = c(prior(normal(0, 5), class = "Intercept"),
                            prior(normal(0, 1), class = "sds"),
                            prior(normal(0, 5), class = "sd"),
                            prior(normal(0, 5), class = "sigma")))
What’s stranger is that when I fit the two models using default priors, I get similar intercepts…
mod4 <- rstanarm::stan_gamm4(y ~ s(x0) + s(x1), random = ~(1|fac), data = dat)
mod5 <- brms::brm(y ~ s(x0) + s(x1) + (1|fac), data = dat)
On another note, the predictions of each model are very similar… they are basically 1:1… which makes me wonder whether 1) I’m misunderstanding the print output of brms and they are actually the same, or 2) I’ve done something wrong in the model specification.
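To check the 1:1 claim directly, posterior mean predictions from the two fits can be plotted against each other (a sketch, assuming mod2 and mod3 from the MWE above have finished sampling):

```r
# compare posterior mean predictions from the two packages
yhat_rstanarm <- colMeans(rstanarm::posterior_predict(mod2))
yhat_brms     <- colMeans(brms::posterior_predict(mod3))

plot(yhat_rstanarm, yhat_brms,
     xlab = "stan_gamm4 prediction", ylab = "brm prediction")
abline(0, 1)  # points on this line mean the predictions agree 1:1
```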
(I’m reading the intercept as the posterior median reported in the brms output.)
- Operating System: OSX Catalina
- brms Version: ‘2.13.0’
Well, when I run this, brms spits out warning messages:
1: There were 8 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
2: Examine the pairs() plot to diagnose sampling problems
Are the rest of the defaults the same across rstanarm and brms? Chains? Iterations? Warmup? Tree depth? adapt_delta?
I’d start with making all the defaults explicit in both models.
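As a sketch of what making the defaults explicit could look like (the numbers below are typical Stan defaults, not values from the original post; adapt_delta is set to the same value in both calls because the defaults differ between packages — verify against your installed versions):

```r
# same sampler settings spelled out in both packages
mod2 <- rstanarm::stan_gamm4(y ~ s(x0) + s(x1), random = ~(1|fac), data = dat,
                             chains = 4, iter = 2000, warmup = 1000,
                             seed = 123, adapt_delta = 0.95)

mod3 <- brms::brm(y ~ s(x0) + s(x1) + (1|fac), data = dat,
                  chains = 4, iter = 2000, warmup = 1000, seed = 123,
                  control = list(adapt_delta = 0.95, max_treedepth = 10))
```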
I think that at least the default priors are not the same (even if they look the same on the surface) which might explain the differences.
I THINK rstanarm scales data in the background and applies scaled priors but brms doesn’t. So more pulling of posterior towards prior in brms?
Yes, the burn-in and chains are the same. I’ve increased adapt_delta to remove the divergent transitions, and that made no difference to the intercept estimate. As far as I can see, they should give an almost identical result when the priors are specified the same in each.
However, on closer examination I think the difference is that rstanarm by default automatically rescales priors, whereas brms does not.
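Both packages can print the priors actually used in the fit, which is how the summaries below were produced (assuming mod2 and mod3 from the MWE):

```r
# inspect the priors each package actually applied
brms::prior_summary(mod3)      # reports the priors as passed, unscaled
rstanarm::prior_summary(mod2)  # reports specified AND autoscale-adjusted priors
```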
#brms prior summary
          prior     class      coef group resp dpar nlpar bound
2                       b     sx0_1
3                       b     sx1_1
4  normal(0, 5) Intercept
5  normal(0, 5)        sd
6                      sd             fac
7                      sd Intercept   fac
8  normal(0, 1)       sds
9                     sds     s(x0)
10                    sds     s(x1)
11 normal(0, 5)    sigma
#rstanarm prior summary
Priors for model 'mod2'
------
Intercept (after predictors centered)
  Specified prior:
    ~ normal(location = 0, scale = 5)
  Adjusted prior:
    ~ normal(location = 0, scale = 27)

Auxiliary (sigma)
  Specified prior:
    ~ half-normal(location = 0, scale = 5)
  Adjusted prior:
    ~ half-normal(location = 0, scale = 27)

Covariance
 ~ decov(reg. = 1, conc. = 1, shape = 1, scale = 1)
Turning off autoscale on the rstanarm priors seems to help, with the intercepts moving closer to each other (11.83 vs 12.4). The other difference is that rstanarm includes a covariance prior by default, whereas brms does not.
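For reference, autoscaling can be disabled per prior via the autoscale argument to rstanarm's prior functions (a sketch reusing the MWE's mod2 specification; mod2b is just an illustrative name):

```r
# refit with rstanarm's prior autoscaling switched off
mod2b <- rstanarm::stan_gamm4(
  y ~ s(x0) + s(x1), random = ~(1|fac), data = dat,
  prior_intercept = rstanarm::normal(0, 5, autoscale = FALSE),
  prior_smooth    = rstanarm::normal(0, 1, autoscale = FALSE),
  prior_aux       = rstanarm::normal(0, 5, autoscale = FALSE))
```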
Yes it looks like that’s happening.