Stan_gamm4 and brms unexplained difference in intercept estimates

jscamac · June 2, 2020, 3:23pm

I’ve been comparing outputs between brms and stan_gamm4. What I’ve found is that when I try and specify the same priors for both (as both use different default priors) but I’m getting different intercept estimates…
The stan_gamm4 version appears to match that found in gamm4 but the brms estimate (~15.5 vs 11.82) is much lower (see below for a MWE).

library(rstanarm)
library(mgcv)
library(brms)
library(gamm4)

# SIMULATE EXAMPLE
set.seed(1234)
## simulate 4 term additive truth
dat <- gamSim(6,n=400,scale=2)
## GAMM4 
mod1 <- gamm4(y ~ s(x0)+s(x1), random = ~(1|fac),data=dat)
#Estimated intercept:
fixed.effects(g4$mer)[[1]]
#[1] 15.6657

# RSTANARM
mod2 <- rstanarm::stan_gamm4(y ~ s(x0)+s(x1), random = ~(1|fac),
                             data=dat,
                             prior_intercept = normal(0,5),
                             prior_smooth = normal(0,1), 
                             prior_aux = normal(0,5))
#Estimated intercept
mod2$coefficients[[1]]
#[1] 15.49564

# BRMS

mod3 <- brms::brm(y ~ s(x0)+s(x1) + (1|fac),
                  data=dat,
                  prior = c(prior(normal(0,5), class="Intercept"),
                            prior(normal(0,1), class="sds"),
                            prior(normal(0,5), class="sd")))

fixef(mod3)[[1]]
#[1] 11.82925

What’s stranger is when I fit the two models using default priors I get similar intercepts…

mod4 <- rstanarm::stan_gamm4(y ~ s(x0)+s(x1), random = ~(1|fac),
                             data=dat)

mod5 <- brms::brm(y ~ s(x0)+s(x1) + (1|fac),
                  data=dat)   

> mod4$coefficients[[1]]
#[1] 15.70542

fixef(mod5)[[1]]
#[1] 15.49703

On another note the predictions of each model are very similar… they are basically 1:1… which makes me wonder whether I’m 1) misunderstanding the print output of brms and that they are actually the same or 2) I’ve done something wrong in model specification.
Looking at the median across intercepts in the brms output

Operating System: OSX Catalina
brms Version: ‘2.13.0’

Ara_Winter · June 2, 2020, 6:01pm

Well when I run this the brms spits out error messages:

Warning messages:
1: There were 8 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup 
2: Examine the pairs() plot to diagnose sampling problems

Are the rest of the defaults the same across rstanarm and brms? chains? iters? burnin? tree depth? adapt delta?

I’d start with making all the defaults explicit in both models.

paul.buerkner · June 2, 2020, 6:14pm

I think that at least the default priors are not the same (even if they look the same on the surface) which might explain the differences.

dilsherdhillon · June 2, 2020, 9:54pm

I THINK rstanarm scales data in the background and applies scaled priors but brms doesn’t. So more pulling of posterior towards prior in brms?

jscamac · June 2, 2020, 9:56pm

Yes the burnin and chains are the same. I’ve increased the adapt_delta to remove to divergent iterations and that made no difference to the intercept estimate. As far as I can see they should provide an almost identical result when specifying the priors the same in each.

However, on closer examination I think the difference is that rstanarm by default automatically rescales priors whereas brms does not

#brms prior summary
prior_summary(mod3)
                  prior     class      coef group resp dpar nlpar bound
1                               b                                      
2                               b     sx0_1                            
3                               b     sx1_1                            
4          normal(0, 5) Intercept                                      
5          normal(0, 5)        sd                                      
6                              sd             fac                      
7                              sd Intercept   fac                      
8          normal(0, 1)       sds                                      
9                             sds     s(x0)                            
10                            sds     s(x1)                            
11         normal(0, 5)              sigma    

#rstanarm prior summary
prior_summary(mod2)
Priors for model 'mod2' 
------
Intercept (after predictors centered)
  Specified prior:
    ~ normal(location = 0, scale = 5)
  Adjusted prior:
    ~ normal(location = 0, scale = 27)

Auxiliary (sigma)
  Specified prior:
    ~ half-normal(location = 0, scale = 5)
  Adjusted prior:
    ~ half-normal(location = 0, scale = 27)

Covariance
 ~ decov(reg. = 1, conc. = 1, shape = 1, scale = 1)

Turning off autoscale on the rstanarm priors seems to help with intercepts closer to each other (11.83 vs 12.4). The other difference is that it looks like rstanarm includes a covariance prior by default whereas brms does not.

jscamac · June 2, 2020, 9:57pm

Yes it looks like that’s happening.

Topic		Replies	Views
Why do Stan model and rstanarm/brms model with same priors have different posterior? Modeling cmdstanr	2	547	January 14, 2023
Same model with different default priors in rstanarm and brms? rstanarm	4	1305	May 7, 2021
Reproducing brms results in rstan yields different estimates brms	7	816	July 30, 2020
Hierarchical random intercept model - Stan vs brms Modeling specification , brms	5	131	May 6, 2025
Reproducing simple rstanarm meta-analysis in brms brms meta-analysis	1	757	March 29, 2020

Stan_gamm4 and brms unexplained difference in intercept estimates

Related topics