Reproducing brms results in rstan yields different estimates

PhDemetri · July 30, 2020, 6:18am

Operating System: OSX Catalina
brms Version: 2.13

I’m trying to understand how brms implements hierarchical models by writing similar models in Stan. From the documentation, prior_summary, and the generated stan code from brms, I understand that the following brms model…

library(brms)
library(tidyverse)
baseline_prior<-set_prior('normal(0.61, 0.12)', class = 'Intercept')
variant_prior<-set_prior('normal(0,1)', class = 'b', coef = 'variantB')

brms_model = brm(y|trials(n) ~ variant + (1|member_id), 
            data = ab_data, 
            prior = baseline_prior + variant_prior,
            family  = binomial())

should be equivalent to the following Stan model

stan_code="
data{
  int N;
  int n[N];
  int y[N];
  vector[N] variant;
}
parameters{
  real Intercept;
  real variantB;
  vector[N] z;
  real<lower=0> sigma;
}
model{
  Intercept~normal(0.61, 0.12);
  variantB~normal(0,1);
  z~normal(0,1);
  sigma~student_t(3,0,2.5);
  y~binomial_logit(n,Intercept + z*sigma + variantB*variant);
}
"

However, running these two models yields different results for the coefficient of variant (code to run models found here

# My Stan Code

ab_data = read_csv('ab_data.csv')
data = list(N = nrow(ab_data),
            y = ab_data$y,
            n = ab_data$n,
            variant = if_else(ab_data$variant=='A',0,1))


model = stan_model(model_code = stan_code)

fit=sampling(model, data = data, pars = c('Intercept','variantB'))

              mean se_mean    sd     2.5%      25%      50%      75%    97.5% n_eff Rhat
Intercept     0.75    0.00  0.08     0.60     0.70     0.75     0.80     0.90  2376    1
variantB      0.24    0.00  0.13    -0.02     0.15     0.24     0.32     0.50  2124    1
lp__      -1735.62    1.97 46.89 -1826.75 -1767.32 -1736.25 -1704.76 -1643.39   567    1

# brms model

brms_model$fit


                                  mean se_mean    sd     2.5%      25%      50%      75%    97.5% n_eff Rhat
b_Intercept                       0.77    0.00  0.09     0.59     0.70     0.77     0.83     0.96  2131 1.00
b_variantB                        0.13    0.00  0.14    -0.14     0.04     0.13     0.23     0.41  2386 1.00

It is entirely possible that I have a) misunderstood the documentation for brms default priors/prior_summary, and/or b) have not properly coded up the model I want in Stan. Which of these happens to be the case?

andrjohns · July 30, 2020, 7:24am

I think this is because brms mean-centers the input data prior to analysis, and so the prior on the intercept is then a prior on the centered intercept. In contrast, your Stan code is then a prior on the non-centered intercept, so you’re implying two different models.

martinmodrak · July 30, 2020, 10:39am

Just to add - you can use make_standata from brms to see the data that is actually passed to Stan (and hunt for differences to what you pass).

But yes, this is a good approach to get better understanding of what is going on under the hood, best of luck!

Guido_Biele · July 30, 2020, 1:02pm

Brms implements mean centering in the Stan model.

PhDemetri · July 30, 2020, 1:37pm

So the binary indicator is mean centered as well?

Guido_Biele · July 30, 2020, 1:51pm

My understanding is that brms uses model.matrix and other functions to generate the design matrix for fixed effects X.
This design matrix is part of the standata list submitted to the Stan model. In the Stan model each column in X will be mean centered.

That should be easy to see in the transformed data block in the Stan model. I recommend that you check a Stan model to verify that my recollection is correct.

PhDemetri · July 30, 2020, 2:32pm

After inspecting the Stan code, it does seem that brms centers the covariates.

Guido_Biele · July 30, 2020, 3:12pm

Yes, that’s what I thought. If you want to compare brms results with e.g. rstanarm or other approaches, you can always center your covariates in the data that is submitted to brms and e.g. rstanarm. brms will still center, but it won’t have an effect.

It’s a bit harder with ordinal or nominal covariates: Then you’d have to first generate a design matrix with model.matrix,.then center all columns, and finally use the centered design matrix as data.

Topic		Replies	Views
Trend analysis: Full stan model does not match resutls from same model in brms Modeling rstan , techniques	5	501	October 1, 2020
Why do Stan model and rstanarm/brms model with same priors have different posterior? Modeling cmdstanr	2	546	January 14, 2023
Translating/comparing brms parameter estimates to rstan estimates brms rstan	9	1025	May 12, 2020
Hierarchical random intercept model - Stan vs brms Modeling specification , brms	5	129	May 6, 2025
Stan and brms output comparison brms fitting-issues , specification	2	493	November 11, 2021

Reproducing brms results in rstan yields different estimates

Related topics