I know I shouldn’t treat this forum like a free course in Stan coding, but in order to improve my modelling I am trying to understand some of the Stan code generated by brms::brm().
I have a very simple dataset: 20 flips each of two biased coins. Here is the toy data.
set.seed(1234)
df <- data.frame(coins = factor(rep(letters[1:2], each = 20)),
                 flips = c(rbinom(n = 20, size = 1, prob = 0.2),
                           rbinom(n = 20, size = 1, prob = 0.7)))
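Just as a quick sanity check on the simulation (the exact values depend on the seed, but the observed proportions should land somewhere near the true biases of 0.2 and 0.7):
# observed proportion of heads for each coin
aggregate(flips ~ coins, data = df, FUN = mean)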
I want to estimate the difference in bias between the two coins, so I run a Bernoulli regression.
brmMod <- brms::brm(formula = flips ~ coins,
                    data = df,
                    family = brms::bernoulli(link = "logit"))
brmMod
# output
# Population-Level Effects:
# Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
# Intercept -1.86 0.64 -3.21 -0.72 1.00 2316 2754
# coinsb 2.75 0.82 1.21 4.45 1.00 2526 2378
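Back on the probability scale these posterior means look about right for the simulated biases (plogis() is just the inverse logit, so this is only a rough point-estimate check):
# inverse logit of the posterior means
plogis(-1.86)          # ~0.13, estimated bias of coin a
plogis(-1.86 + 2.75)   # ~0.71, estimated bias of coin b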
Now, when I look at the underlying Stan code generated by brm(), I get the following:
brms::stancode(brmMod)
functions {
}
data {
  int<lower=1> N;  // total number of observations
  int Y[N];  // response variable
  int<lower=1> K;  // number of population-level effects
  matrix[N, K] X;  // population-level design matrix
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
  int Kc = K - 1;
  matrix[N, Kc] Xc;  // centered version of X without an intercept
  vector[Kc] means_X;  // column means of X before centering
  for (i in 2:K) {
    means_X[i - 1] = mean(X[, i]);
    Xc[, i - 1] = X[, i] - means_X[i - 1];
  }
}
parameters {
  vector[Kc] b;  // population-level effects
  real Intercept;  // temporary intercept for centered predictors
}
transformed parameters {
}
model {
  // likelihood including all constants
  if (!prior_only) {
    target += bernoulli_logit_glm_lpmf(Y | Xc, Intercept, b);
  }
  // priors including all constants
  target += student_t_lpdf(Intercept | 3, 0, 2.5);
}
generated quantities {
  // actual population-level intercept
  real b_Intercept = Intercept - dot_product(means_X, b);
}
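If I read the transformed data block correctly, the single column of Xc for this model is just the coin-b dummy minus its column mean (0.5 here, since half the rows are coin b), and Intercept is the intercept on that centered scale. In R I think it amounts to something like this (my own paraphrase, not code that brms produces):
X <- model.matrix(flips ~ coins, data = df)
means_X <- colMeans(X[, -1, drop = FALSE])       # 0.5 for the coinsb dummy
Xc <- sweep(X[, -1, drop = FALSE], 2, means_X)   # centered dummy: -0.5 or 0.5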
Now, I think I understand most of this code, so I tried reproducing it directly with rstan, like so:
# create data
data <- list(N = nrow(df),
             Y = df$flips,
             K = 2,
             X = model.matrix(flips ~ coins, data = df))
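# (for reference: model.matrix() returns a 40 x 2 matrix here, an intercept
#  column of 1s plus a 0/1 dummy for coin b, which matches what the generated
#  Stan code expects X to look like)
str(data)  # quick check that N, Y, K and X have the expected shapes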
# create model string
ms <- "
data {
  int<lower=1> N;  // total number of observations
  int Y[N];  // response variable
  int<lower=1> K;  // number of population-level effects
  matrix[N, K] X;  // population-level design matrix
}
transformed data {
  int Kc = K - 1;
  matrix[N, Kc] Xc;  // centered version of X without an intercept
  vector[Kc] means_X;  // column means of X before centering
  for (i in 2:K) {
    means_X[i - 1] = mean(X[, i]);
    Xc[, i - 1] = X[, i] - means_X[i - 1];
  }
}
parameters {
  vector[Kc] b;  // population-level effects
  real Intercept;  // temporary intercept for centered predictors
}
model {
  // likelihood including all constants
  target += bernoulli_logit_glm_lpmf(Y | Xc, Intercept, b);
  // priors including all constants
  target += student_t_lpdf(Intercept | 3, 0, 2.5);
}
generated quantities {
  // actual population-level intercept
  real b_Intercept = Intercept - dot_product(means_X, b);
}
"
# fit the model
library(rstan)
stanMod_binom <- stan(model_code = ms,
                      data = data,
                      warmup = 1e3,
                      iter = 2e3,
                      chains = 4)
stanMod_binom
# output
# mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
# b[1] 2.74 0.02 0.83 1.22 2.18 2.70 3.28 4.52 2388 1
# Intercept -0.46 0.01 0.42 -1.31 -0.72 -0.45 -0.16 0.32 2598 1
# b_Intercept -1.83 0.01 0.66 -3.25 -2.23 -1.78 -1.37 -0.70 2090 1
# lp__ -23.66 0.02 1.03 -26.45 -24.05 -23.35 -22.93 -22.64 1759 1
As you can see, the two fits give similar estimates for the intercept and the b effect, except that the rstan output contains an extra term, Intercept, which does not appear in the brms summary. I don’t know why this is happening; I assume it has something to do with the way I specified the design matrix. Could someone help me out?
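For what it’s worth, the extra Intercept term does at least look internally consistent with the generated quantities line: plugging the posterior means into b_Intercept = Intercept - dot_product(means_X, b), with means_X = 0.5, gives roughly the reported value.
# rough check of the generated quantities line using the posterior means above
-0.46 - 0.5 * 2.74   # approx -1.83, close to the reported b_Intercept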