Apologies for the long data strings and code. I would just like people to be able to reproduce my analysis.
I am having trouble getting what I consider to be good estimates of the parameters in a mixed-effects model run in stan.
df <- data.frame(id = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14, 14, 14, 15, 15, 15, 15, 15, 15, 15, 16, 16, 16, 16, 16, 16, 16, 17, 17, 17, 17, 17, 17, 17, 18, 18, 18, 18, 18, 18, 18, 19, 19, 19, 19, 19, 19, 19, 20, 20, 20, 20, 20, 20, 20, 21, 21, 21, 21, 21, 21, 21, 22, 22, 22, 22, 22, 22, 22, 23, 23, 23, 23, 23, 23, 23, 24, 24, 24, 24, 24, 24, 24, 25, 25, 25, 25, 25, 25, 25, 26, 26, 26, 26, 26, 26, 26, 27, 27, 27, 27, 27, 27, 27, 28, 28, 28, 28, 28, 28, 28, 29, 29, 29, 29, 29, 29, 29, 30, 30, 30, 30, 30, 30, 30, 31, 31, 31, 31, 31, 31, 31, 32, 32, 32, 32, 32, 32, 32, 33, 33, 33, 33, 33, 33, 33, 34, 34, 34, 34, 34, 34, 34, 35, 35, 35, 35, 35, 35, 35, 36, 36, 36, 36, 36, 36, 36, 37, 37, 37, 37, 37, 37, 38, 38, 39, 39, 39, 39, 39, 39, 39, 40, 40, 40, 40, 40, 40, 40, 41, 41, 41, 41, 41, 41, 41, 42, 42, 42, 42, 42, 42, 42, 43, 43, 44, 44, 44, 44, 44, 44, 44, 45, 45, 45, 45, 45, 45, 45, 46, 46, 46, 46, 46, 46, 46),
day = c(1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 6, 7, 1, 2, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7),
score = c(5, 4.05, 2.22, 1.63, 2.53, 0.63, 1, 4.84, 0.79, 1.58, 2.16, 3.68, 1.34, 3.33, 0.6, 0.79, 0.37, 0.58, 0.37, 0.53, 0.26, 0.16, 0.58, 0.63, 1.63, 1, 0.58, 0.16, 3.32, 2.26, 2.11, 2.47, 1.28, 2.78, 2.7, 3.26, 0.68, 0.84, 1.95, 0.26, 0.79, 1.89, 1.74, 2, 2.58, 2.16, 2.74, 2.63, 2.42, 4.84, 6.26, 6.26, 5.58, 6.37, 1.26, 1, 2.58, 0.95, 0.37, 0.89, 1.11, 0.89, 0.74, 1.06, 4.05, 5, 7.56, 7.11, 6.26, 4.32, 3.11, 2.32, 2.63, 3, 2.53, 2.74, 1.37, 0.47, 1.68, 1.42, 0.42, 0.42, 0.37, 0.47, 2.63, 2.53, 2.21, 1.74, 1.05, 0.89, 0.74, 3.53, 3.84, 2.47, 0.89, 0.63, 0.68, 0.63, 2.89, 2.05, 1.47, 1.32, 1.47, 2.11, 0.95, 3.42, 1.95, 2.21, 1.74, 2.5, 2, 0.53, 2.53, 3.16, 2.16, 1.16, 2.32, 3.11, 2.42, 0.58, 1.47, 1.42, 0.89, 0.37, 1, 0.32, 1.11, 0.79, 1.05, 1.42, 0.74, 0.63, 1.95, 3.47, 0.95, 3.47, 2.21, 2.42, 2.16, 2.05, 4.21, 2.58, 0.79, 0.63, 0.89, 0.95, 1.11, 0, 0.53, 1.05, 1.05, 1.05, 0.84, 1.16, 4.39, 2.58, 1.63, 2.63, 4.11, 3.74, 2.67, 4.32, 2.79, 5.11, 3.42, 2.63, 2.53, 2.26, 5.47, 4.74, 4.63, 4.84, 4.16, 3.32, 4.05, 2.11, 1.74, 1, 1.16, 0.68, 0.84, 0.89, 0.37, 1.05, 1.89, 1.58, 2.26, 2.11, 3.21, 1.16, 3.84, 3.22, 2.33, 2.16, 3.05, 3.84, 2.47, 3.79, 4.68, 1.74, 1.58, 2.21, 1.58, 1.53, 2.84, 1.26, 0.95, 2, 1.42, 2.32, 0.11, 0, 0.58, 0, 0, 0, 0.47, 6.05, 4.53, 2.79, 4, 1.84, 2.68, 0.37, 1.68, 1.63, 1.11, 1.42, 2.21, 2, 1.56, 1.21, 4.37, 4.63, 4.61, 2.79, 1.89, 0.94, 3.95, 4.47, 3, 4, 4.89, 5.37, 5.84, 0.26, 1.11, 0.26, 0.16, 0.11, 0.21, 0.05, 5.24, 5.63, 6.74, 5.74, 5.47, 2.68, 1.67, 0.58, 0.11, 0.84, 0.21, 0.5, 0.58, 0.58, 1, 2.32, 2.42, 4.89, 1.37, 1.95, 1.21, 0.95, 4.37, 6.21, 6.74, 6.26, 4.95, 4.79, 3.05, 2.63, 3.79, 1.68, 1.79, 3.37, 2.58, 1.95, 3.27, 5.11, 1.21, 3.68, 2.84, 0.89, 0.94, 0.11, 0.11, 2.68, 1.26, 1.37, 0.84, 0.63, 3.37, 2.11, 2.16, 2.68, 3.37, 3.89, 3.42, 2, 1.63))
In this analysis the goal is to model the estimated starting score for 46 participants and their rate of change in score per day of a 7-day trial.
Now the approach I took I adapted from techniques explained in a brilliant series of lectures on stan for Psychologists by Mike Lawrence, specifically this one (https://www.youtube.com/watch?v=woho42axipg&list=PLu77iLvsj_GNmWqDdX-26kZ56dCZqkzBO&index=21). In these models you use the design matrix to model lower-level, within-subjects coefficients and higher-order group coefficients, like so
# matrix of within-subject predictors
W <- model.matrix(score ~ day, df)
Then you specify a simple random-slopes mixed-effects for repeated measures regression, with day of the trial as the sole predictor (range 1-7). We also model the correlation between intercept and rate of change coefficients. Here is the code for the model itself and to run the model in rstan
.
# model string
ms_within <- "
data {
int nP; // number of participants
int nY; // the number of observations
int nW; // the number of within-subject predictors
vector[nY] y; // vector of scores
matrix[nY, nW] W; // matrix of within-subjects predictors
int S[nY]; // integer value of subject associated with each y outcome
}
transformed data {
vector[nY] y_scaled;
y_scaled = (y - mean(y)) / sd(y); // standardising via z-transform
}
parameters {
vector[nW] scaled_coef_means; //scaled coefficients for group means
vector<lower=0>[nW] scaled_coef_sds;
corr_matrix[nW] corr_mat; // correlation among coefficients
matrix[nP,nW] scaled_subj_coefs; // scaled subject coefficients
real<lower=0> noise; // noise parameter
}
model {
// weakly informed priors
scaled_coef_means ~ normal(0,1) ; // flat gaussian prior on mean of subj coefs
scaled_coef_sds ~ weibull(2,1) ; // flat prior on sd of subj coefs
corr_mat ~ lkj_corr(1) ; // flat prior on correlations
noise ~ weibull(2,1) ; //prior on measuement noise
// likelihood for lower-tier in the hierarchy. estimates a matrix of coefficients with as many rows as participants and with as many within-subject coefficients as there are subjects in the model (intercept and effect) for each
for(p in 1:nP){
scaled_subj_coefs[p,] ~ multi_normal(
scaled_coef_means
, quad_form_diag(corr_mat, scaled_coef_sds)
) ;
}
// now for the higher tier in the hierarchy
for (i in 1:nP) {
y_scaled[i] ~ normal(
sum(scaled_subj_coefs[S[i]].*W[i,]),
noise
);
}
}
generated quantities{
//declare variables
vector[nW] coef_means; // transform mean coefficients back to original y scale
vector[nW] coef_sds; // transform sds back to original y scale
// rescale everything
coef_means = scaled_coef_means*sd(y); // for mean coefs multiply by sd
coef_means[1] = coef_means[1] + mean(y); // for intercept also add mean
coef_sds = scaled_coef_sds*sd(y); // for group sds multiply by sd
}
"
Now we run the analyses and calculate the lower and upper 95% HPDI and median estimate for the two parameters we are interested in: overall intercept and rate of change in score per day.
post <- stan(model_code = ms_within,
data = list(nP = max(df$id),
nY = nrow(df),
nW = ncol(W),
y = df$score,
W = W,
S = df$id),
seed = 1,
cores = 3,
chains = 1,
warmup = 1e3,
iter = 2e3)
# extract the estimates of the higher-level effects, intercept and effect
pX <- as.data.frame(as.array(post)[,,grep("^coef_means", dimnames(post)$parameters, perl=T)])
# give the effects sensible names
names(pX) <- c("intercept", "rateOfChange")
# hpdi function
HPDIFunct <- function (vector, HPDI = 0.95) {
sortVec <- sort(vector)
ninetyFiveVec <- ceiling(HPDI*length(sortVec))
fiveVec <- length(sortVec) - length(ninetyFiveVec)
diffVec <- sapply(1:fiveVec, function (i) sortVec[i + ninetyFiveVec] - sortVec[i])
minVal <- sortVec[which.min(diffVec)]
maxVal <- sortVec[which.min(diffVec) + ninetyFiveVec]
median <- median(sortVec)
return(list(est = round(c(minVal, median, maxVal),2)))
}
# lower 95% HPDIx, median, and upper 95% HPDI
summaryDF <- as.data.frame(t(unlist(HPDIFunct(pX$intercept))))
summaryDF[2,] <- t(unlist(HPDIFunct(pX$rateOfChange)))
row.names(summaryDF) <- c("Intercept", "day")
names(summaryDF) <- c("lowerStan", "medianStan", "upperStan")
Now we run an alternative bayesian version of the same model in brms
and an nhst version of the same model in the R-package nlme
# brms model
library(brms)
brmsMod <- brm(formula = score ~ day + (day|id),
data = df,
family = gaussian)
# bind brms estimates to the stan estimates
summaryDF <- cbind(summaryDF,
round(summary(brmsMod)$fixed[,c("l-95% CI", "Estimate", "u-95% CI")],2))
# nhst model
library(nlme)
nhstMod <- lme(fixed = score ~ day,
random = ~day|id,
data = df)
cbind(summaryDF, round(as.data.frame(intervals(nhstMod)$fixed),2))
Now here is the output form those three versions of the same model side by side
# lowerStan medianStan upperStan l-95% CI Estimate u-95% CI lower est. upper
# Intercept 1.11 2.29 3.43 2.23 2.78 3.32 2.26 2.79 3.31
# day -0.39 -0.16 0.13 -0.22 -0.14 -0.05 -0.22 -0.14 -0.06
The left three columns are from the stan model, the middle three from the brms
model, the right three from the nhst model.
So my question is: why are the parameter estimates so much less certain in my stan model than the other two?
The stan code used by brms is much more complex than mine, but I had hoped that I would get close. The day
coefficient in particular is much broader and less certain in the stan model than the other two.
What am I doing wrong?