# Optimizing generalized linear mixed effects model with a logit link

Hi all,

I having a binomial generalized linear mixed effects model that I have written in STAN, but it is incredibly slow compared to RSTANARM. I would like to write everything in STAN proper, but need some help optimizing the model. I think that the loops are killing me, but not sure how to correct. Here is reproducible code. First, the data and the RSTANARM code.

``````library(lme4)
library(rstan)
library(rstanarm)

hdp <- within(hdp, {
Married <- factor(Married, levels = 0:1, labels = c("no", "yes"))
DID <- factor(DID)
HID <- factor(HID)
})

m <- glmer(remission ~ IL6 + CRP + CancerStage + LengthofStay + Experience +
(1 | DID), data = hdp, family = binomial, control = glmerControl(optimizer = "bobyqa"),
nAGQ = 10)

# print the mod results without correlations among fixed effects
print(m, corr = FALSE)

m_stan= stan_glmer(remission ~ IL6 + CRP + CancerStage + LengthofStay + Experience +
(1 | DID),family=binomial,data=hdp)
``````

Now, here is the STAN code that I have written:

``````binomCode = 'data{
int N; // number of data points
int remission[N]; // result
int NF; // rows in matrix
int DID[N]; // group effect
int N_DID; // number of groups
matrix [N,NF] Xmat; // design matrix
}

parameters{
vector[NF] beta; //vector of betas
real vary_DID_index[N_DID];
real sigma_DID;
}

model{
vector[N] vary;
vector[N] lm;

// Priors
beta ~ normal( 0 , 100 );
sigma_DID ~ uniform( 0 , 100 );

// Varying effects
for ( j in 1:N_DID ) vary_DID_index[j] ~ normal( 0 , sigma_DID );

// Fixed effects
for ( i in 1:N ) {
vary[i] = vary_DID_index[DID[i]];
lm[i] = vary[i] + Xmat[i]*beta;
remission[i] ~ bernoulli_logit(lm[i]);
}

}'

# now without the crutch
Xmat = model.matrix(m)

## MCMC settings
ni <- 5000
nt <- 1
nb <- 2000
nc <- 3

remission = as.numeric(hdp\$remission)
N = nrow(Xmat)
NF = ncol(Xmat)
DID = as.numeric(hdp\$DID)
N_DID = length(levels(hdp\$DID))

data.list = list(N = N, remission = remission, NF = NF, DID = DID, N_DID = N_DID,Xmat = Xmat)

m2_stan = stan("binomCode",data=data.list,chains = nc,thin = nt,iter = ni, warmup = nb,
seed = 1)
print(m)
print(m_stan)
print(m2_stan, pars=c("beta","beta","beta","beta","beta","beta","beta","beta"))
``````

I get the same results, but my code is way too slow. Any help would be appreciated.

four things come to mind to speed up your model.

1. Get rid of loops
2. Use non-centered parameterization of the varying intercept
3. Use more reasonable priors
4. Use QR decomposition of the mean-centered predictor matrix

Something like this:

``````data{
int N; // number of data points
int remission[N]; // result
int NF; // rows in matrix
int DID[N]; // group effect
int N_DID; // number of groups
matrix [N,NF] Xmat; // design matrix
}
transformed data {
matrix[N, NF] X_centered;
matrix[N, NF] Q;
matrix[NF, NF] R;
matrix[NF, NF] R_inv;

for (nf in 1:NF)
X_centered[, nf] = Xmat[, nf] - mean(Xmat[, nf]);

// Compute, thin, and then scale QR decomposition
Q = qr_thin_Q(X_centered) * sqrt(N - 1);
R = qr_thin_R(X_centered) / sqrt(N - 1);
R_inv = inverse(R);
}
parameters{
vector[NF] beta_tilde; //vector of betas in Q-space
real alpha_0;
vector[N_DID] alpha_tilde;
real<lower=0> sigma_DID;
}
transformed parameters{
vector[N_DID] vary_DID_index = sigma_DID*alpha_tilde; // varying intercept - non-centered
}
model{
// Priors
beta_tilde ~ normal(0, 2.5); // weakly-informative prior
alpha_0 ~ normal(-0.8, 2.5); // informative prior on the intercept: roughly sample mean of remission on logit scale
alpha_tilde ~ std_normal(); // standard normal for non-centered group
sigma_DID ~ exponential(1); // more reasonable prior for group sd

// Likelihood
remission ~ bernoulli_logit(alpha_0 + vary_DID_index[DID] + Q*beta_tilde);

}
generated quantities{
vector[NF] beta = R_inv*beta_tilde; // transform back to "normal" betas
}
``````

Edit: Running the model on my old laptop using your options took 715, 733, and 1031 seconds for each chain respectively. These are the results for `beta`:

``````3 chains, each with iter=5000; warmup=2000; thin=1;
post-warmup draws per chain=3000, total post-warmup draws=9000.

mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
beta  0.04    0.00 0.01  0.02  0.03  0.04  0.04  0.06   655 1.01
beta -0.48    0.00 0.09 -0.66 -0.54 -0.48 -0.41 -0.29  8174 1.00
beta -0.25    0.00 0.10 -0.45 -0.32 -0.25 -0.19 -0.06  8667 1.00
beta  0.14    0.00 0.02  0.11  0.13  0.14  0.15  0.18  5245 1.00
beta -0.02    0.00 0.02 -0.05 -0.03 -0.02 -0.01  0.01  6189 1.00
beta -0.24    0.00 0.02 -0.27 -0.25 -0.24 -0.23 -0.21  7835 1.00
beta -0.07    0.00 0.03 -0.13 -0.09 -0.07 -0.05 -0.01  5023 1.00
beta  1.98    0.02 0.47  1.05  1.66  1.98  2.31  2.89   648 1.01
``````

Edit 2: I corrected the `generated quantities` block to `R_inv*beta_tilde` (see below). The displayed summaries for `beta` are wrong.

The suggestions allowed the code to run in a quarter of the time it took before. I appreciate the helpful hints.

However, the recovered parameters don’t match what I get from stan_glmer or the code I had originally. Do I need to rescale the parameters?

You are right, there is a mistake in the code that I posted. It should be `vector[NF] beta = R_inv*beta_tilde;` in the generated quantities block. Sorry for that!