Optimizing generalized linear mixed effects model with a logit link

Hi all,

I have a binomial generalized linear mixed effects model that I have written in Stan, but it is incredibly slow compared to rstanarm. I would like to write everything in Stan proper, but I need some help optimizing the model. I suspect the loops are killing performance, but I am not sure how to fix that. Here is reproducible code. First, the data and the rstanarm fit.

library(lme4)
library(rstan)
library(rstanarm)

hdp <- read.csv("https://stats.idre.ucla.edu/stat/data/hdp.csv")
hdp <- within(hdp, {
  Married <- factor(Married, levels = 0:1, labels = c("no", "yes"))
  DID <- factor(DID)
  HID <- factor(HID)
})

m <- glmer(remission ~ IL6 + CRP + CancerStage + LengthofStay + Experience +
             (1 | DID), data = hdp, family = binomial, control = glmerControl(optimizer = "bobyqa"),
           nAGQ = 10)

# print the mod results without correlations among fixed effects
print(m, corr = FALSE)


m_stan = stan_glmer(remission ~ IL6 + CRP + CancerStage + LengthofStay + Experience +
                      (1 | DID), family = binomial, data = hdp)

Now, here is the Stan code that I have written:

binomCode = 'data{
	int N; // number of data points
	int remission[N]; // result
	int NF; // number of columns in the design matrix
	int DID[N]; // group effect
	int N_DID; // number of groups
	matrix [N,NF] Xmat; // design matrix
}

parameters{
	vector[NF] beta; //vector of betas
	real vary_DID_index[N_DID];
	real<lower=0,upper=100> sigma_DID; // bounds needed for the uniform(0,100) prior
}

model{
	vector[N] vary;
	vector[N] lm;

	// Priors
	beta ~ normal( 0 , 100 );
	sigma_DID ~ uniform( 0 , 100 );

	// Varying effects
	for ( j in 1:N_DID ) vary_DID_index[j] ~ normal( 0 , sigma_DID );

	// Fixed effects
	for ( i in 1:N ) {
	  vary[i] = vary_DID_index[DID[i]];
	  lm[i] = vary[i] + Xmat[i]*beta;
	  remission[i] ~ bernoulli_logit(lm[i]);
	}

}'

# now without the crutch
Xmat = model.matrix(m)

## MCMC settings
ni <- 5000
nt <- 1
nb <- 2000
nc <- 3

remission = as.numeric(hdp$remission)
N = nrow(Xmat)
NF = ncol(Xmat)
DID = as.numeric(hdp$DID)
N_DID = length(levels(hdp$DID))

data.list = list(N = N, remission = remission, NF = NF, DID = DID, N_DID = N_DID,Xmat = Xmat)

m2_stan = stan("binomCode",data=data.list,chains = nc,thin = nt,iter = ni, warmup = nb,
                       seed = 1)
print(m)
print(m_stan)
print(m2_stan, pars=c("beta[1]","beta[2]","beta[3]","beta[4]","beta[5]","beta[6]","beta[7]","beta[8]"))

I get the same results, but my code is way too slow. Any help would be appreciated.

Hi Wade,

Four things come to mind to speed up your model:

  1. Get rid of loops
  2. Use non-centered parameterization of the varying intercept
  3. Use more reasonable priors
  4. Use QR decomposition of the mean-centered predictor matrix

Something like this:

data{
  int N; // number of data points
  int remission[N]; // result
  int NF; // number of columns in the design matrix
  int DID[N]; // group effect
  int N_DID; // number of groups
  matrix [N,NF] Xmat; // design matrix
}
transformed data {
  matrix[N, NF] X_centered;
  matrix[N, NF] Q;
  matrix[NF, NF] R;
  matrix[NF, NF] R_inv;

  for (nf in 1:NF)
    X_centered[, nf] = Xmat[, nf] - mean(Xmat[, nf]);

  // Compute, thin, and then scale QR decomposition
  Q = qr_thin_Q(X_centered) * sqrt(N - 1);
  R = qr_thin_R(X_centered) / sqrt(N - 1);
  R_inv = inverse(R);
}
parameters{
  vector[NF] beta_tilde; //vector of betas in Q-space
  real alpha_0;
  vector[N_DID] alpha_tilde;
  real<lower=0> sigma_DID;
}
transformed parameters{
  vector[N_DID] vary_DID_index = sigma_DID*alpha_tilde; // varying intercept - non-centered
}
model{
  // Priors
  beta_tilde ~ normal(0, 2.5); // weakly-informative prior
  alpha_0 ~ normal(-0.8, 2.5); // informative prior on the intercept: roughly sample mean of remission on logit scale
  alpha_tilde ~ std_normal(); // standard normal for non-centered group
  sigma_DID ~ exponential(1); // more reasonable prior for group sd
  
  // Likelihood
  remission ~ bernoulli_logit(alpha_0 + vary_DID_index[DID] + Q*beta_tilde);
  
}
generated quantities{
  vector[NF] beta = R_inv*beta_tilde; // transform back to "normal" betas
}
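
To fit this version you can reuse your data.list. Here is a minimal sketch; binomQRCode is just my placeholder name for a string holding the program above, and m3_stan for the fit object:

# minimal fitting sketch, reusing the data list from the original post
# (binomQRCode is assumed to hold the Stan program above as a string;
#  m3_stan is just a placeholder object name)
library(rstan)
options(mc.cores = parallel::detectCores()) # run chains in parallel

m3_stan = stan(model_code = binomQRCode, data = data.list,
               chains = nc, thin = nt, iter = ni, warmup = nb, seed = 1)

# betas on the original predictor scale come from generated quantities
print(m3_stan, pars = "beta")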

Edit: Running the model on my old laptop with your MCMC settings took 715, 733, and 1031 seconds for the three chains, respectively. These are the results for beta:

3 chains, each with iter=5000; warmup=2000; thin=1; 
post-warmup draws per chain=3000, total post-warmup draws=9000.

         mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
beta[1]  0.04    0.00 0.01  0.02  0.03  0.04  0.04  0.06   655 1.01
beta[2] -0.48    0.00 0.09 -0.66 -0.54 -0.48 -0.41 -0.29  8174 1.00
beta[3] -0.25    0.00 0.10 -0.45 -0.32 -0.25 -0.19 -0.06  8667 1.00
beta[4]  0.14    0.00 0.02  0.11  0.13  0.14  0.15  0.18  5245 1.00
beta[5] -0.02    0.00 0.02 -0.05 -0.03 -0.02 -0.01  0.01  6189 1.00
beta[6] -0.24    0.00 0.02 -0.27 -0.25 -0.24 -0.23 -0.21  7835 1.00
beta[7] -0.07    0.00 0.03 -0.13 -0.09 -0.07 -0.05 -0.01  5023 1.00
beta[8]  1.98    0.02 0.47  1.05  1.66  1.98  2.31  2.89   648 1.01

Edit 2: I have corrected the generated quantities block above to use R_inv*beta_tilde (see the exchange below). The beta summaries displayed above were produced with the old, incorrect block, so they are wrong.

The suggestions allowed the code to run in a quarter of the time it took before. I appreciate the helpful hints.

However, the recovered parameters don’t match what I get from stan_glmer or the code I had originally. Do I need to rescale the parameters?

You are right, there is a mistake in the code that I posted. It should read vector[NF] beta = R_inv*beta_tilde; in the generated quantities block. Sorry about that!
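
For clarity, the corrected block in full:

generated quantities{
  vector[NF] beta = R_inv*beta_tilde; // transform back to betas on the original predictor scale
}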