Logistic regression with two random intercept terms

Dear community,

I am new to the rstanarm package and I have a hard time to recover the true parameters of the synthetic (toy) model below. Is the model actually identifiable? If so how can I obtain posteriors for icpt and id1_values and id2_values?

library(tidyverse)
library(lme4)
library(rstanarm)
set.seed(42)
# Create synthetic data -------------------------------------------------------
ef1 <- rnorm(1000)
ef2 <- rnorm(1000)
id1 <- sample(c(1L,2L,3L), 1000, replace=TRUE)
id2 <- sample(c(1L,2L,3L), 1000, replace=TRUE)
id1_values <- rnorm(3)
id2_values <- rnorm(3)
icpt <- .4711
p1 <- 2.3
p2 <- 1.7
z = icpt + p1*ef1 + p2*ef2 + id1_values[id1] + id2_values[id2]
pr = 1/(1+exp(-z))
y = rbinom(1000,1,pr)
df = tibble(y=y,ef1=ef1,ef2=ef2, id1=id1, id2=id2)
# Fit and simulate -------------------------------------------------------------
rslt<-glmer(formula=y~ef1+ef2+(1|id1)+(1|id2),data=df,family=binomial)
rslt_stan <- stan_glmer(formula=y~ef1+ef2+(1|id1)+(1|id2),data=df,family=binomial,
                        chains = 1, cores = 1, seed = 12345, iter = 2000)

I recall that glmer and stan_glmer estimate varying intercepts in general by estimating a single global intercept and individual deviations from that intercept for each possible value of id. But what in case of two random intercepts terms? Or should I actually go and use maybe a formula term (1|id1:id2)?

I don’t see the results so I’ll have to run this and take a look at the parameter recovery when I have a chance (not at my computer at the moment). But in general two varying intercepts is ok. There will still only be one global intercept estimated, but two different group-specific deviations from that global intercept.

Also, I definitely recommend running more than one Markov chain. No magic number but 4 is a somewhat safe default.

Thanks, I would appreciate this!

When I run

rslt_stan <-
  stan_glmer(
    formula = y ~ ef1 + ef2 + (1 | id1) + (1 | id2),
    data = df,
    family = binomial,
    iter = 2000,
    chains = 4,
    cores = 4
  )

on my laptop with rstanarm v2.17.4, I get results very comparable to the results from glmer:

round(cbind(lmer = fixef(rslt), stan = fixef(rslt_stan)), digits = 2)

            lmer stan
(Intercept) 0.06 0.08
ef1         2.65 2.65
ef2         1.74 1.74
round(cbind(
  lmer = unlist(ranef(rslt)), 
  stan = unlist(ranef(rslt_stan))
), digits = 2)

                   lmer  stan
id1.(Intercept)1  -0.39 -0.37
id1.(Intercept)2   0.45  0.45
id1.(Intercept)3  -0.06 -0.05
id2.(Intercept)1   0.13  0.13
id2.(Intercept)2   0.45  0.45
id2.(Intercept)3  -0.58 -0.57

Those are just point estimates. Of course with rstanarm you also get the whole posterior.

1 Like

Thanks Jonah. And can you actually “recover” the true values stored in id1_values, id2_values, icpt?

library("bayesplot")
color_scheme_set("brightblue")

draws <- as.matrix(rslt_stan, pars = c("(Intercept)", "ef1", "ef2"))
id1_draws <- as.matrix(rslt_stan, regex_pars = " id1:")
id2_draws <- as.matrix(rslt_stan, regex_pars = " id2:")
log_sigmas <- log(as.matrix(rslt_stan, regex_pars = "Sigma"))


mcmc_recover_hist(draws, true = c(icpt, p1, p2))
mcmc_recover_hist(id1_draws, true = id1_values)
mcmc_recover_hist(id2_draws, true = id2_values)
mcmc_recover_hist(log_sigmas, true = log(c(1,1)))




Hi Jonah,

at first the width of the distributions seem very big too me. In fact this is also what I observed and I was rather surprised that this is the case for such a “simple” model with 1000 data points.

Additionally, if I run the following on my Windows machine

rslt_stan <- stan_glmer(formula=y~ef1+ef2+(1|id1)+(1|id2),data=df,family=binomial,
                        chains = 4, cores = 2, seed = 12345, iter = 2000)

and then execute

round(cbind(
  lmer = unlist(ranef(rslt)), 
  stan = unlist(ranef(rslt_stan))
), digits = 2)

I get

                  lmer  stan
id1.(Intercept)1 -0.20 -0.18
id1.(Intercept)2  0.03  0.02
id1.(Intercept)3  0.17  0.14
id2.(Intercept)1 -0.86 -0.84
id2.(Intercept)2  1.54  1.56
id2.(Intercept)3 -0.68 -0.66

and finally running

round(cbind(lmer = fixef(rslt), stan = fixef(rslt_stan)), digits = 2)

I get

             lmer  stan
(Intercept) -0.31 -0.33
ef1          2.35  2.34
ef2          1.84  1.84

However inspecting the values for icpt, id1_values and id2_values I get

c(id1_values,id2_values)
[1] -0.6856617 -0.7927145 -0.4070042 -1.1486706  1.1157605 -0.8794568

as well as icpt=0.4711. And now I am not sure how to recover the random intercepts from the stan or lmer point estimates. I guess it must be a linear combination (if it is possible at all).

I just realised that there is a continuum of possibilities to get values for the group specific intercepts, since you can always shift the global intercept and then adjust the relativ difference of the group specific intercepts. In other words, the only thing I think you can recover is the sums `icpt+id1_values+id2_values´. Simpliyfing the problem I considered only one random intercept (whith a slightly adjusted toy model of course):

library(tidyverse)
library(lme4)
library(rstanarm)
set.seed(42)
# Create synthetic data -------------------------------------------------------
ef1 <- rnorm(1000)
ef2 <- rnorm(1000)
id1 <- sample(c(1L,2L,3L), 1000, replace=TRUE)
id1_values <- rnorm(3)
icpt <- .4711
p1 <- 2.3
p2 <- 1.7
z = icpt + p1*ef1 + p2*ef2 + id1_values[id1] 
pr = 1/(1+exp(-z))
y = rbinom(1000,1,pr)
df = tibble(y=y,ef1=ef1,ef2=ef2, id1=id1)
# Fit and simulate -------------------------------------------------------------
rslt<-glmer(formula=y~ef1+ef2+(1|id1),data=df,family=binomial)
rslt_stan <- stan_glmer(formula=y~ef1+ef2+(1|id1),data=df,family=binomial,
                        chains = 4, cores = 4, seed = 12345, iter = 2000, adapt_delta=.99)
mcmc_pairs(as.array(rslt_stan), pars=c("(Intercept)",sprintf("b[(Intercept) id1:%d]", 1:3)))

Is this a continuous non-identifiability in the sense of http://mc-stan.org/users/documentation/case-studies/identifying_mixture_models.html ?

Yeah In this particular example there’s not really anything to distinguish the two varying intercept terms. But there are plenty of cases when it’s possible, e.g. if one or both are also correlated with varying slopes.

For future reference I cite the relevant section of the Stan manual (w.r.t. v2.17.0):

  1. Problematic Posteriors

There is in particular a section on redundant intercepts.

1 Like