Correlated posteriors in a multilevel binomial regression

Marina_Golivets · July 5, 2018, 2:02pm

Hello,

I am trying to analyze a dataset with predictors collected at two hierarchical levels. The higher level is determined by a plot geographical location and the lower level is determined by environmental heterogeneity within a plot so that the response variable is measured for each unique environmental condition within a plot. In most cases, plots were homogeneous and thus are represented by a single data point, whereas about 30% of plots contained 2-4 conditions and thus are represented by 2-4 data points.

I’ve been trying to fit a binomial regression with these data but I run into a problem that posterior distributions of several predictors are highly correlated. I suspect that the problem might be a small number of observations per group (i.e. number conditions per plot), though I am not completely sure. I get the same problem with simulated data and I can’t recover the parameters:

a <- rnorm(1000, 1, 3)
b <- rnorm(1000, 7, 5)
c <- rnorm( 1000, 0, 2)
d <- rnorm( 1300, 0, 1)

group <- sample(1:1000, 1300, replace = TRUE)
data <- data.frame(a = a, b = b, c = c, group = 1:1000) %>% right_join(data.frame(d = d, group = group))
x1 <- 7
x2 <- -2

data$y <- 5 + x1data$a + x2data$b + data$c + data$d
data$pr <- round( exp(data$y) / (1 + exp(data$y)), 4)
data$y <- rbinom(1300, 1, data$pr)

data$obs <- 1:1300

data[, c(“a”, “b”)] <- data[, c(“a”, “b”)] %>% mutate_all(funs(as.numeric(scale(.))))

m.sim <- brm(y ~ a + b + (1|group) + (1|obs), data = data, family = bernoulli(),
prior = c(prior(“normal(0,5)”, class = “b”),
prior(“normal(0,5)”, class = “sd”)),
chains = 1, cores = 1, thin = 20, iter = 10000, warmup = 2000,
control = list(adapt_delta = .9, max_treedepth = 12))

pairs(m.sim)
Pairs.pdf (172.7 KB)

print(m.sim)

Family: bernoulli
Links: mu = logit
Formula: y ~ a + b + (1 | group) + (1 | obs)
Data: data (Number of observations: 1300)
Samples: 1 chains, each with iter = 10000; warmup = 2000; thin = 20;
total post-warmup samples = 400

Group-Level Effects:
~group (Number of levels: 719)
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept) 0.53 0.37 0.05 1.33 223 1.00

~obs (Number of levels: 1300)
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept) 0.64 0.45 0.02 1.65 400 1.00

Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept -1.99 0.33 -2.63 -1.39 400 1.00
a 14.92 1.83 11.78 18.77 400 1.00
b -7.03 0.88 -8.99 -5.49 400 1.00

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
is a crude measure of effective sample size, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

I’d really appreciate any advice on how to proceed with this problem.

Marina

Bob_Carpenter · July 17, 2018, 3:55pm

I’m going to tag this one “brms” and hopefully @paul.buerkner will see it.

paul.buerkner · July 17, 2018, 6:54pm

Observation level residuals (the (1|obs) thing) are not identified if the response is dichotomous. I suggest removing the (1|obs) term in bernoulli models.

Bob_Carpenter · July 19, 2018, 12:01am

Thanks—these tags are like magic. The problem we’re facing is way too many threads for any one person to follow.

Topic		Replies	Views
Very large credible intervals for odds ratio for logistic regression in brms Modeling fitting-issues	3	2490	August 22, 2019
Multivariate Multilevel model (some conceptual questions) brms	1	373	August 26, 2022
Question re brms: conditional_effects General brms	4	692	July 4, 2022
Plotting custom predictor values for a multilevel model with a categorical variable brms brms	2	471	February 16, 2021
Colinearity issues arise in multilevel multivariate models (but not in submodels) brms	13	1829	March 15, 2019

Correlated posteriors in a multilevel binomial regression

Related Topics