Should a mathematically linear model yield the same results regardless of whether it is specified in brms as linear or nonlinear?
I tested a logistic regression model with a horseshoe prior. I tried the same mathematical model with two brms specifications: a linear formula and a nonlinear formula. In many cases the results are similar, but lots of predictors (30) with extreme values of par_ratio(10) yield clear differences in the posteriors.
# OS: Ubuntu 20.04.6
# brms version 2.23.0
library(tidyverse)
library(brms)
library(tidybayes)
n <- 400 # samples
p <- 5 # predictors
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
X <- scale(X) # standardize
beta_true <- c(2.0, -1.5, 1.0, rep(0, p - 3)) # only first 3 signals nonzero
linpred <- drop(X %*% beta_true)
prob <- plogis(linpred)
y <- rbinom(n, size = 1, prob = prob)
df <- data.frame(y = y, X)
names(df) <- c("y", paste0("x", 1:p))
#----------- Linear Model -------------
formula_linear <- bf(y ~ 1 + x1 + x2 + x3 + x4 + x5)
prior_linear <- prior(normal(0, 3), class = "Intercept") +
prior(horseshoe(par_ratio=10), class = "b")
fit_linear <- brm(
formula = formula_linear,
data = df,
family = bernoulli(link = "logit"),
prior = prior_linear,
chains = 4,
iter = 2000,
warmup = 1000,
seed = 1
)
#----------- Nonlinear Model -------------
formula_nonlinear <- bf(
y ~ b0 + b1 + b2 + b3 + b4 + b5,
nl = TRUE
) +
lf(b0 ~ 1) +
lf(b1 ~ 0 + x1) +
lf(b2 ~ 0 + x2) +
lf(b3 ~ 0 + x3) +
lf(b4 ~ 0 + x4) +
lf(b5 ~ 0 + x5)
prior_nonlinear <- prior(normal(0, 3), nlpar = "b0", coef = "Intercept") +
prior(horseshoe(par_ratio=10), nlpar = "b1") +
prior(horseshoe(par_ratio=10), nlpar = "b2") +
prior(horseshoe(par_ratio=10), nlpar = "b3") +
prior(horseshoe(par_ratio=10), nlpar = "b4") +
prior(horseshoe(par_ratio=10), nlpar = "b5")
fit_nonlinear <- brm(
formula = formula_nonlinear,
data = df,
family = bernoulli(link = "logit"),
prior = prior_nonlinear,
chains = 4,
iter = 2000,
warmup = 1000,
seed = 1
)
Below are the marginal densities for the nonzero effects with 30 total predictors and par_ratio=10. It appears that the linear model specification is generally giving smaller parameter absolute values compared to the nonlinear specification.
I have also included an Rmd that illustrates the behavior and the outputs it produces.
horseshoe_prior.pdf (391.3 KB)
nonlinear.stan (50.5 KB)
linear.stan (2.9 KB)
horseshoe_prior.Rmd (7.2 KB)
