Linear and nonlinear model specifications give different results

Should a mathematically linear model yield the same results regardless of whether it is specified in brms as linear or nonlinear?

I tested a logistic regression model with a horseshoe prior. I tried the same mathematical model with two brms specifications: a linear formula and a nonlinear formula. In many cases the results are similar, but lots of predictors (30) with extreme values of par_ratio(10) yield clear differences in the posteriors.

# OS: Ubuntu 20.04.6
# brms version 2.23.0

library(tidyverse)
library(brms)
library(tidybayes)

n  <- 400          # samples
p  <- 5            # predictors
X  <- matrix(rnorm(n * p), nrow = n, ncol = p)
X  <- scale(X)     # standardize 

beta_true <- c(2.0, -1.5, 1.0, rep(0, p - 3))  # only first 3 signals nonzero
linpred   <- drop(X %*% beta_true)
prob      <- plogis(linpred)
y         <- rbinom(n, size = 1, prob = prob)

df <- data.frame(y = y, X)
names(df) <- c("y", paste0("x", 1:p))

#----------- Linear Model -------------

formula_linear <- bf(y ~ 1 + x1 + x2 + x3 + x4 + x5)

prior_linear <- prior(normal(0, 3), class = "Intercept") + 
  prior(horseshoe(par_ratio=10), class = "b")

fit_linear <- brm(
  formula = formula_linear,
  data    = df,
  family  = bernoulli(link = "logit"),
  prior   = prior_linear,
  chains  = 4,
  iter    = 2000,
  warmup  = 1000,
  seed    = 1
)

#----------- Nonlinear Model -------------

formula_nonlinear <- bf(
  y ~ b0 + b1 + b2 + b3 + b4 + b5, 
  nl = TRUE
) + 
  lf(b0 ~ 1) +
  lf(b1 ~ 0 + x1) + 
  lf(b2 ~ 0 + x2) + 
  lf(b3 ~ 0 + x3) + 
  lf(b4 ~ 0 + x4) +
  lf(b5 ~ 0 + x5)

prior_nonlinear <- prior(normal(0, 3), nlpar = "b0", coef = "Intercept") +
  prior(horseshoe(par_ratio=10), nlpar = "b1") + 
  prior(horseshoe(par_ratio=10), nlpar = "b2") + 
  prior(horseshoe(par_ratio=10), nlpar = "b3") + 
  prior(horseshoe(par_ratio=10), nlpar = "b4") +
  prior(horseshoe(par_ratio=10), nlpar = "b5")

fit_nonlinear <- brm(
  formula = formula_nonlinear,
  data    = df,
  family  = bernoulli(link = "logit"),
  prior   = prior_nonlinear,
  chains  = 4,
  iter    = 2000,
  warmup  = 1000,
  seed    = 1
)

Below are the marginal densities for the nonzero effects with 30 total predictors and par_ratio=10. It appears that the linear model specification is generally giving smaller parameter absolute values compared to the nonlinear specification.

I have also included an Rmd that illustrates the behavior and the outputs it produces.

horseshoe_prior.pdf (391.3 KB)

nonlinear.stan (50.5 KB)

linear.stan (2.9 KB)

horseshoe_prior.Rmd (7.2 KB)

Welcome to discourse. This is a good question.

The answer is that in the nonlinear specification the prior isn’t doing what you expect. It’s estimating a different regularized horseshoe (with different global parameters) for each of the nonlinear parameters–i.e. for all of the coefficeints. I think this is rarely or perhaps never a good idea. I can’t imagine it’s useful to try to estimate the global parameters of a regularized horseshoe applied to just a single coefficient. The linear specification does something altogether more sensible and presumably is what you intend–estimating the global parameters by pooling over the five coefficeints.

1 Like

@jsocolar Thank you for the response. I can see the large number of horseshoe global parameters in the nonlinear model’s stan code like you mentioned. Indeed, the nonlinear model prior is not doing what I thought it was!