Issues with attempting to draw random coefficients from log-normal distribution with brms

Hello,

We are having problems fitting a model with random coefficients drawn from a log-normal distribution. The model is given by

Outcome = Coefficient1 * Variable1 + Coefficient2 * Variable2 + epsilon,

where coefficients 1 and 2 are drawn from log-normal distributions and epsilon is the residual.
To test this with brms, we generate mock data with Coefficient1 ~logNormal(0,5), Coefficient2 ~ logNormal(0,7), Variable1 ~ Uniform[1,2], Variable2 ~ Uniform[5,6], and epsilon ~ Normal(0, 0.1).

We tried using syntax for non-linear models to estimate parameters for the coefficients’ underlying Normal distribution (which is equivalent to that of the log-normal). However, despite our specification matching our data generating process, our brm() call produces implausibly large values of sigma. With some specifications (e.g., including a non-zero mean for the random coefficients), the estimates also deviate from the true values by couple orders of magnitude. We also consistently get warnings, including: 1) many divergent transitions, 2) many transitions hitting max tree depth, 3) E-BFMI less than 0.2, and 4) parts of model not converging (some Rhats are > 1.05).

The issue seems to arise from using exp() with random coefficients. exp() with constant coefficients produces good estimates, and using the same syntax with coefficients drawn from normal distribution (so no exp()) also works fine.

The model exhibits the same issue with only one random coefficient and with and without intercepts. We also tried using stronger priors and increasing sample size and iterations, which did not get rid of the warnings.

library(brms)
library(cmdstanr)
library(tidyverse)


# Data generation ---------------------------------------------------------

#Set number of respondents and rounds per respondent
n_resp <- 200
n_round <- 5

fit_data <- tibble(respondent = 1:n_resp) %>%
  #Draw random coefficients for each respondent
  mutate(round = list(1:n_round),
         coef1 = exp(rnorm(n_resp, 0, 5)),
         coef2 = exp(rnorm(n_resp, 0, 7))) %>%
  unnest(round) %>%
  #Draw variables for each respondent-round
  mutate(var1 = runif(n_resp * n_round, 1,2),
         var2 = runif(n_resp * n_round, 5,6)) %>%
  #Calculate outcome variable with added noise
  mutate(outcome = coef1 * var1 + coef2 * var2 + rnorm(n_resp * n_round, 0,.1))


# Model -------------------------------------------------------------------

#Fit model
fit_rlnorm <- brm(bf(outcome ~ exp(b1) * var1 + exp(b2) *var2,
                     b1 ~ 0 + (1| respondent),
                     b2 ~ 0 + (1| respondent),
                     nl = TRUE),
                  data = fit_data,
                  backend = "cmdstanr",
                  chains = 1)

summary(fit_rlnorm)

The code and output of brm() with strong priors, a larger sample size (n_resp at 500 and
n_round at 50), and 6000 iterations is below:

fit_rlnorm_tight <- brm(bf(outcome ~ exp(b1) * var1 + exp(b2) *var2,
                           b1 ~ 0 + (1| respondent),
                           b2 ~ 0 + (1| respondent),
                           nl = TRUE),
                        data = fit_data,
                        prior = c(set_prior("normal(5,1)", nlpar = "b1", class = "sd"),
                                  set_prior("normal(7,1)", nlpar = "b2", class = "sd")),
                        backend = "cmdstanr",
                        chains = 4,
                        cores = 4,
                        iter = 6000)

Screenshot 2023-06-22 at 16.49.14
Screenshot 2023-06-22 at 16.49.06

  • Operating System: macOS Ventura 13.4
  • R Version: 4.2.2
  • brms Version: 2.19.0
  • CmdStanR Version: 0.5.3
  • CmdStan Version: 2.32.2