Beta prior in non-linear `brms`?

dholstius · May 10, 2024, 6:40pm

Goal and Question

Hi all! Hoping you can help with a simple (?) brms interface question.

Goal: predict a non-decreasing proportion p in a population sampled over time.
Example: the proportion of cars on the road that were manufactured after 1990. At 1990, p is by definition zero. After that, let’s assume some steady-state with pre-1990 cars surviving effectively forever, such that \theta = \lim_{t\to\infty} p, with \theta \lt 1.
Assume an exponential-decay model:
- p = \text{Pr}(Y = \text{post-1990} | t) = \theta \times (1 - \text{exp}(-\lambda \, t))
- … with 1990 represented by t = 0.
Question: how can I set a better prior on \theta, while using the non-linear functionality afforded by the brms package? Reprex below.

Simulation

Thanks to @jsocolar for the helpful pointer to the Estimating Non-Linear Models with brms vignette. With that, I got the code below working. It adequately recovers the simulated \theta and \lambda. But it puts a truncated \text{Normal}(0.5, 10) prior on \theta, with upper bound 1 and lower bound 0. Not ideal …

When I tried replacing:

prior(normal(0.5, 10), nlpar = "theta", lb = 0, ub = 1)

… with:

prior(beta_binomial(1, 1), nlpar = "theta")

… I got this error message:

Function 'beta_binomial_lpdf' is not implemented for distribution 'beta_binomial', use 'beta_binomial_lpmf' instead.

Is this a case of not-yet-implemented functionality in brms? Or am I missing something obvious to y’all? Perhaps a different prior is warranted? Or not?

Code

library(tidyverse)
library(brms)

#' Helper function
rbernoulli <- function (n, prob = 0.5) {
  sample(c(0L, 1L), size = n, replace = TRUE, prob = c(1 - prob, prob))
}

#' `p()` should yield zero for 1990, and approach `theta` after that
#' Let the "true" `theta` be 0.75, and the "true" `lambda` be 0.3
p <- function (since_1990, theta = 0.75, lambda = 0.3) {
  return(theta * (1 - exp(-lambda * since_1990)))
}

#' Simulated data
sim_data <-
  tibble(
    since_1990 = 5:20,  # observations start at 1995
    p = p(since_1990),
    Y = map(p, function (p) rbernoulli(100, p))) %>%
  unnest(cols = c(Y)) %>%
  glimpse()

# Exponential model for binary observations
exp_mod <-
  bf(
    Y ~ 0 + theta * (1 - exp(-lambda * since_1990)),
    nl = TRUE,
    theta + lambda ~ 1,
    family = bernoulli(link = "identity"))

#'
#' Q1: What's a better prior for `theta` that spans 0, 1?
#'
priors <-
  #prior(beta_binomial(1, 1), nlpar = "theta") + # this fails
  prior(normal(0.5, 10), nlpar = "theta", lb = 0, ub = 1) +
  prior(exponential(3), nlpar = "lambda", lb = 0)

fit <- brm(
  formula = exp_mod,
  prior = priors,
  data = sim_data)

#' `theta` and `lambda` are acceptably recovered
summary(fit)

jsocolar · May 10, 2024, 6:53pm

I think this one is super quick. It seems that you have written beta_binomial when you mean to write beta.

dholstius · May 10, 2024, 7:01pm

Aha! Thanks for the quick reply @jsocolar. If I replace it with prior(beta(1, 1), nlpar = "theta") I get multiple warnings like this:

Chain 1 Rejecting initial value:
Chain 1   Error evaluating the log probability at the initial value.
Chain 1 Exception: beta_lpdf: Random variable[1] is -1.65226, but must be in the interval [0, 1] (in '/var/folders/0p/hym513y91n15vvmxczp0flz00000gn/T/RtmpedT8BS/model-16fb636709bda.stan', line 23, column 2 to column 38)

… and:

Warning messages:
1: It appears as if you have specified a lower bounded prior on a parameter that has no natural lower bound.
If this is really what you want, please specify argument 'lb' of 'set_prior' appropriately.
Warning occurred for prior 
b_theta ~ beta(1, 1)
 
2: It appears as if you have specified an upper bounded prior on a parameter that has no natural upper bound.
If this is really what you want, please specify argument 'ub' of 'set_prior' appropriately.
Warning occurred for prior 
b_theta ~ beta(1, 1)

I can replace it with prior(beta(1, 1), nlpar = "theta", lb = 0, ub = 1) (i.e., explicitly specifying those bounds) and the warnings go away.

I’m a little puzzled that I need to tell it something I think it should already know … doesn’t feel quite right. But I’ll go with that if it’s the right thing to do. Maybe it just doesn’t know this in the non-linear context?

jsocolar · May 10, 2024, 7:18pm

There is a very deep and fundamental distinction between choosing the model’s parameterization (which we do when we declare the parameters and their bounds), and incrementing the target density with some probability density function (like the beta prior).

This distinction isn’t the sort of thing that is easy to make quick sense of when working with brms. Maybe the simplest pointer that I can give in the right direction is that a sampling statement in Stan like theta ~ beta(1,1) doesn’t get translated to instructions like “draw a pseudo-random variate from a beta distribution and name it theta”. Instead, it says "take the parameter that you’ve previously declared as theta, and increment the target density by beta_lupdf(theta | 1,1). If theta happens to have a value outside the support, this doesn’t work and you get the warnings shown. When you add the upper and lower bounds, under the hood brms is telling Stan to parameterize the model internally in a fundamentally different way to ensure that the object called theta (which will in fact be a transformed representation of an unconstrained parameter constructed under the hood) respects the constraints that you specify.

It turns out that much of the power and flexibility of Stan as a probabilistic programming language hinges on doing things this way, i.e. there are very very solid reasons underpinning these design choices.

Topic		Replies	Views
Custom prior nonlinear model brms	3	1189	March 3, 2020
Prior specification for beta distribution in brms brms	20	5717	February 13, 2020
Scale of prior for non-linear beta regression brms fitting-issues	12	1623	November 3, 2020
Brms iformative prior for bionomial regression brms	21	1313	October 8, 2024
Non-linear models using family = Beta() Modeling rstan , brms	22	2068	June 23, 2021

Beta prior in non-linear `brms`?

Goal and Question

Simulation

Code

Related topics