Take the following simple hurdle lognormal model with splines:
library(brms) set.seed(1) N <- 1000 x <- runif(N) pi <- 0.18 + 0.1*x mu_log <- 1.8 + 0.3 * x + 12.1 * x^2 sigma_log <- 0.37 y <- (1 - rbinom(N, 1, prob = pi)) * rlnorm(N, mu_log, sigma_log) d <- data.frame(y = y, x = x) fit <- brm(bf(y ~ s(x), hu ~ s(x)), data = d, family = hurdle_lognormal())
How can I estimate the probability that the response is below a given value, conditional on x taking a given value? (E.g., what is the probability that y<100 given x=0.5.)
I.e., I would answer it using simulation with
mean(posterior_predict(fit, newdata = data.frame(x = 0.5), nsamples = nsamples(fit))<100)
But the question is: how to calculate it analytically? (Optimally with confidence interval.) If I am not mistaken, it shouldn’t be difficult, as it is “just” an integration of the posterior (conditional on x), complicated by the fact the we have a spline, and a hurdle model.