- Operating System: Win7 * brms Version: 2.9.6 from GitHub
Following on @paul.buerkner advice Convergence fails for (every) truncated gaussian model I triggered a bunch of bugs. I still hope to add a “solution” to my original topic, so I started this separate topic.
First, lets see how pp_check() breaks down:
require("modelr")
require("tidyverse")
require("tidybayes")
require("brms")
require("emmeans")
set.seed(1)
nRats <- 20
rats <- data.frame(ratNo=1:nRats, mu=rnorm(nRats)/3, sigma=rlnorm(nRats),
TrtEffect=rnorm(nRats,mean=1)) # each rat is special
trueParams <- expand.grid(replicateNo=1:6, ratNo=1:nRats, Time=c(0,30,60)) %>%
mutate(Trt = ifelse(ratNo <= nRats/2, "Sham", "TBI"),
mu=rats$mu[ratNo]+ifelse(Trt=="Sham", 0, rats$TrtEffect[ratNo]*Time/10),
sigma = rats$sigma[ratNo] + 1 + mu/3)
data.measured <- trueParams %>%
mutate( pa = rnorm(n(), mean=mu, sd=sigma), # the "real" effect
mpa = abs(round(pa*2)/2) ) # and its rounded measurement, producing many zeroes
data.recorded <- data.measured %>% ungroup %>%
mutate(Trt = as.factor(Trt), Time = as.factor(Time), ratNo = as.factor(ratNo) ) %>%
group_by(Trt, Time, ratNo) %>% summarize(mpa = mean(mpa))
# Paul's advise that mpa==0 break truncated normal model
data.recorded2 <- data.recorded %>% mutate(mpa = replace(mpa, mpa < 0.25/6, 0.25/6))
tn.prior <- brm(
data = data.recorded2, family = gaussian,
mpa | trunc(lb=0) ~ Trt*Time,
prior = c( prior(exponential(10), class = sigma), # comment sigma prior the get model with best loo
prior(normal(0, 3), class = "Intercept"),
prior(normal(0, 3), class = "b") ),
seed = 1, chains=3, cores=3, iter=1e4, sample_prior = "only" )
pp_check(tn.prior)
# Using 10 posterior samples for ppc type 'dens_overlay' by default.
# Error: NAs not allowed in 'yrep'.
# Call `rlang::last_error()` to see a backtrace
I tried to understand @paul.buerkner advice from Help understanding and setting informative priors in brms that priors should be checked with pp_check(), but it broke down (all Rhat==1.0 and each ESS >= 8512 for tn.prior model).
I usually fiddle better with priors, when I progressively tighten them (in Statistical Rethinking 2 style https://github.com/rmcelreath/statrethinking_winter2019) to the point when 95% CI from marginal_effects() contain all data points,while not being 10^BigNumber - times wider than the actual data. The code below shows either a bug or a strange feature: textual summary me[[1]] reports upper__ bound as Inf, but graphical summary, if you plot it, will draw all upper margins to be slightly above 20…
me <- marginal_effects(tn.prior, "Trt:Time", method="fitted")
me[[1]]
# Trt Time mpa cond__ effect1__ effect2__ estimate__ se__ lower__ upper__
# 1 Sham 0 3.294444 1 Sham 0 3.048649 3.818217 0.01706015 Inf
# 2 Sham 30 3.294444 1 Sham 30 3.327312 4.188858 0.01816852 Inf
# ....
plot(me, points = TRUE, point_args = list(width = 0.1), plot=FALSE)[[1]] +
geom_hline(yintercept=0, linetype="dashed", color="black")
Next are the two inconsistent features of update(): (a) the progress report not redirected to RStudio’s Viewer and (b) some arguments of the model, like “chains” or “iter”, will be inherited by updater, while cores parameter will be quietly set to 1.
tn.fit <- update(tn.prior, sample_prior="yes", seed = 1)
# The desired updates require recompiling the model
# Compiling the C++ model
# Start sampling
#
# SAMPLING FOR MODEL '42ba15d28a59f71a914bbab12cbba47d' NOW (CHAIN 1).
# Chain 1: Rejecting initial value:
# Chain 1: Log probability evaluates to log(0), i.e. negative infinity.
# Chain 1: Stan can't start sampling from this initial value.
# Chain 1:
# Chain 1: Gradient evaluation took 0 seconds
# Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
# Chain 1: Adjust your expectations accordingly!
# Chain 1:
# Chain 1:
# Chain 1: Iteration: 1 / 10000 [ 0%] (Warmup)
# Chain 1: Iteration: 1000 / 10000 [ 10%] (Warmup)
summary(tn.fit) # model converged ...
pp_check(tn.fit) # ... and now pp_check works without error
And now I report the most annoying bug: for some distributions brms is not passing the transformation to “response scale” correctly into emmeans:
(emm <- emmeans(tn.fit, ~ Trt | Time, transform = "response"))
# ....
# Time = 60:
# Trt emmean lower.HPD upper.HPD
# Sham -0.902 -4.03 1.856
# TBI 7.163 5.28 8.991
#
# HPD interval probability: 0.95
data.recorded2 %>%
group_by(Trt, Time) %>%
data_grid(ratNo=NA) %>%
# add_fitted_draws(tn.fit, scale="response") %>% # same as marginal_effects(tn.fit, "Trt:Time", method="fitted")
add_fitted_draws(tn.fit, scale="linear") %>% # emmeans can ONLY do this
median_hdi()
# Trt Time ratNo .row .value .lower .upper .width .point .interval
# ...
# 3 Sham 60 NA 3 -0.902 -4.03 1.86 0.95 median hdi
# ...
# 6 TBI 60 NA 6 7.16 5.28 8.99 0.95 median hdi
This bug also affects lognormal and hurdle_lognormal, and maybe some other distributions which I did not check. This is definitely not a feature, since some other distributions work well, for example bernoulli:
some.fake.bernoulli.data <- data.recorded %>% mutate(YesNo = factor(mpa>5,c(FALSE,TRUE)))
b.fit <- brm( data = some.fake.bernoulli.data, family = bernoulli,
YesNo ~ Trt*Time,
prior = c(prior(normal(0, 1), class = "Intercept"),
prior(normal(0, 1), class = "b")),
seed = 1, chains=3, cores=3)
(emm <- emmeans(b.fit, ~ Trt | Time))
# Time = 60:
# Trt emmean lower.HPD upper.HPD
# Sham -2.417 -3.958 -0.906
# TBI 0.353 -0.824 1.510
#
# Results are given on the logit (not the response) scale.
# HPD interval probability: 0.95
# Warning message:
# In model.matrix.default(trms, m, contrasts.arg = contr) :
# variable 'YesNo' is absent, its contrast will be ignored
… emmeans by default report “linear scale” and could be commanded to use “response scale”:
(emm <- emmeans(b.fit, ~ Trt | Time, transform = "response"))
# Time = 60:
# Trt response lower.HPD upper.HPD
# Sham 0.0818 0.00889 0.226
# TBI 0.5874 0.33068 0.841
#
# HPD interval probability: 0.95
# Warning message:
# In model.matrix.default(trms, m, contrasts.arg = contr) :
# variable 'YesNo' is absent, its contrast will be ignored
Sure, the warning message flashes unnecessary red text into user’s eyes, but at least it is not affecting the results as far as I can tell.
Sorry for spilling many bugs in one topic. I really-really hope fixing 'em will make our beloved brms better.