I use R 3.4.4 on Ubuntu 16.04.4 LTS with the ‘brms’ package, version 2.2.0.
brms is wonderful! It automatically took care of that difficult prior business and, after some friendly pushes, converged for my model that broke lme4::bootMer(). For about a month now I have been trying to convert from lme4 to the Bayesian world and to make sense of the brmsfit object I got. Oh boy, these contrasts were so easy to investigate with the emmeans package and, unfortunately, require so much HARD and MANUAL work even in brms.
After some exploration of “8 steps to become Bayesian” I bought Kruschke's 2015 book and got through its first 20 chapters. Yes, its Stan examples are quite obsolete now, but at least the language of the brms vignettes and of the Google group posts started to decode for me. I have a general question: does the brms documentation have a worked-out, non-toy example of hierarchical Bayesian ANOVA with a complicated interaction (say, between 2 categorical and 1 nominal predictor), heterogeneous variances, random effects, and some automation for constructing contrasts, sampling them to plot their median and HDI, and converting these contrasts to strings for hypothesis()?
Right now I am digging through http://www.flutterbys.com.au/stats/tut/tut9.2b.html, but the most interesting parts (i.e. the contrasts) do not quite match. Here is a simple reproducible example with my question:
require("brms")
require("tidyverse")
require("tidybayes")
set.seed(1)
fit <- brm(count ~ log_Age_c + log_Base4_c * Trt + (1 | patient),
           data = epilepsy, family = poisson(),
           cores = 4, chains = 4, sample_prior = TRUE)
me <- marginal_effects(fit, effects = "Trt:log_Base4_c", robust=TRUE, re_formula=NA,
conditions = make_conditions(fit, "log_Age_c"),
int_conditions = make_conditions(fit, "log_Base4_c"))
head(me$`Trt:log_Base4_c`, 2)
# Trt log_Base4_c log_Age_c cond__ count patient estimate__ ...
# 1 0 -0.75 -0.222457 log_Age_c = -0.22 8.254237 NA 2.784081 ...
# 2 0 0 -0.222457 log_Age_c = -0.22 8.254237 NA 5.371503 ...
newdata = me$`Trt:log_Base4_c` %>%
mutate( log_Base4_c = as.numeric(as.character(log_Base4_c)) )
me.fs <- fitted(fit, newdata=newdata, summary=TRUE, robust=TRUE, re_formula=NA )
head(me.fs, 2)
# Estimate ...
# [1,] 2.775040 ... # discrepancy in 3rd digit: 2.7???
# [2,] 5.371503 ... # exactly as in marginal_effects()
Just like Paul explained on the Google group, marginal_effects() exactly matches fitted() when the effects of log_Base4_c and log_Age_c do not come into play. But why do all the other groups show a discrepancy?
h.Trt01 <- hypothesis(fit, hypothesis=c('Trt1vs0'=c("Trt1 = 0")))
h.Trt01$hypothesis
# Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Star
# 1 Trt1vs0 -0.3364897 0.1647266 -0.6633257 -0.01424099 NA *
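As a sanity check on my own reading of this output (my check, not something from the docs), the same numbers can be pulled straight from the posterior draws, assuming brms named the dummy coefficient b_Trt1 as it does in my fit:

```r
# Cross-check of hypothesis(): its Estimate should match the
# posterior of the raw coefficient (b_Trt1 is the name brms
# gave the Trt dummy here)
ps <- posterior_samples(fit, pars = "^b_Trt1$")
mean(ps$b_Trt1)                      # compare with Estimate above
quantile(ps$b_Trt1, c(0.025, 0.975)) # compare with CI.Lower / CI.Upper
```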
Does this mean one could safely report that the Treatment on average reduced seizures between visits by 0.34? But why, when we look at this contrast directly, do we get
newdata[c(8,11),]
# Trt log_Base4_c log_Age_c ...
# 8 0 0 1.731103e-16 ...
# 11 1 0 1.731103e-16 ...
s.Trt01 <- fitted(fit, newdata=newdata[c(8,11),], summary=FALSE, robust=TRUE, re_formula=NA )
s.Trt01 <- s.Trt01 %>% as_tibble %>% rename(Trt0=V1, Trt1=V2) %>% mutate(Trt1vs0=Trt1-Trt0)
s.Trt01$Trt1vs0 %>% mean_qi
# y ymin ymax .prob
# 1 -1.71745 -3.439241 -0.07358314 0.95
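One thing I can check myself is the scale: hypothesis() works with the raw (log-link) coefficients, while fitted() defaults to the response scale. So the same contrast can be re-sampled on the linear predictor scale instead (assuming I understand the scale argument correctly):

```r
# Same two rows as above, but on the linear (log-link) scale,
# where the regression coefficients live
s.lin <- fitted(fit, newdata = newdata[c(8, 11), ], summary = FALSE,
                robust = TRUE, re_formula = NA, scale = "linear")
s.lin <- s.lin %>% as_tibble %>% rename(Trt0 = V1, Trt1 = V2) %>%
  mutate(Trt1vs0 = Trt1 - Trt0)
s.lin$Trt1vs0 %>% mean_qi
```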
This “direct look” gives an “average Treatment effect” which is much closer to the scale of the difference of medians between the groups being compared. Am I correct that, by sampling, we accounted for all correlations, so we cannot blame them for this discrepancy?
I thought I could sample my contrast in terms of group means using fitted() to validate the tedious manual “contrast forging from raw coefficients” currently required for hypothesis()… But if hypothesis() disagrees with the sampled contrast on purpose, could I request that some helper (say make_contrasts, in the spirit of make_conditions) be added to simplify/automate hypothesis testing for complicated models? It would be so helpful if one could also formulate hypotheses in terms of group means. Or, perhaps, would it be possible to implement sampling and hypotheses for conditional random variables like “Response given that Predictor == LevelA or Predictor == LevelB”?
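For concreteness, here is roughly what I imagine such a helper doing. Everything below (the name make_contrasts and its interface) is purely hypothetical; no such function exists in brms, I am only sketching the request:

```r
# Hypothetical sketch, NOT an existing brms function: turn the
# dummy coefficients of a factor into strings hypothesis() accepts
make_contrasts <- function(fit, factor_name) {
  coefs <- grep(paste0("^", factor_name), rownames(fixef(fit)),
                value = TRUE)
  # each dummy coefficient tested against the reference level
  hyps <- paste(coefs, "= 0")
  names(hyps) <- paste0(coefs, "_vs_ref")
  hyps
}
# e.g. make_contrasts(fit, "Trt") would give c(Trt1_vs_ref = "Trt1 = 0"),
# ready for hypothesis(fit, hypothesis = make_contrasts(fit, "Trt"))
```

A fuller version could also build all pairwise contrasts, or accept group means instead of coefficient names, which is the part I find hardest to do by hand.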