I have a few probably naive questions about Bayesian replication test over a GLMM (in my case, with Bernoulli response distribution). Verhagen and Wagenmarkers (2014), propose a test to ask whether the responses in replication study are more likely to have been drawn from the same distribution as the responses in the original study, or from the null. In their paper, they describe this for a one-sample t-test (p. 1461):
where t_df is the ordinate of the central t-distribution with df degrees of freedom, and t_df,deltaN^.5 is the ordinate of the noncentral t-distribution with df degrees of freedom and noncentrality parameter Delta = deltaN^.5. The index i indicates the i-th sample out of M samples drawn from the posterior distribution of effect sizes delta after the original experiment has been observed. t_rep is the observed t-value in the replication attempt.
What I’m trying to do
My most general question is how to best generalize this test to regression models, for which I want to know whether a specific effect replicates. For linear mixed models fit with e.g., lme4, it seems like one could just take the t-value for the coefficient test with N-2 dfs (or more conservatively, with the dfs estimated by e.g., Kenward-Roger’s approximation). But I’m curious how to more directly use the samples from a brmsfit (also because that would free me from assumptions about the relevant statistics and DFs for the coefficient tests). So I have questions about how best estimate the numerator and the denominator of the first line in equation (6) for a brms model.
A) To get the numerator: My first thought was to get the value for p(Y_rep | H_r) by taking the model fit to the original data and then calculate the likelihood of the replication data under all samples and average over that. But that would evaluate whether everything replicates. However, I’m interested in whether a specific effect (i.e., one specific parameter value) replicates. So, I intend to take the brmsfit fit to the replication data, but replace the estimates for the critical effect by samples from the brmsfit fit to the original data. E.g,. I could sample rows of parameter values from the posterior samples of the brmsfit fit to the replication data, but replace the value for the parameter of interest with random draws from posterior samples for that parameter from the brmsfit fit to the original data.
(if I have random slopes by any grouping factor for the parameter of interest, I would presumably also have to make a decision about how I would handle that; the problem I am currently dealing with does not require such random slopes).
B) To get the denominator: I’m thinking of using the same procedure as described in A) but then to subtract its mean estimate from each draw for the parameter of interest from the brmsfit fit to the original data, in order to get an idea of how likely the data is to come from a null effect under the assumption that the variability around that null would be identical to the variability observed around the original effect in the original experiment.
Now, finally, my questions
- Does that sound adequate? Or am I completely off?
- If yes, should I sample any row from the replication fit and any parameter value of the original fit proportional to their respective data likelihoods (or is that likelihood already captured by the distribution of posterior samples itself)?
- Is there a convenient way to replace a matrix of posterior samples from a brmsfit with such a new manipulated matrix, so that I could—for convenience—keep using the predict or some other function to estimate the posterior probability of the data under the revised posterior samples?
Any pointers (incl. “argh, this is so stupid that I don’t even know where to start” ;)) would be welcome. Thanks for reading this long post.
Verhagen and Wagenmarkers (2014). Bayesian Tests to Quantify the Result of a Replication Attempt. JEP:G 143(4), 1457-75