Quantifying differences between datasets using marginal means

tester · February 26, 2024, 9:11am

I am seeking for some brief “peer-review” with the following method:

I would like to quantify differences in subjects’ behavior between two different datasets. I want to estimate what is the overall (intercept) difference (on the scale of the outcome, response) and the difference of the effect of a variable T1 and T2 (both binary, [0; 1]) that changed within subject.

I therefore estimated a model with brms on all data and included dataset [“A”; “B”] as a variable, and subject [1, 2, 3, … N_dataset] as a random effect. (I include T2 only through an interaction with T1 since it physically only made sense when T1 = 1.)

model <- brm(
    formula = response ~ 0 + dataset + dataset:T1 + dataset:T1:T2 + (1|dataset:subject),
    family = lognormal(),
    data = data
)

As said, I want to estimate the difference on the response scale between the two datasets, for the intercept, T1 and T2. For this, I am using the emmeans functionality from the tidybayes package. I want to include the uncertainty arising from different subjects, therefore, I use the re_formula and allow_new_levels options.

For the (overall) difference in intercept, I use emmeans:

model %>%
  emmeans(specs = "dataset",
          epred = TRUE,
          re_formula = NULL, 
          allow_new_levels = TRUE) %>%
  contrast(method = "pairwise")

To capture the difference in the effect of T1 on response, I use emtrends:

model %>% 
  emtrends(specs = "dataset", 
           var = "T1",
           epred = TRUE,
           re_formula = NULL, 
           allow_new_levels = TRUE) %>% 
  contrast(method = "pairwise")

For T2, I use the option at = list(T1 = 1) to restrict the analyses to that specific case (explained above):

model %>% 
  emtrends(specs = "dataset", 
           var = "T2",
           at = list(T1 = 1),
           epred = TRUE,
           re_formula = NULL, 
           allow_new_levels = TRUE) %>% 
  contrast(method = "pairwise")

Does this sound like a sensible method? Any thoughts are highly appreciated.

Operating System: Windows 10
brms Version: 2.20.4

tester · May 7, 2024, 8:54am

Anyone who might have thoughts? :-)

tester · May 14, 2024, 1:42pm

@rvlenth perhaps?

tarheel · May 14, 2024, 2:28pm

There are more sophisticated approaches for this. Look up the literature on the Bayesian hierarchical model / historical data priors ([normalized] power prior, commensurate prior, meta-analytic predictive priors, latent exchangeability priors). I have an R package called hdbayes that uses cmdstanr and implements all of these.

Russ_Lenth · May 14, 2024, 8:40pm

I can’t comment on how well this fits with Bayesian methodology, versus other possibly more appropriate alternatives. But I would say that the code examples shown would produce reasonable answers.

That said, I have become increasingly allergic to the idea of piping one result into another until we get a final answer. If we run any of these codes as written, we obtain comparisons of means or slopes, but we never see estimates of the means or the slopes themselves that are being compared. Wouldn’t we want to know those? So in all of these, I would keep the steps:

EMM <- emmeans(model, "dataset", etc)
confint(EMM)
contrast(EMM, "pairwise")

By the way, these estimates are made at the mean T1 and the mean T2, and it may be better to use at ro specify particular values of these variables.

tester · May 17, 2024, 3:04pm

Thanks very much for the thoughts.

Thanks for pointing that out. I guess that means that a 0/1-coded variable would be averaged, too, which may not make sense here since T1 and T2 are categorical by nature.

Topic		Replies	Views
Trouble with {brms}/{emmeans} integration brms	0	238	April 1, 2024
Computation of contrasts using emmeans based on brms input in R brms emmeans	1	313	February 3, 2024
Estimating contrasts (code peer-review) brms specification , brms , emmeans	3	796	May 20, 2021
How to compare the effect of different datasets in model Modeling brms	6	463	January 26, 2024
Using emmeans to get marginal means from a multinomial logistic regression in brms brms emmeans	2	2719	October 7, 2021

Quantifying differences between datasets using marginal means

Related topics