Typical applications I’ve seen for multilevel regression and post-stratification (MRP) are cases where we want to predict an outcome and there are known differences between the sample and the population on some demographic variables (e.g., age and sex):
brm(y ~ (1 | age) + (1 | sex))
Has anyone seen an application where MRP is applied to a within-subjects experiment? Conceptually, it makes sense that we might want to use MRP in experimental settings as well. What would differ from the syntax above is that we’d need to estimate the effect of each condition and include a random effect for participant (because each participant contributes multiple observations):
brm(y ~ condition + (1 | age) + (1 | sex) + (1 | participant_id))
I guess we might further want to let the effect of the manipulation vary across participants:
brm(y ~ condition + (1 | age) + (1 | sex) + (1 + condition | participant_id))
My questions: Does this seem like a reasonable application of MRP? If so, what would the post-stratification frame look like? Would I need a column for
My post hasn’t gotten any traction. Does anyone have advice on (a) where I could go for help with this question or (b) how I could edit the question to make it easier for someone to answer?
This is a nice clear question, but it’s normal on this forum that good questions might not get answered within 36 hours, especially over the weekend.
To post-stratify over participant ID, you effectively want to make predictions for all of the individuals in the population, incorporating the uncertainty due to the random effects distribution. Of course it’s unwieldy to predict to every individual, but you still have three good options:
- In some cases, for example if you have Gaussian error and identity link, then for some forms of prediction you don’t need to bother to integrate over the random effect explicitly because the expectations come out in the wash.
- If you have another link function, it is especially crucial to integrate properly over the random effect distribution. Option 1 is to numerically integrate over the random effects distribution.
- Option 2 is just to pass a whole lot of participant IDs within each post-stratification group and approximate the integrals by averaging over the resulting predictions (check out the
sample_new_levels arguments of
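To make the difference between the link functions concrete, here is a minimal base R sketch of the "average over many simulated participants" idea for a single post-stratification cell. The values `eta` and `sigma_u` are hypothetical, and in a real analysis this would be repeated for every posterior draw. As far as I know, brms can do the equivalent for you via `allow_new_levels = TRUE` together with `sample_new_levels = "gaussian"`, but please check the documentation for your version.

```r
set.seed(1)

eta     <- 1.0   # hypothetical linear predictor for one post-stratification cell
sigma_u <- 1.5   # hypothetical SD of the participant random intercept
n_new   <- 1e5   # number of simulated "new" participants

# Simulate participant-level intercepts from the random-effects distribution
u <- rnorm(n_new, mean = 0, sd = sigma_u)

# Identity link: averaging over participants recovers eta (up to Monte Carlo error)
identity_avg <- mean(eta + u)

# Logit link: averaging on the probability scale is not the same as plugging in u = 0
logit_avg  <- mean(plogis(eta + u))
logit_plug <- plogis(eta)

c(identity_avg = identity_avg, logit_avg = logit_avg, logit_plug = logit_plug)
```

With the identity link the two approaches agree, which is why the explicit integration can be skipped there. With the logit link the plug-in value is noticeably off, so the averaging step matters.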
Thanks for the response, much appreciated.
In my case, I do have an identity link and Gaussian errors. So I guess I won’t worry about integrating over the participant random effect.
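For what it’s worth, the reasoning can be sketched as follows. The notation is my own and not from the thread: $c_{ij}$ is a 0/1 condition indicator, $a$ and $s$ are the age and sex effects, and $(u_{0i}, u_{1i})$ are the participant intercept and slope deviations, assumed to have mean zero.

```latex
\begin{align}
y_{ij} &= \beta_0 + \beta_1 c_{ij} + a_{\text{age}[i]} + s_{\text{sex}[i]}
          + u_{0i} + u_{1i} c_{ij} + \varepsilon_{ij},
\qquad (u_{0i}, u_{1i}) \sim \mathcal{N}(0, \Sigma),
\quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2) \\
\mathbb{E}\left[ y_{ij} \mid c, \text{age}, \text{sex} \right]
       &= \beta_0 + \beta_1 c + a_{\text{age}} + s_{\text{sex}}
          + \mathbb{E}[u_{0i}] + c \, \mathbb{E}[u_{1i}] \\
       &= \beta_0 + \beta_1 c + a_{\text{age}} + s_{\text{sex}}
\end{align}
```

Because the expectation is linear, the participant terms drop out of the cell mean. With a non-identity link the inverse link sits inside the expectation, so this cancellation no longer holds.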
Just to check whether there are any comments on the exact implementation, suppose I fit this model:
mod <- brm(y ~ condition + (1 | age) + (1 | sex) + (1 + condition | participant_id))
Now, the post-stratification frame (post_strat), if I have only two conditions (Control), might look like this:
Note that the proportions within each condition sum to 1 (because every participant takes part in each condition; it’s within-subjects).
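As a rough illustration of how such a frame could be built, here is a base R sketch. The age bands, the condition label `Treatment`, and all the population shares are made up for the example; only the structure (every demographic cell repeated once per condition, with the same share) is the point.

```r
# Hypothetical population shares for each age-by-sex cell (made-up numbers)
cells <- expand.grid(age = c("18-34", "35-54", "55+"),
                     sex = c("Female", "Male"),
                     stringsAsFactors = FALSE)
cells$Proportion <- c(0.15, 0.18, 0.17, 0.16, 0.18, 0.16)

# Repeat every cell once per condition, keeping the same population share.
# merge() with no shared column names produces the full cross join.
conditions <- c("Treatment", "Control")   # "Treatment" is a placeholder label
post_strat <- merge(data.frame(condition = conditions), cells)

# Check: shares sum to 1 within each condition
prop_sums <- tapply(post_strat$Proportion, post_strat$condition, sum)
prop_sums
```

Note there is no participant column here, which matches predicting with the participant effect left out.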
And to get estimates for each experimental condition, I could use the tidybayes package and explicitly omit the participant random effect from
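# Expected values for every row of the frame, leaving out the participant effect
mod %>%
add_epred_draws(newdata = post_strat, re_formula = ~ (1 | age) + (1 | sex)) %>%
rename(y = .epred) %>%
# Weight each cell by its population share (the Proportion column of post_strat)
mutate(y_prop = y * Proportion) %>%
# Sum the weighted cells within each condition, separately for every posterior draw
group_by(condition, .draw) %>%
summarise(y_predict = sum(y_prop)) %>%
# Summarise across draws: posterior mean and 95% interval per condition
summarise(mean = mean(y_predict),
lower = quantile(y_predict, 0.025),
upper = quantile(y_predict, 0.975))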
Do these seem like the right steps? Do they seem logical? I haven’t seen this exact scenario before, so I want to check that my approach is reasonable.
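One way to sanity-check the pipeline is to redo the arithmetic in base R on a draws-by-rows matrix. The sketch below uses simulated numbers as a stand-in for the posterior expected values, so the frame, the shares, and `cell_means` are all hypothetical. With a fitted model, I believe a matrix of this shape could come from `posterior_epred(mod, newdata = post_strat, re_formula = ...)`.

```r
set.seed(42)

n_draws <- 1000

# Hypothetical frame: 3 cells per condition, shares sum to 1 within condition
frame <- data.frame(
  condition  = rep(c("Control", "Treatment"), each = 3),
  Proportion = rep(c(0.5, 0.3, 0.2), times = 2)
)

# Stand-in for posterior expected values: one row per draw, one column per frame row
cell_means <- c(1.0, 1.2, 0.8, 1.5, 1.9, 1.1)
epred <- matrix(
  rnorm(n_draws * nrow(frame), mean = rep(cell_means, each = n_draws), sd = 0.1),
  nrow = n_draws
)

# Weight each cell by its share and sum within condition, separately for every draw
poststratify <- function(cond) {
  idx <- frame$condition == cond
  as.vector(epred[, idx, drop = FALSE] %*% frame$Proportion[idx])
}
ps_draws <- sapply(unique(frame$condition), poststratify)

# Posterior mean and 95% interval of the post-stratified estimate per condition
ps_summary <- t(apply(ps_draws, 2, function(x) {
  c(mean  = mean(x),
    lower = unname(quantile(x, 0.025)),
    upper = unname(quantile(x, 0.975)))
}))
ps_summary
```

If the tidybayes pipeline and this matrix version give the same numbers on the real model, that is decent evidence the weighting and grouping are doing what is intended.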