# MRP with ordinal outcome

I’m attempting to follow this tutorial from Monica Alexander on using brms for multilevel regression and post-stratification. The twist is that in my use case the outcome is an ordered factor with 4 levels. The tutorial deals with a proportion.

Therefore, rather than run:

``````mod <- brm(kept_name ~ (1|age_group) + (1|educ_group) +
data = d_mncs,
family=bernoulli())
``````

I followed @paul.buerkner’s lead in his ordinal regression tutorial and fit the model with `family = cumulative("probit")`.

The next step in Monica’s tutorial is to use the results of the regression to predict the outcome and multiply the `.prediction` by the proportion (`prop`). This is the post-stratification step.

``````res_age <- mod %>%
filter(age_group>20,
age_group<80,
allow_new_levels=TRUE) %>%
rename(kept_name_predict = .prediction) %>%
mutate(kept_name_predict_prop = kept_name_predict*prop) %>%
group_by(age_group, .draw) %>%
summarise(kept_name_predict = sum(kept_name_predict_prop)) %>%
group_by(age_group) %>%
summarise(mean = mean(kept_name_predict),
lower = quantile(kept_name_predict, 0.025),
upper = quantile(kept_name_predict, 0.975))
``````

This works in her example with a binary outcome (1/0) because the `.prediction` is numeric and can be multiplied by the population proportion. In my use case with an ordinal outcome, `.prediction` is a factor.

Any thoughts about how to apply this MRP example when your outcome is ordinal, not binary?

4 Likes

This is a good question. I don’t have the dataset from Monica’s example so I’ll show an example using a different model I happen to have available, the `m_cyl` model from the tidybayes+brms vignette.

First we’ll transform the results of `add_predicted_draws()` into a data frame where `.prediction` is always `1` (more on why in the comment below) and `.category` is the predicted category:

``````pred_df = mtcars %>%
select(mpg) %>%
# for real post-stratification these proportions would come
# from some other dataset, as in Monica Alexander's example
mutate(prop = 1/n()) %>%
# add_predicted_draws will give us one row per {input row} x {draw from the model}
# containing a .prediction column that is a factor indicating the predicted category.
# We might imagine we instead want k rows per {input row} x {draw} where k is the
# number of factors, .category is the factor level, and .prediction is 1 if that
# category was predicted and 0 otherwise. We *could* do this by creating the 1-rows
# then "filling in" the missing 0-rows with tidyr::complete(), but we don't have to
# do that second step: because the 0 rows are going to contribute nothing to the final
# sum, we can just create the 1-rows:
mutate(
.category = .prediction,
.prediction = 1
)
# if you don't believe me, add the following line to the pipe to create the
# 0-rows that we don't need:
# %>% complete(.row, .draw, .category, fill = list(.prediction = 0))

pred_df
``````
``````# A tibble: 128,000 x 8
# Groups:   mpg, prop, .row 
mpg   prop  .row .chain .iteration .draw .prediction .category
<dbl>  <dbl> <int>  <int>      <int> <int>       <dbl> <fct>
1    21 0.0312     1     NA         NA     1           1 6
2    21 0.0312     1     NA         NA     2           1 6
3    21 0.0312     1     NA         NA     3           1 6
4    21 0.0312     1     NA         NA     4           1 6
5    21 0.0312     1     NA         NA     5           1 4
6    21 0.0312     1     NA         NA     6           1 6
7    21 0.0312     1     NA         NA     7           1 6
8    21 0.0312     1     NA         NA     8           1 4
9    21 0.0312     1     NA         NA     9           1 6
10    21 0.0312     1     NA         NA    10           1 4
# ... with 127,990 more rows
``````

Then we can use the same summing approach from Monica’s example, except we have to
add the `.category` column to the grouping factors when summing:

``````pred_df %>%
mutate(.prediction_prop = .prediction*prop) %>%
# add whatever other predictors you don't want to marginalize over to this group_by
# (age_group in Monica Alexander's example)
group_by(.draw, .category) %>%
summarise(.prediction = sum(.prediction_prop)) %>%
group_by(.category) %>%
mean_qi(.prediction)
``````
``````# A tibble: 3 x 7
.category .prediction .lower .upper .width .point .interval
<fct>           <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>
1 4               0.354 0.25    0.438   0.95 mean   qi
2 6               0.199 0.0625  0.344   0.95 mean   qi
3 8               0.447 0.344   0.531   0.95 mean   qi
``````

To double-check, since we used the proportions from the original data (instead of from another dataset, as in proper post-stratification) these should be basically the same as the proportions in the input data, modulo sampling error:

``````prop.table(xtabs(~ cyl, data = mtcars))
``````
``````cyl
4       6       8
0.34375 0.21875 0.43750
``````

Seems fine to me.

4 Likes

Thanks for taking the extra time to explain the approach, @mjskay. Great stuff.

3 Likes

Of course!

2 Likes

For an example where we combined post-stratification with an ordinal model, you can also check out https://osf.io/preprints/socarxiv/3v5g7/

5 Likes

I’m late to the thread (buried under marking), but just wanted to add a plug for our lovely MRP reading group. Contact me or Rohan Alexander if interested!

4 Likes

Thanks again for all of the help. Here’s what I came up with. I’m a novice when it comes to MRP and brms, so if anyone has tips for improving it please shoot me a note. Hopefully there are no fatal flaws!!🤞

1 Like

Very good example!

You could add a footnote about that it might be useful to model the interaction of age group and gender as well.

1 Like

Thanks for taking the time to review and share this suggestion. I updated the post.