MRP with ordinal outcome

This is a good question. I don’t have the dataset from Monica’s example, so I’ll demonstrate with a different model I happen to have available: the m_cyl model from the tidybayes+brms vignette.

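In case it helps with reproducing this, here’s a minimal sketch of the setup, assuming m_cyl is the ordinal (cumulative) regression of cyl on mpg from that vignette; the vignette’s exact brm() call (priors, seed, iterations) may differ:

library(dplyr)
library(tidybayes)
library(brms)
# library(tidyr)  # only needed if you run the optional complete() check below

# ordinal regression of cylinder count on mpg, roughly as in the tidybayes+brms vignette
# (default priors here; treat this as a stand-in for the vignette's actual call)
mtcars_ord = mtcars %>% mutate(cyl = ordered(cyl))
m_cyl = brm(cyl ~ mpg, data = mtcars_ord, family = cumulative)
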
First, we’ll transform the results of add_predicted_draws() into a data frame where .prediction is always 1 (more on why in the comments below) and .category is the predicted category:

pred_df = mtcars %>% 
  select(mpg) %>% 
  # for real post-stratification these proportions would come 
  # from some other dataset, as in Monica Alexander's example
  mutate(prop = 1/n()) %>% 
  add_predicted_draws(m_cyl) %>% 
  # add_predicted_draws will give us one row per {input row} x {draw from the model}
  # containing a .prediction column that is a factor indicating the predicted category.
  # We might imagine we instead want k rows per {input row} x {draw}, where k is the 
  # number of categories (factor levels), .category is the factor level, and .prediction is 1 if that 
  # category was predicted and 0 otherwise. We *could* do this by creating the 1-rows
  # then "filling in" the missing 0-rows with tidyr::complete(), but we don't have to
  # do that second step: because the 0 rows are going to contribute nothing to the final 
  # sum, we can just create the 1-rows:
  mutate(
    .category = .prediction,
    .prediction = 1
  ) 
  # if you don't believe me, add the following line to the pipe to create the 
  # 0-rows that we don't need:
  # %>% complete(.row, .draw, .category, fill = list(.prediction = 0))

pred_df
# A tibble: 128,000 x 8
# Groups:   mpg, prop, .row [32]
     mpg   prop  .row .chain .iteration .draw .prediction .category
   <dbl>  <dbl> <int>  <int>      <int> <int>       <dbl> <fct>    
 1    21 0.0312     1     NA         NA     1           1 6        
 2    21 0.0312     1     NA         NA     2           1 6        
 3    21 0.0312     1     NA         NA     3           1 6        
 4    21 0.0312     1     NA         NA     4           1 6        
 5    21 0.0312     1     NA         NA     5           1 4        
 6    21 0.0312     1     NA         NA     6           1 6        
 7    21 0.0312     1     NA         NA     7           1 6        
 8    21 0.0312     1     NA         NA     8           1 4        
 9    21 0.0312     1     NA         NA     9           1 6        
10    21 0.0312     1     NA         NA    10           1 4        
# ... with 127,990 more rows

Then we can use the same summing approach from Monica’s example, except that we have to
add the .category column to the grouping variables when summing:

pred_df %>%
  mutate(.prediction_prop = .prediction*prop) %>% 
  # add whatever other predictors you don't want to marginalize over to this group_by
  # (age_group in Monica Alexander's example)
  group_by(.draw, .category) %>% 
  summarise(.prediction = sum(.prediction_prop)) %>%
  group_by(.category) %>%
  mean_qi(.prediction)
# A tibble: 3 x 7
  .category .prediction .lower .upper .width .point .interval
  <fct>           <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>    
1 4               0.354 0.25    0.438   0.95 mean   qi       
2 6               0.199 0.0625  0.344   0.95 mean   qi       
3 8               0.447 0.344   0.531   0.95 mean   qi  
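
One quick sanity check on the 1-rows trick (not strictly necessary, but cheap): within each draw the post-stratified proportions should sum to exactly 1, because every input row contributes a single row with .prediction = 1 and the prop weights sum to 1 across rows:

pred_df %>%
  mutate(.prediction_prop = .prediction * prop) %>%
  group_by(.draw) %>%
  summarise(total = sum(.prediction_prop)) %>%
  # every draw's total should be (numerically) 1
  summarise(all_one = all(near(total, 1)))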

To double-check: since we used the proportions from the original data (instead of proportions from another dataset, as in proper post-stratification), these estimates should be basically the same as the category proportions in the input data, modulo sampling error:

prop.table(xtabs(~ cyl, data = mtcars))
cyl
      4       6       8 
0.34375 0.21875 0.43750 

Seems fine to me.
