# Product of two response vars: avg_predictions(), generated quantities, or post-processing of draws()?

I have a multivariate model with two outcomes, Y_1 and Y_2:

• E[Y_1] is an average rate (proportion)
• E[Y_2] is an average amount

The goal requires E[Y_1] and E[Y_2] conditional on certain combinations of covariates, and averaged across others, but also E[Y_1 Y_2] conditional on those same combinations.

My understanding is that `marginaleffects::avg_predictions()` can compute E[Y_1|\dots] and E[Y_2|\dots], but I don’t see how it can deliver E[Y_1Y_2|\dots]. (If it can, great!) Y_1 and Y_2 are both positive, and positively correlated, so the product E[Y_1|\dots] E[Y_2|\dots] will be biased low compared to E[Y_1 Y_2|\dots].

Wondering which of these approaches might seem best, or if there is a fourth option:

1. Get `avg_predictions()` to yield the desired product
2. Modify the `generated quantities` block using `brm(..., stanvars = ...)` (see below)
3. Sample from the paired `draws()` and calculate, then summarize, the products from the paired draws

Here is an example using a toy dataset and a bit of pseudo-ish code inside `stanvars()` that doesn’t work, but hopefully gives the idea for approach #2:

``````require(brms)
require(cmdstanr)

bf_mpg <- bf(mpg ~ cyl)
bf_am <- bf(am ~ disp) + bernoulli()

mod <- brm(
bf_mpg + bf_am,
stanvars = stanvar(
scode = "real foo = mpg .* am;", # What should this be?
block = "genquant")
backend = "cmdstanr",
data = mtcars)
``````

Solutions or ideas? Thank you in advance for your time and attention!

Another option (a variation on #3) is to use `brms::posterior_epred` or `tidybayes::add_epred_draws`, with a manual step to compute the product Y_1Y_2 of the two outcomes Y_1 and Y_2.

First create a data frame with the covariate combinations of interest.

``````newdata <- data.frame(
cyl = c(4, 4, 6, 6),
disp = c(200, 300, 200, 300)
)
``````

3a. Using `brms::posterior_epred`

``````epreds <- posterior_epred(mod, newdata = newdata)
dim(epreds)
# (num draws) x (num newdata rows) x (num responses)
#> [1] 4000    4    2

Y1 <- epreds[, , 1] # corresponds to the `mpg` outcome
Y2 <- epreds[, , 2] # corresponds to the `am` outcome

# Compute the product outcome.
Y1xY2 <- Y1 * Y2
``````

3b. Using `tidybayes::add_epred_draws`

``````epreds <- add_epred_draws(newdata, mod)
epreds
# The `.category` column indicates the response.
#> # Groups:   cyl, disp, .row, .category [8]
#>      cyl  disp  .row .chain .iteration .draw .category .epred
#>    <dbl> <dbl> <int>  <int>      <int> <int> <fct>      <dbl>
#>  1     4   200     1     NA         NA     1 mpg         27.1
#>  2     4   200     1     NA         NA     2 mpg         26.9
#>  3     4   200     1     NA         NA     3 mpg         26.1
#>  4     4   200     1     NA         NA     4 mpg         25.9
#>  5     4   200     1     NA         NA     5 mpg         26.6
#>  6     4   200     1     NA         NA     6 mpg         26.9
#>  7     4   200     1     NA         NA     7 mpg         26.7
#>  8     4   200     1     NA         NA     8 mpg         26.3
#>  9     4   200     1     NA         NA     9 mpg         26.9
#> 10     4   200     1     NA         NA    10 mpg         27.1
#> # ℹ 31,990 more rows

# Pivot the `.category` column into separate `mpg` and `am` columns
# and calculate their product.
epreds <- epreds |>
pivot_wider(
names_from = .category,
values_from = .epred
) |>
mutate(
mpg_am = mpg * am
)
``````

Now we’ve got two data structures (an array `Y1xY2` and a tibble `epreds`) with the posterior draws of the expected product Y_1Y_2. We can use any of the functions that facilitate summarizing or plotting the posterior predictive distribution.

``````# The summary has one row for each covariate combination (row) in newdata.
# The `variable` column though is not particularly informative.
summarise_draws(Y1xY2)
#> # A tibble: 4 × 10
#>   variable  mean median    sd   mad    q5   q95  rhat ess_bulk ess_tail
#>   <chr>    <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>    <dbl>    <dbl>
#> 1 ...1     11.2   11.2   3.05  3.16 6.35  16.4   1.00    3957.    2869.
#> 2 ...2      3.92   3.52  2.37  2.25 0.806  8.40  1.00    3321.    1799.
#> 3 ...3      8.79   8.79  2.38  2.50 4.97  12.8   1.00    3959.    2935.
#> 4 ...4      3.06   2.74  1.85  1.77 0.627  6.55  1.00    3296.    1860.

# It may more convenient to have a column for each of the covariates as well.
epreds |> median_qi(mpg_am, .width = 0.9)
#> # A tibble: 4 × 9
#>     cyl  disp  .row mpg_am .lower .upper .width .point .interval
#>   <dbl> <dbl> <int>  <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>
#> 1     4   200     1  11.2   6.35   16.4     0.9 median qi
#> 2     4   300     2   3.52  0.806   8.40    0.9 median qi
#> 3     6   200     3   8.79  4.97   12.8     0.9 median qi
#> 4     6   300     4   2.74  0.627   6.55    0.9 median qi
``````
1 Like