I am confused about what `emmeans`

is averaging over with `brms`

.

After fitting a generalized linear model with `brms`

, I can pass the model object to `emmeans()`

to obtain the posterior for marginal means the outcome on the probability or response scale. Since probabilities are non-linear in the parameters, the distribution of predictors affects these predictions. Accordingly, we need to specify the distribution/grid we want to average over for the marginal means.

To illustrate the source of confusion, letâ€™s simulate some data and fit a beta regression. Weâ€™re interested in the marginal means of `y`

for values of `x`

, averaging over `z`

.

```
library(brms)
library(emmeans)
set.seed(123)
# simulate data
df <- data.frame(y = runif(100),
x = sample(0:1, size = 100, replace = T),
z = as.character(sample(1:4, size = 100, replace = T)))
# fit beta model
mod <- brms::brm(formula = y ~ x + z + x:z,
data = df,
family = Beta(link = "logit", link_phi = "log"),
seed = 123)
```

# Doesnâ€™t include predictors in grid

If we donâ€™t specify anything about the predictor grid, the resulting object does not contain any information about the distribution of the predictor weâ€™re averaging over `z`

. What distribution is it implicitly assuming?

```
# Specify nothing about grid
nogrid <- emmeans(object = mod, ~x, epred = TRUE)
nogrid
#> x emmean lower.HPD upper.HPD
#> 0 0.545 0.466 0.622
#> 1 0.459 0.386 0.530
#>
#> Point estimate displayed: median
#> HPD interval probability: 0.95
nogrid@grid
#> x .wgt.
#> 1 0 46
#> 2 1 54
```

If we explicitly tell `emmeans`

to weight each cell of the grid by its frequency in the sample, we get identical implied weights and still no mention of `z`

in the grid/output.

```
# Specify that expected values should be based on probabilities for each cell
# of the predictor grid, where the cell's contribution is weighted by frequency
weighted <- emmeans(object = mod, ~x, weights = "cells", epred = TRUE)
weighted
#> x emmean lower.HPD upper.HPD
#> 0 0.541 0.460 0.614
#> 1 0.462 0.393 0.535
#>
#> Point estimate displayed: median
#> HPD interval probability: 0.95
weighted@grid
#> x .wgt.
#> 1 0 46
#> 2 1 54
```

It also does not appear possible to use the `at`

argument either to fix this, as the below code produces the same issue (although somewhat different estimates).

```
emmeans(object = mod, ~x, weights = "cells",
at = list(z = unique(df$z)), epred = TRUE)
```

# Expected grid

While none of the above appear to average over `z`

, I was expecting them to use a predictor grid like the below:

```
# What I was expecting:
grid_expected <- ref_grid(mod, weights = "cells")
grid_expected@grid
#> x z .wgt.
#> 1 0 1 15
#> 2 1 1 17
#> 3 0 2 13
#> 4 1 2 11
#> 5 0 3 9
#> 6 1 3 10
#> 7 0 4 9
#> 8 1 4 16
```

Rather than pass the grid I produced with `ref_grid`

above to `emmeans()`

, Iâ€™d like to understand what is going on with the other code.

Figured Iâ€™d ping @Russ_Lenth