Perform math/arithmetic function within brms model using imputed `mi()` values

Hi,

I am interested in using a summary measurement in a brms model. This measurement is calculated from several individual measurements in the dataset. A good example is the geometric mean of a series of measurement values, which is a method sometimes used in biology to describe the overall size of some object. For example:

library(tidyverse)
library(brms)

i2 <- iris |> 
  mutate(flower_gm = 
           (Sepal.Length * Sepal.Width * Petal.Length * Petal.Width)^(1/4))

head(i2)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species flower_gm
#> 1          5.1         3.5          1.4         0.2  setosa  1.495199
#> 2          4.9         3.0          1.4         0.2  setosa  1.424357
#> 3          4.7         3.2          1.3         0.2  setosa  1.406227
#> 4          4.6         3.1          1.5         0.2  setosa  1.438170
#> 5          5.0         3.6          1.4         0.2  setosa  1.498331
#> 6          5.4         3.9          1.7         0.4  setosa  1.945323

Created on 2024-06-13 with reprex v2.1.0

From this, we could do Petal.Length ~ flower_gm to see how well overall flower geometric mean describes petal length.
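
A quick base-R check of the geometric mean formula may be useful here. It computes the value two ways (fourth root of the product, and exp of the mean of the logs) and also shows what happens without parentheses around the exponent, since R parses `x^1/4` as `(x^1)/4`. The object names (`m`, `gm_root`, `gm_log`, `gm_wrong`) are just illustrative.

```r
# Matrix of the four flower measurements from the built-in iris data.
m <- as.matrix(iris[, c("Sepal.Length", "Sepal.Width",
                        "Petal.Length", "Petal.Width")])

# 1. Fourth root of the row-wise product; note the parentheses around 1/4.
gm_root <- unname(apply(m, 1, prod)^(1/4))

# 2. Equivalent form: exp of the mean of the logs.
gm_log <- unname(exp(rowMeans(log(m))))

# Without the parentheses, ^ binds tighter than /, so this is product / 4.
gm_wrong <- unname(apply(m, 1, prod)^1/4)

# The two correct forms agree up to floating-point tolerance.
all.equal(gm_root, gm_log)
```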

But what if I am missing some of the measurements needed to calculate the flower_gm value?

library(tidyverse)
i2.2 <- iris |> 
    mutate( 
      Sepal.Length = 
        ifelse(row_number() %in% c(1:4), NA,Sepal.Length))

head(i2.2)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1           NA         3.5          1.4         0.2  setosa
#> 2           NA         3.0          1.4         0.2  setosa
#> 3           NA         3.2          1.3         0.2  setosa
#> 4           NA         3.1          1.5         0.2  setosa
#> 5          5.0         3.6          1.4         0.2  setosa
#> 6          5.4         3.9          1.7         0.4  setosa

Of course I could always calculate flower_gm and keep NA’s in cells that are missing one or more of the 4 measurements. But, what I would love to do is estimate the missing Sepal.Length values, then calculate the geometric mean flower_gm for each draw, then run the model.

# Pseudocode: the third bf() is not valid brms syntax; it only shows the intent.
mod <- bf(Petal.Length ~ flower_gm) +
  # this estimates the missing Sepal.Length values
  bf(Sepal.Length | mi() ~ 0 + Species) +
  # this does the arithmetic to calculate flower_gm using each imputed
  # draw from the Sepal.Length | mi() ~ 0 + Species model, which I don't know how to do!
  bf(flower_gm = (Sepal.Length * Sepal.Width * Petal.Length * Petal.Width)^(1/4)) +
  set_rescor(FALSE)

fit_mod <- brm(mod, data = i2.2)

AFAIK, arithmetic functions cannot be done within a brms model call. Apart from using some other tool like mice to do the imputations separately, does anyone know of a way to do this type of calculation all in one model object? Is it even possible?
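
For comparison, the two-step route mentioned above (impute first, then compute the derived variable in each imputed dataset) might look roughly like the following. This is only a base-R sketch under simplifying assumptions: the imputation model is a plain `lm()` with species-specific means and normal noise, it ignores uncertainty in the estimated means and residual SD, and `n_draws`, `imp_fit`, and `imputed_list` are hypothetical names. A dedicated tool such as mice would do the imputation step more carefully.

```r
set.seed(1)

# Recreate the example data with the first four Sepal.Length values missing.
i2.2 <- iris
i2.2$Sepal.Length[1:4] <- NA

# Stand-in imputation model: species-specific mean (rows with NA are dropped).
imp_fit <- lm(Sepal.Length ~ 0 + Species, data = i2.2)
missing_rows <- which(is.na(i2.2$Sepal.Length))
pred_mean <- predict(imp_fit, newdata = i2.2[missing_rows, ])
resid_sd <- sigma(imp_fit)

n_draws <- 5  # hypothetical number of imputed datasets

imputed_list <- lapply(seq_len(n_draws), function(d) {
  dat <- i2.2
  # Draw plausible values for the missing Sepal.Length entries.
  dat$Sepal.Length[missing_rows] <-
    rnorm(length(missing_rows), mean = pred_mean, sd = resid_sd)
  # Recompute the derived variable within each imputed dataset.
  dat$flower_gm <- with(
    dat,
    (Sepal.Length * Sepal.Width * Petal.Length * Petal.Width)^(1/4)
  )
  dat
})
```

The resulting list of data frames could then be handed to `brms::brm_multiple(Petal.Length ~ flower_gm, data = imputed_list)`, which fits the model to each dataset and combines the draws. That still keeps imputation and modelling as separate steps, which is what I would like to avoid.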

Thanks!

Hello @jonnations, I’m not sure if this will be a satisfactory answer to the whole problem, but you can implement simple functions in the nonlinear syntax, i.e.

bf(flower_gm ~ (Sepal.Length * Sepal.Width * Petal.Length * Petal.Width)^(1/4), nl = TRUE)...

which is literally

$$\text{flower\_gm} = (\text{Sepal.Length} \cdot \text{Sepal.Width} \cdot \text{Petal.Length} \cdot \text{Petal.Width})^{1/4}$$

plus whatever response family you nominate for flower_gm.

It might be possible to then use the mi() syntax to estimate missing values within the linear predictors for those nonlinear parameters… but I think this form of problem (which is a joint model) will be easier to handle in Stan directly.
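
For reference, the joint model being described could be written down roughly as follows. This is a sketch under assumptions that are not stated in the thread: Gaussian likelihoods for both responses, and the symbols $\alpha$, $\beta_0$, $\beta_1$, $\sigma_{SL}$ and $\sigma_{PL}$ are introduced here purely for illustration.

```latex
\begin{aligned}
\text{Sepal.Length}_i &\sim \mathrm{Normal}\left(\alpha_{\text{Species}[i]},\ \sigma_{SL}\right) \\
\text{flower\_gm}_i &= \left(\text{Sepal.Length}_i \cdot \text{Sepal.Width}_i \cdot
  \text{Petal.Length}_i \cdot \text{Petal.Width}_i\right)^{1/4} \\
\text{Petal.Length}_i &\sim \mathrm{Normal}\left(\beta_0 + \beta_1\,\text{flower\_gm}_i,\ \sigma_{PL}\right)
\end{aligned}
```

Here the missing $\text{Sepal.Length}_i$ values are treated as parameters, so $\text{flower\_gm}_i$ is recomputed from the current imputed values at every iteration rather than being fixed in the data.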
