Using mi() to generate same imputed value for all levels of a group

mcm315 · March 24, 2020, 10:33pm

I’m attempting to model the ratings of five raters by predictors related to each subject while imputing subject-level predictors. Below is a fictitious example that may illustrate the problem.

Suppose we have n subjects (ID = 1:n) with age X (X[1]:X[n]) and weight Y (Y[1]:Y[n]). Five raters record each subject's blood pressure (Z), and we would like to model the effect of age and weight on blood pressure while also quantifying the between-rater variability. Now suppose some weights are missing. The top lines of the data (e.g., data_ex) might look like

Observation   ID     X       Y       Z       Rater
1             1      26      135     118     1
2             1      26      135     116     2
3             1      26      135     122     3
4             1      26      135     120     4
5             1      26      135     114     5
6             2      20      170     133     1
7             2      20      170     130     2
8             2      20      170     120     3
9             2      20      170     125     4
10            2      20      170     140     5
11            3      23      NA      100     1
12            3      23      NA      110     2
13            3      23      NA      113     3
14            3      23      NA      105     4
15            3      23      NA      108     5

What I am attempting to do is model Z ~ X + Y but I would like to impute missing values of Y. Supposing Y is dependent upon X, I am modeling the following (ignoring subject-level random effects for simplicity):

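library(brms)
# Joint model: Z depends on the (partly imputed) Y, and missing Y is imputed from X
bf_mod <- bf(Z | mi() ~ mi(Y) + X) +
  bf(Y | mi() ~ X) + set_rescor(FALSE)
fit <- brm(bf_mod, data = data_ex)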

The issue is that each subject has five observations, one from each rater, so the imputed weight at a given iteration must be the same for all observations from that subject. The code above will instead draw a separate imputed value from the posterior predictive distribution for each of observations 11 to 15, even though those values come from the same subject. I am at a loss as to how to implement such a model in brms. In Stan I would typically pass vectors of different lengths

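vector[N] X;                            // age, one value per subject
vector[N] Y;                            // weight, one value per subject
vector[Total] Z;                        // blood pressure, Total = N * 5
array[Total] int<lower=1, upper=N> ID;  // subject ID for each BP measurement (integers, so they can be used as indices)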

impute the weight based on age, and then pull the value matching each ID into the equation for BP, using something like the following in the transformed parameters block:

vector[Total] Z_hat;
for(i in 1:Total)
  Z_hat[i] = b0 + b1 * Y[ID[i]] + b2 * X[ID[i]];

Any thoughts on how to go about indexing something of this nature in brms?

paul.buerkner · March 29, 2020, 11:21am

This is not currently supported in brms, and I haven’t yet gotten my head around how to allow this nicely. me() terms already have a similar grouping feature, and mi() terms might eventually use this syntax as well.

The nested indexing approach is indeed the most reasonable option we have to express this at the Stan level.
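
For readers who want to try it, here is a minimal sketch of that nested indexing approach as a complete Stan program. It is only an illustration under stated assumptions, not what brms generates: the names N_obs, N_mis, idx_obs, idx_mis, Y_obs, Y_mis and the coefficient names are made up for this example, both outcomes are assumed Gaussian, priors are left out for brevity, and rater effects are ignored just as in the question. It uses the array[...] syntax of recent Stan versions.

```stan
data {
  int<lower=1> N;                              // number of subjects
  int<lower=0> N_obs;                          // subjects with observed weight
  int<lower=0> N_mis;                          // subjects with missing weight
  array[N_obs] int<lower=1, upper=N> idx_obs;  // subject positions of observed weights (assumed name)
  array[N_mis] int<lower=1, upper=N> idx_mis;  // subject positions of missing weights (assumed name)
  vector[N] X;                                 // age, one value per subject
  vector[N_obs] Y_obs;                         // observed weights only
  int<lower=1> Total;                          // number of ratings (N * 5 in the example)
  array[Total] int<lower=1, upper=N> ID;       // subject ID for each rating
  vector[Total] Z;                             // blood pressure ratings
}
parameters {
  vector[N_mis] Y_mis;                         // one imputed weight per subject, not per rating
  real a0;                                     // weight model: intercept
  real a1;                                     // weight model: age slope
  real<lower=0> sigma_Y;
  real b0;                                     // blood pressure model: intercept
  real b1;                                     // blood pressure model: weight slope
  real b2;                                     // blood pressure model: age slope
  real<lower=0> sigma_Z;
}
transformed parameters {
  vector[N] Y;                                 // subject-level weights, observed and imputed combined
  Y[idx_obs] = Y_obs;
  Y[idx_mis] = Y_mis;
}
model {
  // Priors omitted for brevity; add ones suited to the scale of your data.
  Y ~ normal(a0 + a1 * X, sigma_Y);            // imputation model at the subject level
  // Y[ID] and X[ID] expand subject-level values to the rating level,
  // so all five ratings of a subject share the same imputed weight.
  Z ~ normal(b0 + b1 * Y[ID] + b2 * X[ID], sigma_Z);
}
```

Because Y_mis has one element per subject with a missing weight, each posterior draw carries a single imputed weight for that subject, and Y[ID] reuses it for all of that subject's ratings.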

mcm315 · March 30, 2020, 9:56pm

Thank you very much for your response; I really appreciate it!
