Question regarding the handling of missing data in brms

jonasdora · May 31, 2021, 9:42pm

Hi all,

I am planning an individual participant data meta-analysis in brms, in which daily observations are nested in participants, which are further nested in studies - the syntax looks something like this:

brm(
  bf(
    y ~ 0 + Intercept + covariate + x1 + x2 + x1:x2 +
      (1 + x1 | study:pid) + (1 + x1 + x2 + x1:x2 | study),
    hu ~ 0 + Intercept + covariate + x1 + x2 + x1:x2 +
      (1 + x1 | study:pid) + (1 + x1 + x2 + x1:x2 | study)
  ),
  data = metadata, family = hurdle_negbinomial(),
  prior = metaPriors, sample_prior = TRUE,
  iter = 3000, chains = 4, backend = "cmdstanr", threads = threading(7)
)

As you can tell, the dependent variable is a zero-inflated count variable. x1 is a continuous within-participant predictor and x2 is a continuous between-participant predictor. Both have missing values: x1 is assessed in all studies, while x2 is assessed in only a subset of studies (roughly 50-60%).

My question is how to handle this missingness in the model. Specifically, when estimating the x1 effect, how do I avoid losing the data from studies that did not assess x2 to listwise deletion? My only solution right now is to run a separate model that leaves out x2 entirely. I don’t think multiple imputation is feasible here given the complexity of the outcome and model: the model will already run for multiple days on a supercomputing cluster because of the large amount of data, so imputation would make the computing time explode. Is there a way in brms to avoid listwise deletion, or possibly to treat the missing data as parameters?
Help would be greatly appreciated!

Best,
Jonas

andrjohns · June 1, 2021, 12:46am

Have you looked at the brms vignette on missing data, “Handle Missing Values with brms”?

I believe the “Imputation During Model Fitting” section is what you’re after.
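
For reference, the approach in that section looks roughly like the sketch below. It uses the nhanes toy data from the mice package and the variable names from the vignette, not the model in this thread; the `fit_models` switch is a made-up guard so the formula can be inspected without compiling anything. Each variable with missing values gets its own sub-model marked with `| mi()`, and it enters other formulas as `mi(variable)`.

```r
library(brms)

# Toy data shipped with the mice package (stand-in, not the thread's data).
data("nhanes", package = "mice")

# bmi and chl both contain NAs. Each gets a "| mi()" sub-model,
# and chl enters the bmi formula as mi(chl).
bform <- bf(bmi | mi() ~ age * mi(chl)) +
  bf(chl | mi() ~ age) +
  set_rescor(FALSE)

# Hypothetical switch: set to TRUE to compile and sample (needs a Stan toolchain).
fit_models <- FALSE
if (fit_models) {
  fit <- brm(bform, data = nhanes)
  print(summary(fit))
}
```

Both sub-models here use the default gaussian family, so the imputed values are continuous.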

jonasdora · June 1, 2021, 4:06am

When I tried that I got the following error:

Error: Argument 'mi' is not supported for family 'hurdle_negbinomial(log)'.

So I assumed that this form of imputation would also not work for this model.

andrjohns · June 1, 2021, 5:10am

Ah, I missed that you had a count outcome. In that case, there is no way to account for missingness in brms. This is because brms handles missingness by treating each missing value as a parameter to be estimated, and Stan cannot estimate discrete parameters. Your only option here (that I’m aware of, happy to be corrected) would be to impute the missing count data externally.
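
A minimal sketch of what external imputation could look like, assuming the mice package for the imputation step and `brm_multiple()` from brms for fitting. The nhanes toy data and the simple formula are stand-ins borrowed from the brms missing-data vignette, not the hurdle model above, and `fit_models` is a made-up guard so the imputation can be run without compiling a Stan model.

```r
library(mice)

# Toy data shipped with mice (stand-in, not the thread's data).
data("nhanes", package = "mice")

# Create m = 5 completed copies of the data with mice's default methods.
imp <- mice(nhanes, m = 5, printFlag = FALSE, seed = 1)

# Inspect the first completed dataset.
first_completed <- complete(imp, 1)
print(colSums(is.na(first_completed)))

# Hypothetical switch: set to TRUE to fit (needs brms and a Stan toolchain).
fit_models <- FALSE
if (fit_models) {
  # brm_multiple() fits the model once per imputed dataset and combines the draws.
  fit_imp <- brms::brm_multiple(bmi ~ age * chl, data = imp, chains = 2)
  print(summary(fit_imp))
}
```

Note that the model is fitted once per imputed dataset, so the total run time grows roughly in proportion to `m`, which is the cost raised in the original question.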
