How to pool brmsfit models fitted to imputed data when mice is not appropriate

Hi Community,

I have imputed data using the missRanger package (the mice package doesn’t work for my data). I then ran my brms model on each imputed dataset and tried to pool the results with pool() from the mice package, but sadly it doesn’t seem to work for brms. I need a function that can handle brmsfit objects and pool them.

Has anyone come across this?

Imputing function: 

library(missRanger)

# Create five imputed datasets, one per seed.
imputed <- lapply(3456:3460, function(x)
  missRanger(
    df,
    # Impute all columns except age, using all columns except id as predictors.
    formula = . - age ~ . - id,
    maxiter = 10,      # maximum number of iterations before it stops
    pmm.k = 3,         # predictive mean matching, for more natural imputations
    verbose = 1,       # how much info is printed to screen
    seed = x,          # integer seed to initialize the random generator
    num.trees = 200,
    returnOOB = TRUE,
    case.weights = NULL
  )
)

And then I run my model:

models_imputed <- lapply(imputed, function(x) 
  brm(formula = score  ~ 1 + cs(group) + age, data = x, family = acat("cloglog")))

models_pool <- pool(models_imputed)

I am not familiar with the missRanger package, but as long as it’s capable of returning its outputs as a list of data.frames (with each data.frame being one imputed dataset), the brm_multiple function (see the brms reference page “Run the same brms model on multiple datasets”) can be used to run the same model on all imputed datasets.

The combine argument (defaults to TRUE) merges the results into a single fit. If it is set to FALSE, brm_multiple returns the separate fits, which can then be pooled with the combine_models function (see the brms reference page “Combine Models fitted with brms”). I think this will also work after using lapply, as in your example. See the pages mentioned above and the brms vignette on missing data (“Handle Missing Values with brms”).

Combining results from multiply imputed Bayesian models is conceptually simple: the posterior draws from the different fits are simply pooled into one set of draws, so there is no need for Rubin’s rules or similar. Be aware that high Rhat values may be “false positives”, as discussed in the vignette, because the chains come from fits to slightly different imputed datasets. If I remember correctly, some brms functions that operate on the combined fit also only use the first fit.
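
To illustrate what “pooling the draws” means, here is a minimal base R sketch. It does not use brms at all: the draws are simulated stand-ins, and the parameter names (b_Intercept, b_age), the number of fits and the helper make_draws are all hypothetical. With real brmsfit objects, a comparable draws matrix could be obtained with as.matrix(fit).

```r
# Simulated stand-ins for posterior draws from three imputed-data fits.
# Each matrix has one row per draw and one column per parameter.
# (Hypothetical values; real draws would come from as.matrix(fit).)
set.seed(1)
make_draws <- function(center, n_draws = 1000) {
  cbind(
    b_Intercept = rnorm(n_draws, mean = center, sd = 0.15),
    b_age       = rnorm(n_draws, mean = 0.5, sd = 0.05)
  )
}
draws_list <- lapply(c(2.7, 2.8, 2.9), make_draws)

# Pooling: stack the draws from every imputed-data fit into one matrix.
pooled <- do.call(rbind, draws_list)

# Summarise the pooled posterior per parameter, as for a single fit.
pooled_summary <- t(apply(pooled, 2, function(d) {
  c(Estimate = mean(d), Est.Error = sd(d),
    quantile(d, probs = c(0.025, 0.975)))
}))
print(round(pooled_summary, 2))
```

The pooled intervals are wider than those of any single fit whenever the fits disagree, which is how the uncertainty due to imputation enters the result.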

Personally, I’ve previously used brm_multiple with combine = FALSE, checked adequacy of the fit and convergence in each separate dataset, and then pooled the results.
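
That workflow could look roughly like the sketch below. It assumes that imputed is the list of imputed data.frames from the question and reuses the question’s formula and family. The wrapper name pool_imputed_fits is hypothetical, and reading Rhat from summary(fit)$fixed is just one way of checking convergence per fit.

```r
# Sketch only: fit the same model to every imputed dataset, check each
# fit separately, then pool. `pool_imputed_fits` is a hypothetical helper.
pool_imputed_fits <- function(imputed) {
  # combine = FALSE returns a list with one brmsfit per imputed dataset.
  fits <- brms::brm_multiple(
    formula = score ~ 1 + cs(group) + age,
    data    = imputed,
    family  = brms::acat("cloglog"),
    combine = FALSE
  )

  # Largest Rhat among the population-level effects of each fit.
  max_rhat <- vapply(
    fits,
    function(fit) max(summary(fit)$fixed$Rhat, na.rm = TRUE),
    numeric(1)
  )
  print(max_rhat)

  # check_data = FALSE because the imputed datasets differ by design.
  brms::combine_models(mlist = fits, check_data = FALSE)
}

# Usage (requires brms and a working Stan toolchain):
# combined <- pool_imputed_fits(imputed)
# summary(combined)
```

Checking each fit before pooling keeps real convergence problems apart from the inflated Rhat values that can appear once fits to different imputed datasets are combined.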


Thank you, this is great and a really good explanation!!!

Sadly this didn’t work in the end.

Using lapply as in my example gives exactly the same outcome as using brm_multiple, so they work the same way. However, I am not left with a single object I can summarise:

model <- lapply(imputed, function(x)
  brm(formula = score ~ 1 + cs(group), data = x, family = acat("cloglog"), chains = 1))

which gave me a list of 5. Element no. 5 of the list, for example:
[[5]]
 Family: acat
  Links: mu = cloglog; disc = identity
Formula: score ~ 1 + cs(group)
   Data: x (Number of observations: 2638)
Samples: 1 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 1000

Population-Level Effects:
             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1]     2.83      0.16     2.53     3.13 1.00      161      397
Intercept[2]    -0.57      0.08    -0.72    -0.41 1.00      166      381
group2[1]        0.83      0.20     0.45     1.20 1.00      176      423
group2[2]       -0.48      0.11    -0.70    -0.26 1.00      231      466
group3[1]        0.86      0.19     0.48     1.23 1.00      189      338
group3[2]       -0.46      0.11    -0.65    -0.23 1.00      201      385

Family Specific Parameters:
     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc     1.00      0.00     1.00     1.00 1.00     1000     1000

Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

But whenever I try to pool… it comes up with an error:
Error: The model does not contain group-level effects.

I also tried the combined fit in brms, which doesn’t work either.

But maybe there is a way to pool manually?

While combine = TRUE did not run, I ended up running brm_multiple with combine = FALSE and then ran

combined <- combine_models(mlist = models_imputed, check_data = FALSE)

and it worked! Success
