This post refers to a problem I described in a previous post. I have not been able to find a solution using the approach described there, so I would like to try a different approach. The problem:
I am running two complex multivariate models using a large dataset (N=4830). The models follow this format:
```r
f1 = as.formula('y1~1+var1+var2+var3+(1+var1+var2+var3|GroupID)')
f2 = as.formula('y2~1+var1+var2+var3+(1+var1+var2+var3|GroupID)')
f3 = as.formula('y1~1+var1+var2+(1+var1+var2|GroupID)')
f4 = as.formula('y2~1+var1+var2+(1+var1+var2|GroupID)')
model_of_interest = brm(f1+f2,data=df,iter=6000,chains=4,cores=4,family="gaussian",control=list(adapt_delta=0.9),save_all_pars=TRUE)
null_model = brm(f3+f4,data=df,iter=6000,chains=4,cores=4,family="gaussian",control=list(adapt_delta=0.9),save_all_pars=TRUE)
```
I would like to use `loo` to compare the models. However, running `brms::loo_subsample()` returns Pareto k diagnostic warnings, and I am not able to run `brms::loo_moment_match()` (error description here). Looking at the output of `loo_subsample()` for each of my models, the main issue is that `p_loo > p`, where `p` is the total number of parameters in the model. I am wondering whether `loo_subsample()` simply does not work well with complex multivariate models?
This brings me to my idea for a different approach: I am considering running each of the formulas (`f1`, `f2`, `f3`, `f4`) as a separate brms model and computing `loo` for each individually. If the `loo` diagnostics look good for each individual model, is there a way to combine the `elpd_loo` values for, e.g., `f1` and `f2` into a composite expected log pointwise predictive density? I could then compare the two composite values to see which model is better, and thereby work around `loo_subsample()`'s suboptimal behaviour with complex multivariate models.
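To make the idea concrete, here is a rough sketch of what I have in mind. This is only an illustration, not something I have verified: it assumes `loo()` runs cleanly on each separate fit, that both models are fit to the same N=4830 observations in the same row order, and that summing the pointwise columns of the two `loo` objects is a sensible way to build a composite elpd.

```r
library(brms)
library(loo)

# Fit each response variable as its own univariate model
# (same settings as the joint fits above, sketched)
m1 <- brm(f1, data = df, iter = 6000, chains = 4, cores = 4,
          family = "gaussian", control = list(adapt_delta = 0.9))
m2 <- brm(f2, data = df, iter = 6000, chains = 4, cores = 4,
          family = "gaussian", control = list(adapt_delta = 0.9))

loo1 <- loo(m1)
loo2 <- loo(m2)

# Sum the pointwise elpd contributions observation by observation;
# this assumes rows of the two pointwise matrices refer to the same
# observations of df, in the same order
pw <- loo1$pointwise[, "elpd_loo"] + loo2$pointwise[, "elpd_loo"]

composite_elpd <- sum(pw)                       # composite elpd_loo
composite_se   <- sqrt(length(pw)) * sd(pw)     # its standard error
```

The same two lines at the end would then be repeated for the `f3`/`f4` pair, and the two composite `elpd` values (with their standard errors) compared.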
Thanks, I appreciate any advice or feedback about this approach.