This post refers to a problem I have addressed in a previous post. I have not been able to find a solution using the approach described in that post, so I would like to try a different approach. The problem:
I am running two complex multivariate models using a large dataset (N=4830). The models follow this format:
f1 = as.formula('y1~1+var1+var2+var3+(1+var1+var2+var3|GroupID)') f2 = as.formula('y2~1+var1+var2+var3+(1+var1+var2+var3|GroupID)') f3 = as.formula('y1~1+var1+var2+(1+var1+var2+|GroupID)') f4 = as.formula('y2~1+var1+var2+(1+var1+var2+|GroupID)') model_of_interest = brm(f1+f2,data=df,iter=6000,chains=4,cores=4,family="gaussian",control=list(adapt_delta=0.9),save_all_pars=TRUE) null_model = brm(f3+f4,data=df,iter=6000,chains=4,cores=4,family="gaussian",control=list(adapt_delta=0.9),save_all_pars=TRUE)
I would like to use
loo to compare the models. However, running
brms::loo_subsample() returns pareto k diagnostic warnings and I am not able to run
brms::loo_moment_match() (error description here). Looking at the output of
loo_subsample for each of my models, the main issue is that
p_loo > p, where
p is the total number of parameters in the model. I am wondering whether it may be the case that
loo_subsample does not work well with complex multivariate models?
This brings me to my idea for a different approach: I am considering running each of the formulas (
f1, f2, f3, f4) as separate brms models, and computing
loo for each individually. If the
loo diagnostics look good for each individual model, is there a way I could combine
elpd_loo for e.g.
f2 to make a composite expected log pointwise predictive density? Then perhaps I could compare the two composite
loo values to see which model is better. This way I could get around
loo_subsample's suboptimal behaviour with complex multivariate models.
Thanks, I appreciate any advice or feedback about this approach.