Model comparison in latent variable models

This is hard to say without seeing the model specification, but based on the links you provided, I’m assuming that the latent variable in your dataframe is just a column vector of NA values. LOO and other information criteria can’t be estimated when there is no observed data. These criteria measure out-of-sample prediction accuracy, and if the response variable is never actually observed, there’s no way to see how leaving out an observation affects the model’s predictions, since we don’t know the observed value to begin with.

The recommendation from the error message is to specify the response variable as something that has observed data, but that doesn’t seem to be what you’re interested in (i.e., you’re interested in what the model can tell you about the unobserved latent variable). Depending on your goal here, you might try a different parameterization of your model. For example, IRT models and latent growth curve models can be estimated as generalized (non-)linear mixed models, which is brms’ wheelhouse. Alternatively, if you just want to do Bayesian SEM, then you may check out alternative packages like blavaan that are built specifically for that purpose, or you could specify the model directly in Stan as discussed here.
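To illustrate the mixed-model route, here’s a sketch of a 2PL IRT model written as a nonlinear mixed model in brms. The data frame `d` and its columns (`resp`, `item`, `person`) are hypothetical placeholders for your data; with observed responses like this, `loo()` works as usual.

```r
library(brms)

# Hypothetical long-format data: one row per person-item response,
# with a binary `resp` and grouping factors `item` and `person`.
f <- bf(
  resp ~ exp(logalpha) * eta,          # discrimination * latent trait
  eta ~ 1 + (1 | item) + (1 | person), # easiness + person ability
  logalpha ~ 1 + (1 | item),           # item-specific discrimination
  nl = TRUE
)

fit <- brm(f, data = d, family = bernoulli("logit"))
loo(fit)  # now well-defined, because `resp` is observed
```

The point is that the latent person parameters become group-level effects rather than an NA response column, so information criteria have observed data to work with.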

There’s potentially something you could do with the posteriors to compare models. RMSEA-based fit indices, for example, can be computed from the posterior draws. My personal recommendation, though, would be to examine and compare posterior predictive checks: the best model is the one that best captures the data-generating process. It sounds like the goal right now is to figure out the best-fitting model, which is probably easier to do in a dedicated SEM package.
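For the posterior predictive route, brms makes this straightforward. A minimal sketch, assuming `fit1` and `fit2` are two fitted `brmsfit` objects for your candidate models:

```r
library(brms)

# Overlay densities of replicated data against the observed data,
# side by side for each candidate model
pp_check(fit1, type = "dens_overlay", ndraws = 100)
pp_check(fit2, type = "dens_overlay", ndraws = 100)

# Or compare how well each model reproduces a summary statistic
pp_check(fit1, type = "stat", stat = "mean")
pp_check(fit2, type = "stat", stat = "mean")
```

Whichever model’s replicated datasets look more like the observed data (across several checks, not just one) is the better candidate.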
