Network of models

Hi everyone! This is kind of off-topic/random, but it relates to ideas about modeling and model selection, with potential applications for Stan. I put it in “general” for lack of a better home.

@andrewgelman has discussed the idea of a network of models previously. This paper, which was mentioned at Rstatsnyc, touches on a similar idea. The authors take a variety of models (plus sets of hyperparameters) and datasets, factorize the resulting matrix of mean-squared-error results, and then use Bayesian optimization to find the “best” model. It seems like a similar approach could be used in building up Stan models, perhaps combined with the forthcoming Posteriordb to think about similarities between models?

Thanks for the reference. Just to clarify: when I speak of the network of models, I’m specifically not talking about model choice or model averaging. Rather, I’m talking about navigating through the network to better understand the models we finally end up with. Model choice and model averaging are fine too. The reason I’ve made such a big deal about the network of models is that it’s a different thing.

That’s where the embeddings idea seemed interesting to me. There are a variety of choices one can make in fitting models, choosing priors, making structural assumptions, pre-processing data, etc. An embedding seems like an interesting way of comparing how similar models are under different choices. The “model choice” part seemed like an interesting second-order effect of that embedding.
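
To make the embedding idea a bit more concrete, here is a rough sketch of how one might embed models from a models-by-datasets matrix of errors and then compare them. This is only an illustration under loud assumptions: it uses a plain truncated SVD rather than whatever factorization the paper actually uses, the error matrix is synthetic (imagined as log-MSE values), and the names `model_embeddings` and `cosine_similarity` are made up for this post.

```python
import numpy as np


def model_embeddings(errors, rank=2):
    """Embed each model (row) via a truncated SVD of the error matrix.

    `errors` is a hypothetical (n_models, n_datasets) array, e.g. log-MSE.
    Columns are centered so that overall dataset difficulty does not
    dominate the embedding.
    """
    centered = errors - errors.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :rank] * s[:rank]


def cosine_similarity(emb):
    """Pairwise cosine similarity between embedding rows."""
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.clip(norms, 1e-12, None)
    return unit @ unit.T


# Synthetic stand-in for real results: 6 models, 8 datasets,
# generated from 2 latent factors plus a little noise.
rng = np.random.default_rng(0)
n_models, n_datasets = 6, 8
latent_models = rng.normal(size=(n_models, 2))
latent_data = rng.normal(size=(n_datasets, 2))
errors = latent_models @ latent_data.T + 0.05 * rng.normal(size=(n_models, n_datasets))

# Pretend model 1 is model 0 with a slightly different prior:
# its errors are almost the same across all datasets.
errors[1] = errors[0] + 0.01 * rng.normal(size=n_datasets)

emb = model_embeddings(errors, rank=2)
sim = cosine_similarity(emb)
print(np.round(sim, 2))
```

In a setup like this, models 0 and 1 should land close together in the embedding, which is the "how similar are these models" reading. Picking a “best” model would be a separate step on top of it.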

Yup, it’s part of the story.