Hi,

I’m trying to work out a way to do model comparison efficiently between very different models; details below.

I am currently building a model that takes the form of a hierarchical regression: lines are fit to a set of data, and the slope of each line is assumed to be drawn from some common distribution.

Unfortunately, each “line” in my dataset is not necessarily linear: it can be produced by a number of different generative models, which contain no information about the line slope. One such model fits circles or ellipses to the data instead of lines. In this case, the only parameter shared between the two models is the standard deviation of the noise on the data, \sigma. Effectively, the metric of comparison between the two models is something like the goodness of the total least squares fit of each (which is similar to the Akaike weight).

I am trying to come up with a scheme where the probability of a dataset being linear is determined and then taken into account at the higher stage of the hierarchy. Because the data can be generated by either the line or the circle model, the likelihood should just be the sum of their likelihoods; i.e., for a line slope \theta we get the posterior:

P(\theta|y) \propto P(\theta)\left(P(y|\theta,M_{line}) + P(y|\theta,M_{circ})\right)
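To make the summed likelihood concrete, here is a minimal numerical sketch in Python. The loglik_line function is a hypothetical stand-in for your actual line model, and log_p_y_circ stands for the precomputed constant log P(y|M_circ); since likelihoods are summed on the probability scale, the stable way to do it is log-sum-exp on the log scale:

```python
import numpy as np
from scipy.special import logsumexp

def loglik_line(y, x, theta, sigma):
    # Hypothetical line model: y ~ Normal(theta * x, sigma).
    # Returns the log-likelihood of the whole dataset.
    return np.sum(-0.5 * ((y - theta * x) / sigma) ** 2
                  - np.log(sigma) - 0.5 * np.log(2 * np.pi))

def log_mixture_lik(y, x, theta, sigma, log_p_y_circ):
    # log[ P(y|theta,M_line) + P(y|M_circ) ] computed stably,
    # where log_p_y_circ is the constant log P(y|M_circ) for this dataset.
    return logsumexp([loglik_line(y, x, theta, sigma), log_p_y_circ])
```

In Stan the same sum would be written with `log_sum_exp` in the model block, with the circle-model constant passed in as data.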

If we try to fit all the parameters of the model at once, we almost never achieve convergence, because each model represents a different mode. However, the marginalized equation above makes it clear that P(y|\theta,M_{circ}) does not depend on \theta, so it is a constant; if we can find its value, we can incorporate it into the model.

I was wondering whether there is a computationally efficient way of computing this constant (effectively P(y|M_{circ}) for each dataset y). I have a fairly efficient way of doing the fits for each model in Stan (around ~2 s for 10,000 draws of each). Would something like an Akaike weighting scheme work here, given that in the past I have generally found WAIC/LOO-CV estimates to be fairly unstable from run to run? Perhaps AIC would be the most stable weighting system in this case?
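If Akaike weights do turn out to be the right tool, they are cheap to compute once you have a maximized log-likelihood and a parameter count for each model. A minimal sketch (the inputs here are placeholders, not values from my models):

```python
import numpy as np

def akaike_weights(max_log_liks, n_params):
    # AIC_i = 2*k_i - 2*log L_i; the Akaike weight of model i is
    # w_i = exp(-delta_i / 2) / sum_j exp(-delta_j / 2),
    # where delta_i = AIC_i - min_j AIC_j.
    aic = 2 * np.asarray(n_params, dtype=float) - 2 * np.asarray(max_log_liks, dtype=float)
    delta = aic - aic.min()
    w = np.exp(-0.5 * delta)
    return w / w.sum()
```

Subtracting the minimum AIC before exponentiating keeps the computation numerically stable when the AIC values are large.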

Does anyone on this forum have suggestions for the best metric, or general approach, for tackling a problem like this?