Compare linear vs. nonlinear model fit

Hello everyone,

some colleagues and I were discussing how to decide whether a linear or a sigmoidal curve fits the data best. E.g., does the probability of catching fish increase in a linear or sigmoidal fashion depending on the number of worms used?
We thought about comparing a model with an identity link to a model with a logit link, but as far as I understand it, you cannot compare the fit of models with different link functions in a meaningful way? Plus, a sigmoidal function might look linear depending on the parameters of the function.
The other option would be polynomials/spline regression, but that seems pretty “heavy” considering that we know we want to compare linear vs. sigmoidal?

This sounds like a straightforward problem, but I couldn’t find an answer so far. Sorry if I overlooked something.
I’m grateful for any resources you can point me towards!

Thank you in advance
Juli

I’ve never heard this, but that might be my inexperience with model comparison. It strikes me though that the approach loo takes to comparison shouldn’t care whether the models’ internal structures differ, just that there be a log_prob for each of the same set of observables. @avehtari, true?
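To make that concrete, here is a minimal sketch of how two models could be compared once each one provides a pointwise predictive log density (elpd) for the same observations. The function name `elpd_difference` and the numbers are made up for illustration; in practice the pointwise values would come from a LOO implementation such as the loo package, not be typed in by hand. The standard error uses the usual paired-difference formula, sqrt(n) times the sample standard deviation of the pointwise differences, which I believe matches what loo reports, but treat that as an assumption.

```python
import math
import statistics

def elpd_difference(pointwise_a, pointwise_b):
    """Sum of pointwise elpd differences (a minus b) and its standard error.

    Both inputs must hold one value per observation, for the same
    observations in the same order. How each model is parameterised
    internally (identity link, logit link, ...) does not enter here.
    """
    if len(pointwise_a) != len(pointwise_b):
        raise ValueError("both models need pointwise values for the same observations")
    diffs = [a - b for a, b in zip(pointwise_a, pointwise_b)]
    elpd_diff = sum(diffs)
    # Paired comparison: sqrt(n) * sample sd of the pointwise differences.
    se_diff = math.sqrt(len(diffs)) * statistics.stdev(diffs)
    return elpd_diff, se_diff

# Hypothetical pointwise elpd values for six observations (placeholders only).
elpd_linear = [-0.70, -0.65, -0.80, -0.55, -0.90, -0.60]
elpd_sigmoid = [-0.60, -0.66, -0.70, -0.50, -0.75, -0.62]

diff, se = elpd_difference(elpd_sigmoid, elpd_linear)
print(f"elpd_diff (sigmoid - linear) = {diff:.2f}, se = {se:.2f}")
```

A positive difference favours the first model, but it only means much when it is large relative to its standard error.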

Oh, and as you say, there is a range of parameter values for which the sigmoid model is indistinguishable from a linear model, so one approach is to fit the sigmoid model and simply look at how much probability mass is in that linear range. No model comparison necessary.
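One way this could be sketched, assuming you already have posterior draws of a logistic curve's slope and midpoint: for each draw, measure how far the curve departs from its best straight-line fit over the observed range of x, and report the fraction of draws that stay within a tolerance. Everything named here (`max_deviation_from_line`, `linear_mass`, the tolerance of 0.02, the four hand-picked draws) is a hypothetical choice for illustration, and what counts as "linear enough" is a judgment you would have to make for your own problem.

```python
import math

def sigmoid(x, slope, midpoint):
    """Logistic curve giving a probability for predictor value x."""
    return 1.0 / (1.0 + math.exp(-slope * (x - midpoint)))

def max_deviation_from_line(slope, midpoint, xs):
    """Largest gap between the sigmoid and its least-squares line on xs."""
    ys = [sigmoid(x, slope, midpoint) for x in xs]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope of the best-fitting straight line
    a = y_bar - b * x_bar    # intercept of that line
    return max(abs(y - (a + b * x)) for x, y in zip(xs, ys))

def linear_mass(draws, xs, tol=0.02):
    """Fraction of posterior draws whose curve is within tol of a line."""
    flags = [max_deviation_from_line(s, m, xs) < tol for s, m in draws]
    return sum(flags) / len(flags)

# Observed predictor range, e.g. 0 to 10 worms in steps of 0.5 (assumed).
xs = [i * 0.5 for i in range(21)]

# Stand-in for real posterior draws of (slope, midpoint); replace with
# the draws extracted from your fitted model.
draws = [(0.05, 5.0), (0.1, 5.0), (1.5, 5.0), (2.0, 4.0)]

print(linear_mass(draws, xs))
```

With real draws, a fraction close to 1 would say the data cannot tell the sigmoid apart from a line over the observed range, and a fraction close to 0 would say the curvature is well supported.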

True.

True.

Thank you both! I especially like the approach of checking how much probability mass is in the linear range. This will probably give the most meaningful answer to the question.

BTW, you mentioned that domain expertise led you to the parametric form of the non-linear variant. If you didn’t have such prior knowledge and wanted to accommodate possibly non-linear effects more flexibly, you could use a Gaussian process (GP) or, if the data are too dense and GPs are too slow, a generalized additive model (GAM) as an approximation. With both GPs and GAMs, the models will revert to linearity if the data don’t support higher complexity.