Biased posteriors in unidentified models

bnicenboim · April 26, 2021, 8:58am

Hi all,
This is a general question, not strictly about a specific Stan model. (But it’s a spin-off question from this).

Can an unidentified complex model produce biased posterior distributions? That is, I generate data with known true values for the parameters and I get (overly) precise 95% credible intervals (CrIs) that exclude those true values. (Or does this mean that I have an error somewhere else in the model?)

I understand that the most common scenario would be that the model doesn’t converge because the posterior has many modes and the chains get stuck in different ones. Or, if I have good priors, the posteriors of the unidentified model would be more or less the priors. But I have faced this problem of biased posteriors a couple of times (always with super complex models, last time here), and I couldn’t figure out what was wrong.

Bruno

jsocolar · April 26, 2021, 2:55pm

You might be able to troubleshoot this by checking whether the incorrect estimation persists even if you initialize your chains at the known true values. If multiple chains so initialized converge to the wrong values, then it’d at least be worth undertaking an extra-careful search for bugs. If (at least some of) these chains find the correct value, then it seems very likely to be a problem with either unidentifiability or multimodality.

Edit: Or if the sampler diagnostics are bad it could just be a problem with the parameterization.
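
A minimal sketch of the check described above, assuming a toy model that is not from this thread: the data depend on `mu` only through `mu**2`, so `+mu` and `-mu` fit equally well and the posterior has two modes. The hand-rolled random-walk Metropolis sampler, the names `TRUE_MU`, `simulate`, `log_post`, and `metropolis`, and all tuning values are hypothetical stand-ins for a real Stan fit. One chain starts at the known true value and one starts far away.

```python
import math
import random

TRUE_MU = 2.0  # hypothetical "known true value" used to simulate the data


def simulate(n, rng):
    # Data depend on mu only through mu**2, so +mu and -mu are indistinguishable.
    return [rng.gauss(TRUE_MU ** 2, 1.0) for _ in range(n)]


def log_post(mu, y):
    # Normal(0, 10) prior on mu plus Normal(mu**2, 1) likelihood, up to a constant.
    log_prior = -0.5 * (mu / 10.0) ** 2
    log_lik = -0.5 * sum((yi - mu ** 2) ** 2 for yi in y)
    return log_prior + log_lik


def metropolis(y, init, rng, n_warmup=1000, n_draws=4000, step=0.05):
    # Plain random-walk Metropolis; returns post-warmup draws of mu.
    mu, lp = init, log_post(init, y)
    draws = []
    for i in range(n_warmup + n_draws):
        prop = mu + rng.gauss(0.0, step)
        lp_prop = log_post(prop, y)
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            mu, lp = prop, lp_prop
        if i >= n_warmup:
            draws.append(mu)
    return draws


rng = random.Random(1)
y = simulate(50, rng)

# One chain initialized at the truth, one initialized on the other side of zero.
inits = {"at_truth": TRUE_MU, "far_away": -3.0}
means = {}
for name, init in inits.items():
    draws = metropolis(y, init, rng)
    means[name] = sum(draws) / len(draws)
    print(f"{name}: init={init:+.1f}, posterior mean of mu={means[name]:+.3f}")
```

In this sketch the chain started at the truth stays near it, while the other chain settles in the mirror-image mode, which points to multimodality rather than a bug. With Stan itself, the analogous step would be passing the true values as initial values through whatever `init` mechanism your interface provides.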

betanalpha · April 26, 2021, 10:28pm

First and foremost, even when fitting the true model there is no general guarantee that any single posterior will have any relationship to the true value. Having a model where nearly every posterior distribution is close to the true value is a nice property to have, but it’s one that has to be verified in every application (for example with Bayesian calibration).
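
To make the calibration idea concrete, here is a rough sketch of one simple check, assuming a conjugate normal model chosen only because its posterior is available in closed form (it is not the model under discussion, and it is not necessarily the exact procedure meant above). True values are drawn from the prior, data are simulated, and we count how often the 95% credible interval contains the truth. The function `posterior` and all constants are hypothetical.

```python
import math
import random


def posterior(y, prior_sd=1.0, noise_sd=1.0):
    # Conjugate update for mu ~ Normal(0, prior_sd), y_i ~ Normal(mu, noise_sd).
    precision = 1.0 / prior_sd ** 2 + len(y) / noise_sd ** 2
    mean = (sum(y) / noise_sd ** 2) / precision
    return mean, math.sqrt(1.0 / precision)


rng = random.Random(2021)
n_sims, n_obs = 2000, 10
z95 = 1.959964  # standard normal quantile for a central 95% interval

covered = 0
for _ in range(n_sims):
    mu_true = rng.gauss(0.0, 1.0)                       # draw truth from the prior
    y = [rng.gauss(mu_true, 1.0) for _ in range(n_obs)]  # simulate data
    m, s = posterior(y)
    covered += (m - z95 * s <= mu_true <= m + z95 * s)   # does the CrI cover it?

coverage = covered / n_sims
misses = n_sims - covered
print(f"coverage of 95% CrI: {coverage:.3f} ({misses} of {n_sims} intervals miss)")
```

Even for this correctly specified and fully identified model, some individual intervals exclude the true value, so a single miss is weak evidence of bias. Coverage far below the nominal level across many simulated datasets is the stronger signal.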

If the observational model is non-identified, such that no possible likelihood function concentrates around a single point, no matter how large the observation, then computational problems are always an additional issue. It’s hard to accurately quantify a posterior that stretches to infinity with only finite computation!
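
As a small worked illustration of a likelihood that does not concentrate, consider an assumed toy model (not one from this thread) with y_i ~ Normal(a + b, sigma) and independent Normal(0, tau) priors on `a` and `b`. The data inform only the sum `a + b`, so the posterior for `a` alone stays wide however many observations arrive. The closed-form variances below follow from the fact that `a + b` and `a - b` are independent under this prior; the function name `posterior_sds` and the values of `tau` and `sigma` are hypothetical.

```python
import math


def posterior_sds(n, tau=10.0, sigma=1.0):
    # Prior variance of both (a + b) and (a - b) when a, b ~ Normal(0, tau).
    prior_var = 2.0 * tau ** 2
    # Only the sum is updated by the n observations.
    var_sum = 1.0 / (1.0 / prior_var + n / sigma ** 2)
    # a = ((a + b) + (a - b)) / 2, and the difference keeps its prior variance.
    var_a = (var_sum + prior_var) / 4.0
    return math.sqrt(var_sum), math.sqrt(var_a)


results = {n: posterior_sds(n) for n in (10, 1000, 100000)}
for n, (sd_sum, sd_a) in results.items():
    print(f"n={n:>6}: sd(a + b | y)={sd_sum:.4f}, sd(a | y)={sd_a:.4f}")
```

The posterior standard deviation of the sum shrinks with `n`, while that of `a` levels off at `tau / sqrt(2)`. With flat priors the ridge along `a + b = const` would extend without bound, which is the situation a finite sampler cannot fully explore.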
