Pedagogical example: search for a "good" non-trivial unidentifiable model

Ben_Lambert · June 15, 2021, 8:47am

For teaching, I’m looking for some interesting (practically) unidentifiable models which, ideally, are quite simple. I have a couple of examples that I use (the Lotka-Volterra ODE model with Gaussian noise sampled sparsely, for example), but I’m sure there are better and simpler ones out there.

Specifically, I’m looking for a model + data where:

The model is relatively simple to explain
Ideally, the model has relatively few parameters, so can be compactly written down
Generating simulated data from the model is straightforward so that the class could do this as part of the exercise to show the system is unidentified
The reason for the lack of identifiability is non-trivial. So, for example, it’s not (say) a regression model where two parameters appear in the likelihood only as a product

I’m looking for an example that’d be good for an audience of early career researchers who are just beginning in Bayesian inference.

Does anyone have a favourite example here?

jsocolar · June 15, 2021, 3:20pm

Models with two random intercepts with similar but not-identical groups (e.g. nested random effects with few levels of the nested effect per level of the coarser effect), where the total variance is identified but the individual variances not so much. This can be usefully reparameterized by the total variance and a (weakly identified) parameter from zero to one that allocates that variance to one or the other grouping.
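A minimal R sketch of the reparameterization described above, under an extreme simplifying assumption: the two groupings coincide exactly, so only the total variance enters the marginal covariance. The helper names `split_variance` and `marginal_cov` are hypothetical, made up for this illustration.

```r
# Hypothetical helper: map (total variance, allocation share rho) to the two SDs.
split_variance <- function(total_var, rho) {
  stopifnot(total_var >= 0, rho >= 0, rho <= 1)
  c(sd_a = sqrt(rho * total_var), sd_b = sqrt((1 - rho) * total_var))
}

# Marginal covariance of y for two random intercepts plus residual noise.
marginal_cov <- function(group_a, group_b, sd_a, sd_b, sd_eps) {
  same_a <- outer(group_a, group_a, "==")
  same_b <- outer(group_b, group_b, "==")
  sd_a^2 * same_a + sd_b^2 * same_b + sd_eps^2 * diag(length(group_a))
}

# Assumption for illustration: grouping B is identical to grouping A.
group_a <- rep(1:4, each = 3)
group_b <- group_a

s1 <- split_variance(total_var = 2, rho = 0.1)
s2 <- split_variance(total_var = 2, rho = 0.9)
cov1 <- marginal_cov(group_a, group_b, s1[["sd_a"]], s1[["sd_b"]], sd_eps = 1)
cov2 <- marginal_cov(group_a, group_b, s2[["sd_a"]], s2[["sd_b"]], sd_eps = 1)

# Very different allocations, same implied covariance (up to rounding).
max(abs(cov1 - cov2))
```

With similar but not identical groupings the two covariances would differ slightly, which is why the allocation parameter ends up weakly rather than not at all identified.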

A more sophisticated application of this trick is used for the spatial versus nonspatial variances in some parameterizations of ICAR models.

mike-lawrence · June 15, 2021, 8:09pm

SEM-style latent variable models are notorious for needing identifiability constraints. Simplest:

#R code to generate

#means
z_mu = rnorm(1)
x_mu = rnorm(1)
y_mu = rnorm(1)

#SDs:
z_sigma = exp(rnorm(1))
x_sigma = exp(rnorm(1))
y_sigma = exp(rnorm(1))

# loading weights
z_x_beta = rnorm(1)
z_y_beta = rnorm(1)

#sample size
n = 100

#latent variable z
z = rnorm(n,z_mu,z_sigma)

#observed_variables
x = z*z_x_beta + rnorm(n,x_mu,x_sigma)
y = z*z_y_beta + rnorm(n,y_mu,y_sigma)

Then have them try to do inference on the *_mu, *_sigma & *_beta values as well as the latent z values.

Eventually it’ll become apparent that even with a high sample size, without identifiability constraints Stan’s sampler will struggle thanks to the multimodality. Then you can show that there are different kinds of constraints necessary.
The sign of the betas and the relative ordering of the z-values are interdependent (so a common approach is to fix the sign of one beta to positive).
The magnitude of the *_betas and z_sigma are interdependent (so a common approach is to fix z_sigma=1).
The values of the *_mus and *_betas are interdependent (so a common approach is to fix z_mu=0).
There are probably more interdependencies I’m missing (I’m still relatively new to SEM), and certainly other interesting identifiability constraint options than the ones I note parenthetically above.
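The three interdependencies above can be checked without any sampling. Because (x, y) is jointly Gaussian in the generating code, its mean vector and covariance matrix determine the likelihood, so a rough sketch is to compare those implied moments across parameter sets. The function `implied_moments` and the specific numbers are assumptions made for this illustration.

```r
# Implied mean and covariance of (x, y) after integrating out the latent z.
implied_moments <- function(p) {
  mean_xy <- c(x = p$z_x_beta * p$z_mu + p$x_mu,
               y = p$z_y_beta * p$z_mu + p$y_mu)
  cov_shared <- p$z_x_beta * p$z_y_beta * p$z_sigma^2
  cov_xy <- matrix(c(p$z_x_beta^2 * p$z_sigma^2 + p$x_sigma^2, cov_shared,
                     cov_shared, p$z_y_beta^2 * p$z_sigma^2 + p$y_sigma^2),
                   nrow = 2)
  list(mean = mean_xy, cov = cov_xy)
}

# Arbitrary illustrative values (not from the thread).
base <- list(z_mu = 0.5, z_sigma = 1.5, x_mu = -1, y_mu = 2,
             x_sigma = 0.7, y_sigma = 1.2, z_x_beta = 0.8, z_y_beta = -0.4)

# 1. Sign: flip both loadings and mirror the latent variable.
flipped <- modifyList(base, list(z_x_beta = -base$z_x_beta,
                                 z_y_beta = -base$z_y_beta,
                                 z_mu = -base$z_mu))

# 2. Scale: stretch z by k and shrink the loadings by 1 / k.
k <- 3
rescaled <- modifyList(base, list(z_sigma = k * base$z_sigma,
                                  z_mu = k * base$z_mu,
                                  z_x_beta = base$z_x_beta / k,
                                  z_y_beta = base$z_y_beta / k))

# 3. Location: shift z_mu by d and absorb the shift into x_mu and y_mu.
d <- 2
shifted <- modifyList(base, list(z_mu = base$z_mu + d,
                                 x_mu = base$x_mu - base$z_x_beta * d,
                                 y_mu = base$y_mu - base$z_y_beta * d))

all.equal(implied_moments(base), implied_moments(flipped))
all.equal(implied_moments(base), implied_moments(rescaled))
all.equal(implied_moments(base), implied_moments(shifted))
```

Each transformed parameter set implies the same distribution for the observed data, which is the sense in which the sign, scale and location of z need pinning down.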

mike-lawrence · June 15, 2021, 9:26pm

Oh, and multi-item ordinal-outcome models have some neat identifiability aspects.

bnicenboim · June 16, 2021, 6:59am

A categorical/multinomial regression? There’s even an example in the manual.
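For the categorical case, a small R sketch of where the problem comes from, assuming the usual softmax link: adding the same constant to every category's linear predictor leaves the category probabilities unchanged. The `softmax` helper and the scores are illustrative, not taken from the manual.

```r
softmax <- function(eta) {
  w <- exp(eta - max(eta))  # subtract the max for numerical stability
  w / sum(w)
}

eta <- c(a = 0.3, b = -1.2, c = 2.0)  # illustrative category scores
p_original <- softmax(eta)
p_shifted <- softmax(eta + 5)         # same shift applied to every category

all.equal(p_original, p_shifted)
```

A common constraint is to fix one category's score at zero so that the remaining scores are interpreted relative to it.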

Ben_Lambert · June 17, 2021, 3:06pm

Thanks @mike-lawrence @jsocolar @bnicenboim – those are all good options. My plan is to make a (pedagogical) repository with these, so will share it once it’s online.

Anyone else – still interested to hear of your interestingly unidentified models!

martinmodrak · June 17, 2021, 3:19pm

I have an IMHO neat sigmoid example at Identifying non-identifiability

spinkney · June 17, 2021, 3:21pm

Mixture models pose all sorts of identifiability problems.

This post exhibits a subtle one with just gaussians. If the means are close and the variances high enough, there’s nearly no way to figure out how many gaussians there should be. You can put “repulsive” priors to help a bit but then you need some “attractive” priors too! It just gets really unwieldy, even in a simple Gaussian mixture model.

See also @betanalpha Identifying Bayesian Mixture Models (betanalpha.github.io).
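A quick R sketch of the "close means, high variances" situation, with made-up numbers: a two-component mixture with means at -0.3 and 0.3 and unit SDs is compared against a single Gaussian with the same mean and variance. `mixture_density` is a hypothetical helper written for this illustration.

```r
# Density of a finite Gaussian mixture evaluated at x.
mixture_density <- function(x, weights, means, sds) {
  dens <- numeric(length(x))
  for (k in seq_along(weights)) {
    dens <- dens + weights[k] * dnorm(x, means[k], sds[k])
  }
  dens
}

x <- seq(-5, 5, by = 0.01)
two_comp <- mixture_density(x, weights = c(0.5, 0.5),
                            means = c(-0.3, 0.3), sds = c(1, 1))
# Single Gaussian matched on mean (0) and variance (1 + 0.3^2).
one_comp <- dnorm(x, mean = 0, sd = sqrt(1 + 0.3^2))

# Largest pointwise gap between the two densities on the grid.
max_gap <- max(abs(two_comp - one_comp))
max_gap
```

The gap is tiny relative to the height of either density, so a realistic sample has very little to say about whether one or two components generated it.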

js592 · June 17, 2021, 6:18pm

I think measurement error models that involve x_{obs} \sim N(x_{true}, \tau_x) produce some interestingly degenerate likelihoods
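One way to see such a degeneracy in R, under assumptions the post does not state: x_true is itself given a N(mu, sigma_true) population model and tau_x is treated as a standard deviation. Integrating out x_true then leaves a likelihood that depends only on sigma_true^2 + tau_x^2. The function name `marginal_loglik` is hypothetical.

```r
set.seed(1)
# Simulated observations; the true values here are arbitrary.
x_obs <- rnorm(50, mean = 2, sd = sqrt(1^2 + 0.5^2))

# Log-likelihood of x_obs after integrating out x_true ~ N(mu, sigma_true).
marginal_loglik <- function(x_obs, mu, sigma_true, tau_x) {
  sum(dnorm(x_obs, mean = mu, sd = sqrt(sigma_true^2 + tau_x^2), log = TRUE))
}

ll_a <- marginal_loglik(x_obs, mu = 2, sigma_true = 1, tau_x = 0.5)
ll_b <- marginal_loglik(x_obs, mu = 2, sigma_true = 0.5, tau_x = 1)

# Swapping the two scales leaves the log-likelihood unchanged.
c(ll_a, ll_b)
```

Without replicate measurements or an informative prior on tau_x, the data cannot say how much of the spread is measurement noise.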

mike-lawrence · June 17, 2021, 6:20pm

@bnicenboim do I recall correctly that diffusion models for response time data require identifiability constraints? Maybe LBA as well?

mike-lawrence · June 17, 2021, 6:24pm

Here’s another one: as described here, a Von Mises / uniform mixture for circular data has identifiability issues when p(uniform) is high and the Von Mises precision is low.
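A rough R sketch of the low-precision side of this: as the Von Mises precision kappa goes to zero the Von Mises component becomes uniform, so the mixing weight stops mattering. The helpers `dvonmises`, `mix_density` and `gap` are written for this illustration using base R's `besselI`; they are not from any circular-statistics package.

```r
# Von Mises density on the circle with location mu and precision kappa.
dvonmises <- function(theta, mu, kappa) {
  exp(kappa * cos(theta - mu)) / (2 * pi * besselI(kappa, nu = 0))
}

# Mixture of a uniform component (weight p_uniform) and a Von Mises component.
mix_density <- function(theta, p_uniform, mu, kappa) {
  p_uniform / (2 * pi) + (1 - p_uniform) * dvonmises(theta, mu, kappa)
}

theta <- seq(-pi, pi, length.out = 200)

# Largest difference in density between p_uniform = 0.2 and p_uniform = 0.9.
gap <- function(kappa) {
  max(abs(mix_density(theta, 0.2, 0, kappa) - mix_density(theta, 0.9, 0, kappa)))
}

gap(0)    # Von Mises component is exactly uniform: weight is unidentified
gap(0.1)  # low precision: weight barely changes the density
gap(5)    # high precision: weight clearly matters
```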

bnicenboim · June 17, 2021, 6:26pm

mike-lawrence (June 17): "@bnicenboim do I recall correctly that diffusion models for response time data require identifiability constraints?"

Yes, the scale. Also LBA.
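To illustrate the scale issue, here is a hedged R sketch using the textbook absorption probability for Brownian motion with drift between two barriers (a simple Wiener diffusion, not any particular package's parameterization). Multiplying the drift, the boundary separation and the noise SD by the same constant leaves the choice probability unchanged. The function name `p_upper` and the numbers are assumptions for illustration.

```r
# Probability of hitting the upper boundary before the lower one (at 0),
# for nonzero drift, starting at `start` (default: midway between boundaries).
p_upper <- function(drift, boundary, noise_sd, start = boundary / 2) {
  (1 - exp(-2 * drift * start / noise_sd^2)) /
    (1 - exp(-2 * drift * boundary / noise_sd^2))
}

p_base <- p_upper(drift = 0.8, boundary = 1.2, noise_sd = 1)

k <- 10  # rescale everything by the same constant
p_scaled <- p_upper(drift = 0.8 * k, boundary = 1.2 * k, noise_sd = k)

all.equal(p_base, p_scaled)
```

The same rescaling argument applies to the whole process, not just the choice probability, which is why one of these quantities is usually fixed by convention.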

mike-lawrence · June 17, 2021, 6:32pm

@Ben_Lambert here’s a paper whose intro has some refs on identifiability in drift-diffusion models.

jmh530 · June 23, 2021, 12:45am

If you fit a multivariate normal distribution with a single factor structure, then you have problems with identification (positive or negative). You have to constrain one of the factor variables to be greater than zero to identify the rest properly.

betanalpha · June 28, 2021, 2:34pm

One has to be careful to separate out technical non-identifiabilities (which obstruct almost all realized likelihood functions from contracting to a point with infinite data) from the more common complex uncertainties that arise in many models. To avoid confusion I refer to the latter as degeneracies (unidentifiable models lead to degeneracies, but not all degeneracies are due to unidentifiable models).

I discuss this terminology and review a variety of common sources of degeneracies (and how to investigate them with Stan) in Identity Crisis.

My case studies reviewing particular modeling techniques also review the degeneracies inherent to those models, for example:

Hierarchical Modeling (Section 3 and Section 4)

Ordinal Regression (Section 2.1)

Robust Gaussian Process Modeling (Section 3.2)

fbarraquand · June 30, 2021, 5:04am

If you want to dive into the topic you may be interested in a recent book on parameter identifiability. It has both trivial and non-trivial examples (the latter can be fairly complicated).

Regarding non-trivial yet not too complicated examples: any logistic-style population growth model fitted on time series that do not cover both very high and very low densities will have difficulties separating the maximum growth rate from other parameters (carrying capacity, speed of return to equilibrium,…) An example here
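One side of this can be sketched in R with the standard closed-form logistic growth curve: when the series only covers low densities, trajectories with very different carrying capacities are nearly indistinguishable. The function `logistic_growth` and all parameter values are illustrative assumptions, not taken from the linked example.

```r
# Closed-form solution of logistic growth: dN/dt = r * N * (1 - N / K).
logistic_growth <- function(time, n0, r, K) {
  K / (1 + (K / n0 - 1) * exp(-r * time))
}

# Observation window that only covers the early, low-density phase.
time_early <- seq(0, 5, by = 0.5)
traj_small_K <- logistic_growth(time_early, n0 = 1, r = 0.5, K = 1000)
traj_large_K <- logistic_growth(time_early, n0 = 1, r = 0.5, K = 5000)

# Largest relative difference between the two trajectories in the window.
max_rel_gap <- max(abs(traj_small_K - traj_large_K) / traj_large_K)
max_rel_gap
```

A fivefold change in K moves the curve by only a small fraction over this window, so with realistic observation noise the carrying capacity would be driven almost entirely by the prior.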

martinmodrak · July 1, 2021, 12:11pm

Unfortunately, the terminology (as noted by Mike) can be quite confusing - identifiability has a precise definition used in most of statistics, but in the Stan community (and possibly elsewhere) it is used more loosely. In the looser sense, “non-identifiability” is sometimes used to refer also to models that are identifiable in the technical sense, but where the amount/nature of data we currently have results in the likelihood being sufficiently similar for vastly different parameter values that it poses the same obstacles for computation as a strictly non-identifiable model would. Some people use “weakly identifiable” for this case, but I don’t think it is established usage.

I kind of like using the word “degenerate” for the broader class of problems we can encounter, but it has a similar problem since “degenerate posterior” already has an established meaning in statistics which is substantially different.

Terminology can definitely be changed, but I think that for the time being, if precision is desired, it is IMHO best to be a bit more verbose and say exactly what kind of problem one wants to talk about, e.g. “multimodality”, “uneven curvature”, “available data inform only some aspects of the model” etc. but that’s just my personal opinion :-)

Good luck with your teaching!

fbarraquand · July 1, 2021, 12:48pm

Here’s another simple suite of examples with e.g. fixed vs random intercepts in linear mixed models

Ben_Lambert · July 1, 2021, 3:32pm

Thanks all – great suggestions. I’m still leaving this open in case anyone else wants to contribute their examples.

nelsjohnson · July 1, 2021, 3:56pm

The intercept in case-control logistic regression.
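A short R check of why the intercept is the problem, assuming a population model logit P(y = 1 | x) = alpha + beta * x and outcome-dependent sampling with known-to-us but not-to-the-model sampling fractions. The function `sampled_logit` and the numbers are hypothetical.

```r
# Log-odds of being a case among sampled individuals, via Bayes' rule.
sampled_logit <- function(x, alpha, beta, frac_cases, frac_controls) {
  p <- plogis(alpha + beta * x)
  p_sampled <- frac_cases * p / (frac_cases * p + frac_controls * (1 - p))
  qlogis(p_sampled)
}

x <- c(-1, 0, 2)
alpha <- -3
beta <- 0.7

# Difference between sampled and population log-odds at each x.
shift <- sampled_logit(x, alpha, beta, frac_cases = 0.9, frac_controls = 0.01) -
  (alpha + beta * x)

shift                 # constant in x
log(0.9 / 0.01)       # equals the log ratio of sampling fractions
```

The slope survives the sampling scheme, but the intercept is offset by the log ratio of sampling fractions, so it cannot be recovered from the case-control data alone.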
