Hierarchical models not having a posterior mode

PhDemetri · May 13, 2022, 2:58pm

In this talk, Bob mentions that “a lot of models, like hierarchical models, don’t have posterior modes,” which makes MAP/penalized MLE a poor choice for inference.

Does anyone happen to have a citation for this claim that hierarchical models don’t have modes (or perhaps are multimodal, whatever Bob meant by this claim)?

maxbiostat · May 14, 2022, 2:24pm

@Bob_Carpenter @betanalpha

betanalpha · June 13, 2022, 5:51pm

It’s not so much that the posterior mode doesn’t exist but rather that it exists on a boundary of the model configuration space. This behavior invalidates most of the Bernstein-von Mises asymptotic guarantees of maximum a posteriori estimates, and hence a common motivation for these types of estimates.

This is straightforward to see if you look at a normal hierarchical model with “flat” prior density functions for the population location and scale:

\begin{align*} \pi(\theta_{k}, \mu, \tau) &= \pi(\theta_{k} \mid \mu, \tau) \, \pi(\mu, \tau) \\ &\propto \text{normal}( \theta_{k} \mid \mu, \tau). \end{align*}

In this case \pi(\theta_{k}, \mu, \tau) has a singular maximum as \tau \rightarrow 0 and \theta_{k} - \mu \rightarrow 0. Geometrically it’s the very bottom of the infamous funnel.
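A quick numerical check of that singular behavior, as a minimal sketch: it evaluates the log of \prod_{k} \text{normal}(\theta_{k} \mid \mu, \tau) with every \theta_{k} placed exactly at \mu and \tau shrinking toward zero. The number of groups `K`, the grid `taus`, and the function names are assumptions made up for illustration.

```python
import math

def log_normal_pdf(x, mu, sigma):
    """Log density of normal(x | mu, sigma), where sigma is a standard deviation."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2

def hier_log_density(thetas, mu, tau):
    """Log of prod_k normal(theta_k | mu, tau), with flat densities for mu and tau."""
    return sum(log_normal_pdf(theta, mu, tau) for theta in thetas)

K = 8      # hypothetical number of groups
mu = 0.0   # hypothetical population location
taus = [1.0, 1e-1, 1e-2, 1e-4, 1e-8]

# Put every theta_k exactly at mu (the neck of the funnel) and shrink tau.
funnel_values = [hier_log_density([mu] * K, mu, tau) for tau in taus]

for tau, value in zip(taus, funnel_values):
    print(f"tau = {tau:g}: log density = {value:.2f}")
```

Along this path the log density behaves like -K \log \tau plus a constant, so it grows without bound as \tau \rightarrow 0.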

The full posterior, however, also has to take into account the likelihood functions \pi(\tilde{y}_{n} \mid \theta_{k(n)}). The problem is that unless there is a lot of data and a reasonable number of groups, the likelihood function won’t be able to exclude that singular mode of the hierarchical model, which will then propagate to the posterior density function.
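To see how the singular mode can survive a likelihood, here is a rough sketch under simplifying assumptions that are not part of the post: one observation per group, y_{k} \sim \text{normal}(\theta_{k}, \sigma_{k}) with known \sigma_{k}, and flat densities for \mu and \tau. For fixed \tau the joint density can be maximized over \theta_{k} and \mu in closed form, which leaves a profile in \tau alone. The data values and function names are hypothetical.

```python
import math

def profile_log_posterior(tau, y, sigma):
    """Joint log posterior density maximized over theta and mu for a fixed tau > 0.

    Maximizing over each theta_k turns the two quadratic terms into
    -0.5 * (y_k - mu)^2 / (sigma_k^2 + tau^2); maximizing over mu then
    gives a precision-weighted mean. Additive constants are dropped.
    """
    weights = [1.0 / (s ** 2 + tau ** 2) for s in sigma]
    mu_hat = sum(w * yk for w, yk in zip(weights, y)) / sum(weights)
    fit = -0.5 * sum(w * (yk - mu_hat) ** 2 for w, yk in zip(weights, y))
    # The -K * log(tau) term comes from the K normal(theta_k | mu, tau) factors.
    return -len(y) * math.log(tau) + fit

# Hypothetical data: five groups, one noisy observation each.
y = [2.0, -1.0, 0.5, 1.5, -0.5]
sigma = [1.0, 1.0, 1.0, 1.0, 1.0]

taus = [10.0, 1.0, 0.1, 0.01, 0.001]
profile_values = [profile_log_posterior(tau, y, sigma) for tau in taus]

for tau, value in zip(taus, profile_values):
    print(f"tau = {tau:g}: profile log posterior = {value:.2f}")
```

The data-fit term stays bounded as \tau \rightarrow 0 while -K \log \tau does not, so in this toy setting the likelihood cannot remove the divergence.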

A prior model for \tau that explicitly excludes \tau = 0 will yield a better behaved maximum, but that’s feasible only when one actually has domain expertise that excludes homogeneous behavior amongst the groups. Even then asymptotics are often so far away that the performance of the maximum a posteriori estimator will not be great even though it has been moved away from the boundary.
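As a hedged illustration of such a prior, the sketch below adds a lognormal density for \tau to the same kind of toy model (one observation per group, known \sigma_{k}, flat density for \mu). The lognormal choice, its parameters `m` and `s`, and the data are all assumptions made for this example; the lognormal is used because its density vanishes at \tau = 0 faster than any power of \tau, which is what it takes to cancel the \tau^{-K} singularity in the joint density.

```python
import math

def log_lognormal_pdf(tau, m, s):
    """Log density of lognormal(tau | m, s) for tau > 0."""
    u = math.log(tau)
    return -u - math.log(s * math.sqrt(2.0 * math.pi)) - 0.5 * ((u - m) / s) ** 2

def profile_log_posterior(tau, y, sigma, m, s):
    """Joint log posterior maximized over theta and mu for fixed tau, up to a constant."""
    weights = [1.0 / (sk ** 2 + tau ** 2) for sk in sigma]
    mu_hat = sum(w * yk for w, yk in zip(weights, y)) / sum(weights)
    fit = -0.5 * sum(w * (yk - mu_hat) ** 2 for w, yk in zip(weights, y))
    return -len(y) * math.log(tau) + fit + log_lognormal_pdf(tau, m, s)

# Hypothetical data and hypothetical prior parameters.
y = [2.0, -1.0, 0.5, 1.5, -0.5]
sigma = [1.0, 1.0, 1.0, 1.0, 1.0]
m, s = 0.0, 1.0

# Grid over log(tau); the density is still the one for tau itself,
# the log scale is only a convenient way to space the grid points.
log_taus = [-15.0 + 0.01 * i for i in range(2001)]
objective = [profile_log_posterior(math.exp(u), y, sigma, m, s) for u in log_taus]

best = max(range(len(log_taus)), key=objective.__getitem__)
log_tau_hat = log_taus[best]
tau_hat = math.exp(log_tau_hat)

print(f"maximizing log(tau) = {log_tau_hat:.2f}, tau = {tau_hat:.5f}")
```

In this sketch the maximum is interior rather than on the boundary, but it sits near \log \tau = m - (K + 1) s^{2}, far below the center of the prior, which is consistent with the point that moving the mode off the boundary does not by itself make it a good estimate.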

Ideally one would confirm this by running an optimization in Stan or a similar tool and watching the optimizer head toward the boundary.
