MLE for multimodal likelihood: frequentist framework

I believe there are some fundamental misconceptions about Bayesian and frequentist modeling at play here.

In frequentist modeling one specifies an observational model, \pi(y; \theta), and introduces estimators, functions from the observational space to the parameter space, \hat{\theta}: Y \rightarrow \Theta, along with a loss function L(\hat{\theta}, \theta) that quantifies how useful an estimator is when \theta identifies the true data generating process. A frequentist analysis then calibrates the estimator by computing the worst-case expected loss. At least, a frequentist analysis tries to perform such a calibration; in practice it is often too computationally demanding for nontrivial observational models, estimators, or loss functions.
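To make this concrete, here is a minimal sketch of that calibration step. The normal observational model, the sample-mean estimator, the squared-error loss, and the parameter grid are all illustrative choices of mine, not anything prescribed above:

```python
# Sketch of frequentist calibration: Monte Carlo estimate of the
# worst-case expected loss of an estimator over a grid of candidate
# true parameter values. All modeling choices here are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_sims, sigma = 20, 5_000, 1.0

def estimator(y):
    # \hat{\theta}: Y -> \Theta, here simply the sample mean
    return y.mean(axis=-1)

def expected_loss(theta):
    # Monte Carlo estimate of E[L(\hat{\theta}, \theta)] under \pi(y; \theta),
    # with L the squared-error loss
    y = rng.normal(theta, sigma, size=(n_sims, n_obs))
    return np.mean((estimator(y) - theta) ** 2)

theta_grid = np.linspace(-5, 5, 41)
worst_case = max(expected_loss(t) for t in theta_grid)
print(f"worst-case expected squared-error loss: {worst_case:.4f}")
```

Even in this toy setting the cost is n_sims times the grid size; with realistic models, estimators, or loss functions that expense is exactly what makes the full calibration impractical.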

Evaluating the observational model at an observed measurement, \tilde{y}, yields the likelihood function, \pi(\tilde{y}; \theta). The parameter values that maximize the likelihood function define the maximum likelihood estimator. Under very specific conditions the maximum likelihood estimator can be approximately calibrated: it is unbiased, intervals around the maximum likelihood estimate have nice coverage properties, and so on.
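Numerically this just means maximizing \pi(\tilde{y}; \theta) in \theta; a quick sketch, again with a stand-in normal model of my choosing:

```python
# Sketch of computing a maximum likelihood estimate: evaluate the
# observational model at the observed data and maximize the resulting
# log likelihood over theta. The normal model is purely illustrative.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

rng = np.random.default_rng(1)
y_tilde = rng.normal(2.0, 1.0, size=30)  # observed measurements

def neg_log_likelihood(theta):
    # -log pi(y_tilde; theta) for a normal model with known scale
    return -norm.logpdf(y_tilde, loc=theta, scale=1.0).sum()

result = minimize_scalar(neg_log_likelihood)
print(f"maximum likelihood estimate: {result.x:.3f}")
```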

One necessary condition for the maximum likelihood estimator to be (approximately) calibrated is that the likelihood function concentrates in a single neighborhood. In other words, seeing multiple modes indicates that any such calibration is invalid. You can still compute a maximum likelihood estimate, or at least try to; it just won't have any of the expected behavior.
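A toy demonstration of the problem, using a symmetric normal mixture I picked because its likelihood in a single location parameter has two modes; a local optimizer simply converges to whichever mode is closer to its initialization:

```python
# Sketch of a multimodal likelihood: a mixture
# 0.5 * N(theta, 1) + 0.5 * N(-theta, 1) is symmetric in theta,
# so the likelihood has modes at both signs of the true separation.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(2)
# data generated with the component locations at +3 and -3
y = np.concatenate([rng.normal(3, 1, 50), rng.normal(-3, 1, 50)])

def neg_log_likelihood(theta):
    dens = 0.5 * norm.pdf(y, theta, 1) + 0.5 * norm.pdf(y, -theta, 1)
    return -np.log(dens).sum()

for start in (-5.0, 5.0):
    fit = minimize(neg_log_likelihood, x0=[start])
    print(f"start {start:+.1f} -> mode at {fit.x[0]:+.3f}")
```

Each run reports a "maximum likelihood" value, but which one you get depends on the initialization, and none of the usual coverage or bias guarantees apply.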

In a Bayesian analysis the observational model is complemented with a prior model to give a joint distribution over the data and parameter space. When that joint distribution is conditioned on the observed data we get a posterior distribution. We then quantify inference as expectation values with respect to that posterior distribution.
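In a conjugate example the whole workflow is explicit; a minimal sketch, with a beta-binomial model and numbers chosen only for illustration:

```python
# Sketch of the Bayesian workflow on a conjugate example: a binomial
# observational model with a beta prior yields a beta posterior in
# closed form, and inferences are posterior expectation values.
from scipy.stats import beta

a_prior, b_prior = 2, 2     # prior model over theta
successes, trials = 7, 20   # observed data

# conditioning the joint distribution on the observed data gives
# Beta(a + y, b + n - y)
posterior = beta(a_prior + successes, b_prior + trials - successes)

print(f"posterior mean of theta:   {posterior.mean():.3f}")
print(f"posterior P(theta < 0.5):  {posterior.cdf(0.5):.3f}")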

In general a posterior distribution comes with no calibration guarantees: we have no idea how the posterior distribution, or posterior expectation values, will behave a priori unless we perform the calibration ourselves.

Multimodality doesn’t prevent us from trying to calibrate our Bayesian model in theory, but in practice it can prevent us from implementing the calibration because we can’t estimate expectation values accurately.
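One way to do that calibration ourselves is by simulation: draw ground truths from the prior, simulate data, and check how often central posterior intervals cover the truth. A sketch on the same conjugate beta-binomial model, where the posterior is exact; with a multimodal posterior the very same check would require accurate expectation values, which is precisely what breaks down in practice:

```python
# Sketch of calibrating a Bayesian model by simulation: for a
# well-specified model, central posterior intervals should cover the
# prior-drawn truth at their nominal rate. Model and numbers are
# illustrative.
import numpy as np
from scipy.stats import beta

rng = np.random.default_rng(3)
a, b, trials, n_reps = 2, 2, 20, 2_000

covered = 0
for _ in range(n_reps):
    theta_true = rng.beta(a, b)                  # draw truth from the prior
    y = rng.binomial(trials, theta_true)         # simulate data
    post = beta(a + y, b + trials - y)           # exact conjugate posterior
    lo, hi = post.ppf(0.05), post.ppf(0.95)      # central 90% interval
    covered += lo <= theta_true <= hi

print(f"90% interval coverage: {covered / n_reps:.3f}")  # should be near 0.90
```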

For much more see https://betanalpha.github.io/assets/case_studies/modeling_and_inference.html.
