Why StudentT(3,0,1) for prior?

Yes. The idea is that the thick tail reflects our uncertainty in the prior scale: if we have underestimated the prior scale, the thick tail makes it easier to detect prior-likelihood conflict. I had good experience with t_3 or t_4 when working a lot with GPs, where a thick tail sometimes really described our prior information for some covariance function parameters well. See O’Hagan (1979), On Outlier Rejection Phenomena in Bayes Inference, JRSSB, 41(3):358-367, for more on the benefits of the thick tail in the case of Student’s t.
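As a rough illustration of why the thick tail helps (a sketch using scipy; the value 5 is just an arbitrary "far from the prior mean" point, not from the original post): the t_3 log density penalizes a value several prior scales away much less than a normal does, so the likelihood can more easily pull the posterior away from a misspecified prior.

```python
from scipy import stats

# Log prior density at a value 5 prior scales away from the mean:
# the normal penalizes it heavily, the t_3 much less, so in a
# prior-likelihood conflict the likelihood wins under the t_3 prior.
normal_lp = stats.norm.logpdf(5.0, loc=0.0, scale=1.0)
t3_lp = stats.t.logpdf(5.0, df=3, loc=0.0, scale=1.0)
print(normal_lp)  # roughly -13.4
print(t3_lp)      # roughly -5.5
```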

A possible complication that I know of (or remember) is computational issues. A heavy-tailed prior combined with weak information from the likelihood (due to weak data or weak identifiability in the parameterization) can lead to a heavy-tailed posterior. For example, the dynamic HMC used in Stan is much better than many other MCMC algorithms at sampling from thick-tailed distributions, but it still has problems, at least in the case of the Cauchy. The nice property that the thick-tailed prior is robust in case of a misspecified scale can also lead to multimodality, which can also cause computational problems. Because of these computational issues I also recommend normals (and half-normals). This is especially recommended when you have good prior information on the scale, or if you know that the result is not going to be sensitive if you set the scale to a much larger value. Using a normal prior changes a bit how to diagnose a misspecified prior scale, but that is not a big issue.
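A quick way to see the computational issue (a scipy sketch, not Stan code; the cutoff of 10 scales is an arbitrary illustration): the half-Cauchy keeps substantial probability mass very far out in the tail, so a heavy-tailed posterior forces the sampler to explore an enormous region, whereas the half-normal's tail mass is negligible.

```python
from scipy import stats

# Probability mass beyond 10 prior scales: the half-Cauchy leaves
# about 6% of its mass out there, while the half-normal leaves
# essentially nothing, which is why Cauchy-tailed posteriors are
# hard to sample even for dynamic HMC.
hc_tail = stats.halfcauchy.sf(10.0)
hn_tail = stats.halfnorm.sf(10.0)
print(hc_tail)  # about 0.063
print(hn_tail)  # astronomically small
```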

And sometimes we use even thicker-tailed distributions than t_3; see the Bayes Sparse Regression case study.
