Suggestion for Prior Choice Recommendations wiki

mike-lawrence · October 25, 2019, 1:48pm

The Prior Choice Recommendations wiki came up in another thread, and this reminded me of something I’ve wanted to discuss here for a while. I often see folks using a peaked-at-zero prior (e.g., normal(0,1)) for variability parameters (standard deviations, variances). This can be useful for imposing shrinkage in hierarchical models, but I’ve even seen peaked-at-zero priors on things like measurement error, where zero is surely a rather incredible value. For example, here’s a trivial model:

data{
    int N ;
    vector[N] Y ; //model assumes data has been scaled to mean=0, sd=1
}
parameters{
    real mu ;
    real<lower=0> sigma ;
}
model{
    mu ~ normal(0,1) ; 
    sigma ~ normal(0,1) ; //arguably unreasonable peaked-at-zero prior!
    Y ~ normal(mu,sigma) ;
}

where sigma is given a peaked-at-zero prior, implying that one thinks it most likely that their measurement was achieved with perfect accuracy. Instead, I’ve been recommending folks use something like:

sigma ~ weibull(2,1) ; //zero-as-incredible prior

Possibly a prior based on gamma() would also work, I am just more familiar with the weibull() distribution.
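
To make the contrast concrete, here is a minimal sketch (not part of the original post) that evaluates the candidate prior densities near zero using only the Python standard library. The half-normal and weibull(2,1) cases mirror the priors above; the gamma shape and rate of 2 are illustrative assumptions, picked only to show a gamma density that also vanishes at zero.

```python
import math

def half_normal_pdf(x, scale=1.0):
    """Density of normal(0, scale) restricted to x >= 0 (what `real<lower=0> sigma` implies)."""
    if x < 0:
        return 0.0
    return math.sqrt(2.0 / math.pi) / scale * math.exp(-0.5 * (x / scale) ** 2)

def weibull_pdf(x, shape=2.0, scale=1.0):
    """Weibull density with Stan's (shape, scale) argument order."""
    if x < 0:
        return 0.0
    z = x / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))

def gamma_pdf(x, shape=2.0, rate=2.0):
    """Gamma density with Stan's (shape, rate) argument order; assumes shape >= 1.
    The default shape=2, rate=2 is an illustrative choice, not a recommendation."""
    if x < 0:
        return 0.0
    return rate ** shape / math.gamma(shape) * x ** (shape - 1) * math.exp(-rate * x)

if __name__ == "__main__":
    print(f"{'sigma':>6} {'half-normal':>12} {'weibull(2,1)':>13} {'gamma(2,2)':>11}")
    for s in (0.0, 0.05, 0.25, 0.5, 1.0, 2.0):
        print(f"{s:6.2f} {half_normal_pdf(s):12.4f} {weibull_pdf(s):13.4f} {gamma_pdf(s):11.4f}")
```

The half-normal density is largest at sigma = 0 (about 0.798), while weibull(2,1) is exactly zero there and peaks near sigma = 0.707. Any gamma with shape greater than 1 also has zero density at sigma = 0.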

I don’t think I see any content on the Prior Choice Recommendations page related to this topic (though maybe the last bullet from this section counts?), so what does everyone think of the idea of adding an explicit mini-section on this topic?

maxbiostat · October 25, 2019, 3:25pm

@Bob_Carpenter used to say something about peaked-at-zero priors. Don’t know if he can point to some material on this. Perhaps this is also of interest.

andrewgelman · October 25, 2019, 9:56pm

We’ve used the gamma (not inverse-gamma) prior for its zero-avoiding properties in marginal posterior mode estimation; see this article: http://www.stat.columbia.edu/~gelman/research/published/chung_etal_Pmetrika2013.pdf. We’ve also used the Wishart (not inverse-Wishart) to avoid degenerate estimates for covariance matrices; see here: http://www.stat.columbia.edu/~gelman/research/published/chung_cov_matrices.pdf

For full Bayes, I don’t see zero-avoidance as necessary from a statistical standpoint, but it can help with computation because it allows us to avoid funnel behavior in posteriors.

Regarding your “peaked at zero” comment: that’s not necessarily the right way of thinking about it. On the scale of log(sigma) there is no longer a peak at zero.
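
A small numerical sketch of the log-scale point (an illustration added here, assuming the half-normal(0,1) prior from the example model): with u = log(sigma), the change of variables gives p(u) = p_sigma(exp(u)) * exp(u). The Jacobian factor exp(u) drives the density to zero as sigma approaches 0, that is, as u goes to minus infinity.

```python
import math

def half_normal_pdf(sigma):
    """Density of normal(0, 1) restricted to sigma >= 0."""
    if sigma < 0:
        return 0.0
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * sigma ** 2)

def log_sigma_pdf(u):
    """Density of u = log(sigma) implied by sigma ~ half-normal(0, 1).
    Change of variables: p_u(u) = p_sigma(exp(u)) * |d sigma / d u| = p_sigma(exp(u)) * exp(u)."""
    sigma = math.exp(u)
    return half_normal_pdf(sigma) * sigma

if __name__ == "__main__":
    print(f"{'log(sigma)':>10} {'sigma':>8} {'density on log scale':>21}")
    for u in (-6.0, -3.0, -1.0, -0.5, 0.0, 0.5, 1.0):
        print(f"{u:10.1f} {math.exp(u):8.4f} {log_sigma_pdf(u):21.4f}")
```

Setting the derivative of log p(u) = const - exp(2u)/2 + u to zero gives exp(2u) = 1, so on this scale the density peaks at log(sigma) = 0 (sigma = 1) and vanishes toward sigma = 0.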

mike-lawrence · October 26, 2019, 2:10pm

@paul.buerkner what is the default prior for the measurement noise term in brms models with family=Gaussian?

paul.buerkner · October 27, 2019, 9:39am

Can you specify exactly which model you have in mind? In any case, the default priors of brms will likely be too wide for me to reasonably recommend them. In fact, I would say this is one of the aspects of brms that requires the most improvement.
