The Stan Users’ Guide (section 11.3, page 140) suggests the following priors for a Gaussian process:

\rho \sim \operatorname{InvGamma}(5,5)

\alpha \sim \operatorname{Normal}(0,1)

\sigma \sim \operatorname{Normal}(0,1)

I have a rough idea of why the priors are as they are, but the reason for the particular argument values of the \operatorname{InvGamma} and \operatorname{Normal} (or really half-normal) distributions isn’t quite clear to me. I’m guessing that the standard deviations of the normal distributions are rough estimates of the expected ranges of \alpha and \sigma, but I’m far less sure of the reasoning behind the arguments of the inverse gamma distribution, and especially of what should change when the expected size of the length scale changes. Would, for example, the first argument of \operatorname{InvGamma} stay the same in order to reflect the shape of the left tail, while the second argument is estimated from a guesstimate of the mean or mode of the distribution?

The \alpha parameter is the marginal standard deviation of the GP and should be given a prior like any other standard deviation parameter: something like a half-normal(0,1), a half-normal(0,5), an exponential with an appropriate mean, or a half-t with 3-7 degrees of freedom.

Same advice for \sigma.

The inverse gamma on \rho comes from the fact that we need to avoid very small length scales (basically, there are model pathologies when the estimated length scale is smaller than the length scale of the data). The parameters of the inverse gamma should depend on the locations of the points.
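To make the "avoid very small length scales" point concrete, here is a small Python sketch (not from the Users’ Guide) that looks at how much prior mass InvGamma(5,5) puts below a few length scales. The function name `inv_gamma_cdf_int_shape` is made up for illustration, and the closed form it uses only holds for an integer first argument. It also checks that the second argument acts as a scale parameter: if \rho \sim \operatorname{InvGamma}(a, b) then c\rho \sim \operatorname{InvGamma}(a, cb), so changing the units of the inputs rescales the second argument and leaves the first alone.

```python
import math

def inv_gamma_cdf_int_shape(x, shape, scale):
    """P(rho <= x) for rho ~ InvGamma(shape, scale), integer shape only.

    If rho ~ InvGamma(a, b) then 1/rho ~ Gamma(a, rate=b), so
    P(rho <= x) = Q(a, b/x), and for integer a
    Q(a, z) = exp(-z) * sum_{k=0}^{a-1} z^k / k!.
    """
    if x <= 0:
        return 0.0
    z = scale / x
    return math.exp(-z) * sum(z**k / math.factorial(k) for k in range(shape))

shape, scale = 5, 5
mean = scale / (shape - 1)  # only defined for shape > 1
mode = scale / (shape + 1)
print(f"mean = {mean:.3f}, mode = {mode:.3f}")

# Prior mass below a few candidate length scales.
for x in (0.25, 0.5, 1.0, 2.0):
    print(f"P(rho < {x}) = {inv_gamma_cdf_int_shape(x, shape, scale):.5f}")

# Scale-parameter check: InvGamma(a, c*b) evaluated at c*x matches
# InvGamma(a, b) evaluated at x.
c = 10.0
assert math.isclose(
    inv_gamma_cdf_int_shape(c * 0.5, shape, c * scale),
    inv_gamma_cdf_int_shape(0.5, shape, scale),
)
```

The left tail dies off very quickly, which is the behaviour wanted here. Whether the place where it dies off is sensible depends on the spacing of your own inputs, so (5,5) is only a reasonable default when the inputs are on roughly unit scale.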

The recommendations are part of a paper that @rtrangucci is writing at the moment (it’s not out yet, unfortunately). Some experiments are in @betanalpha’s GP case studies. Some further theoretical justification for the inverse gamma (and a way of choosing the parameters) is here: https://arxiv.org/abs/1503.00256


In particular, see Section 4 of https://betanalpha.github.io/assets/case_studies/gp_part3/part3.html for a discussion of how to use the algebraic solver in Stan to derive the parameters of the inverse Gamma density corresponding to more-interpretable tail probabilities.
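For anyone who wants to try the tail-probability idea outside Stan, here is a rough standard-library Python sketch. It is not the algebraic-solver code from the case study: it swaps in nested bisection and a hand-rolled series for the regularized incomplete gamma function. The function names, the thresholds 2 and 10, and the 1% tail mass are all illustrative assumptions. With SciPy available, `scipy.stats.invgamma` plus a root finder would probably be the more usual route.

```python
import math

def reg_lower_gamma(a, x):
    """Regularized lower incomplete gamma P(a, x) via its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = a
    for _ in range(100000):
        n += 1.0
        term *= x / n
        total += term
        if term < total * 1e-16:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))

def inv_gamma_sf(x, shape, scale):
    """P(rho > x) for rho ~ InvGamma(shape, scale)."""
    return reg_lower_gamma(shape, scale / x)

def inv_gamma_cdf(x, shape, scale):
    """P(rho <= x) for rho ~ InvGamma(shape, scale)."""
    return 1.0 - inv_gamma_sf(x, shape, scale)

def bisect(f, lo, hi, iters=60):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def scale_for_lower_tail(shape, lower, p):
    """Scale b such that P(rho < lower) = p for the given shape."""
    hi = lower * (shape + 50.0 * math.sqrt(shape) + 50.0)  # generous bracket
    return bisect(lambda b: inv_gamma_cdf(lower, shape, b) - p, lower * 1e-9, hi)

def tune_inv_gamma(lower, upper, p=0.01):
    """Find (shape, scale) with P(rho < lower) = p and P(rho > upper) = p."""
    def upper_tail_gap(shape):
        b = scale_for_lower_tail(shape, lower, p)
        return inv_gamma_sf(upper, shape, b) - p
    shape = bisect(upper_tail_gap, 0.05, 100.0)  # assumed search range
    return shape, scale_for_lower_tail(shape, lower, p)

shape, scale = tune_inv_gamma(lower=2.0, upper=10.0, p=0.01)
print(f"shape = {shape:.4f}, scale = {scale:.4f}")
print(f"P(rho < 2)  = {inv_gamma_cdf(2.0, shape, scale):.4f}")
print(f"P(rho > 10) = {inv_gamma_sf(10.0, shape, scale):.4f}")
```

In practice `lower` would be tied to something like the smallest spacing between inputs and `upper` to something like their overall range, so both arguments of the inverse gamma move when the expected length scale changes.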
