Non-Negative Prior distribution for group-level in brms

Hello,
I have a general question about specifying prior distributions for the sd class (group-level standard deviations).

When specifying a prior distribution, P. Bürkner (1) wrote:

Each group-level effect of each grouping factor has a standard deviation parameter, which is restricted to be non-negative and, by default, has a half Student-t prior with 3 degrees of freedom and a scale parameter that is minimally 10.

I am confused about the part in bold. Now, if I specify a Student-t distribution with the same properties as above:

plot(density(LaplacesDemon::rst(1000, mu = 0, sigma = 10, nu = 3)))

[Image: density plot of the Student-t draws produced by the code above]

The prior distribution is not bounded to positive values, and I don’t understand why, given that posteriors of the sd class are supposed to be positive (am I right?)… But does

[…] which is restricted to be non-negative […].

mean that the prior only covers the right side of the above distribution plot?

I am confused about how this works, and would appreciate any information about the technicalities around this. I am also not sure whether I am interpreting the quote above correctly, so please don’t hesitate to tell me if I am wrong. I have some Bayesian knowledge, but I am far from being an expert.

I have read information on several sites, including Stan’s help page on prior distributions, Gelman’s page on informative priors and other sources, but did not find a satisfactory answer to my question.
Thanks!


Question background:

I work on behavioural traits (vigilance) and fitness (survival) in an ungulate. I investigated the drivers of several vigilance traits (which are count data: y1, y2, y3) through generalized linear models with a negative-binomial distribution using glmmTMB. Each of these models includes year, id (identity of the animal) and observer as random effects, as well as several population-level effects. From that part of my analyses, I saw that the variance of the observer random effect is very small (actually, close to 0). I also use results from these analyses to specify prior information on the population-level effects.

I am now trying to study the correlations at the individual level (group level = id) between these behavioural traits (y1, y2, y3) and survival (0-1 data, Bernoulli, y4). I chose to fit a multivariate model and to estimate the correlations between shared group levels of different formulas. I followed what P. Bürkner wrote (2):

Then, however, specifying group-level effects of the same grouping factor to be correlated across formulas becomes complicated. The solution implemented in brms (and currently unique to it) is to expand the | operator into |< ID >|, where < ID > can be any value. Group-level terms with the same ID will then be modelled as correlated if they share same grouping factor(s).
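For readers who want to see the `|<ID>|` syntax in context, here is a minimal sketch of how such formulas might look for two of the responses. The predictor `x` and the ID label `p` are hypothetical placeholders, and the choice of terms is an assumption based on the description above, not the actual model.

```r
library(brms)

# Hypothetical predictor `x`; y1 (count) and y4 (0/1 survival) follow the post.
# The shared label `p` in (1 | p | id) asks brms to model the id-level
# intercepts of both responses as correlated.
f_y1 <- bf(y1 ~ x + (1 | p | id) + (1 | year) + (1 | observer),
           family = negbinomial())
f_y4 <- bf(y4 ~ x + (1 | p | id), family = bernoulli())

# Combine into one multivariate formula; residual correlations are
# switched off because the responses are not Gaussian.
mv_formula <- f_y1 + f_y4 + set_rescor(FALSE)
```

Only the formula object is built here; nothing is fitted.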

I am now trying to specify a prior distribution for a random effect (the observer effect). I know from earlier analyses that its variance is small, but to be consistent, I still want to include this group-level effect in the formulas of my multivariate analysis. I would like a prior that “helps” the observer group-level effect stay close to zero and improves sampling speed and convergence.
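As a rough illustration of what such a prior could look like in brms, the sketch below targets the observer standard deviation of two responses. The `normal(0, 0.5)` scale is an arbitrary assumption chosen for illustration, not a recommendation, and the response names are taken from the description above.

```r
library(brms)

# Illustrative only: a fairly tight prior on the observer-level SD.
# Because sd parameters are restricted to be non-negative, a prior
# centred at zero acts as a half-normal on the positive side.
observer_prior <- c(
  set_prior("normal(0, 0.5)", class = "sd", group = "observer", resp = "y1"),
  set_prior("normal(0, 0.5)", class = "sd", group = "observer", resp = "y2")
)

print(observer_prior)
```

The resulting object would be passed to the `prior` argument of `brm()` together with the model formula and data.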

References:

(1) Bürkner, P.-C. (2017). brms: An R package for Bayesian multilevel models using Stan. Journal of Statistical Software, 80, 1–28. doi: 10.18637/jss.v080.i01

(2) Bürkner, P.-C. (2018). Advanced Bayesian multilevel modeling with the R package brms. The R Journal, 10, 395–411. doi: 10.32614/RJ-2018-017


I am just a novice user, so please do not treat my answer in the same way you would statements from the experts.
Yes, the prior for the sd class parameter is limited to the right side of the distribution you displayed. You can read about some aspects of this in the Stan Reference Manual. In version 2.21 of the manual, section 7.4 has a subsection on truncated distributions starting on page 60. Also, part 10 of the manual discusses constraint transforms, with 10.2 covering lower-bounded scalar parameters. The transforms let Stan sample on an unconstrained scale, where every real value is allowed, while the parameter itself stays within its bounds.

I hope those sections will get you started on understanding how limits on parameters work.
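A small sketch of the lower-bound idea, written in plain R rather than Stan: an unconstrained value `u` is mapped to a positive `sigma` with `exp()`, and the log density on the unconstrained scale adds the log Jacobian of that map. The function name `log_half_t` and the values of `nu` and `scale` are assumptions made for this illustration.

```r
# Unconstrained values may be any real number.
u <- c(-5, -1, 0, 1, 5)

# Mapping back with exp() guarantees sigma > 0.
sigma <- exp(u)

# Log density of a half Student-t (nu = 3, scale = 10) at s >= 0.
log_half_t <- function(s, nu = 3, scale = 10) {
  log(2) + dt(s / scale, df = nu, log = TRUE) - log(scale)
}

# Log density on the unconstrained scale: prior at sigma plus the
# log Jacobian of sigma = exp(u), which is simply u.
log_dens_u <- log_half_t(sigma) + u

print(data.frame(u = u, sigma = sigma, log_dens_u = log_dens_u))
```

Whatever value `u` takes, `sigma` is positive, which is why the sampler never proposes a negative standard deviation.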


The part I bolded in this section is the important thing to note.

“Half Student-t” means that only the positive half of the distribution is used.
This holds for any SD or scale parameter because, by definition, these parameters cannot take negative values.
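To make this concrete, here is a minimal base-R sketch (no extra packages) that builds half Student-t draws by taking absolute values of ordinary Student-t draws, using the 3 degrees of freedom and scale 10 from the quoted default. The helper name `dhalf_t` is made up for this example.

```r
set.seed(1)
nu <- 3
scale <- 10

# Ordinary Student-t draws, symmetric around 0 ...
t_draws <- scale * rt(1e4, df = nu)
# ... folded onto the positive side give half Student-t draws.
half_t_draws <- abs(t_draws)

# Half Student-t density: twice the Student-t density for x >= 0, zero below.
dhalf_t <- function(x) ifelse(x < 0, 0, 2 * dt(x / scale, df = nu) / scale)

# Compare the simulated draws with the analytical density.
plot(density(half_t_draws, from = 0, to = 100),
     main = "Half Student-t(3, 0, 10)")
curve(dhalf_t(x), from = 0, to = 100, add = TRUE, lty = 2)
```

The density is doubled on the positive side so that it still integrates to 1 after the negative half is dropped.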


Thanks to you both, I was thinking too hard, it seems! It is clear now. :)
