# Determining values of prior distribution parameters

Hello,

I would like to define the values of the distribution parameters in a hierarchical Bayesian model of Reaction times. I am using the brms package.

I don’t understand how to set values for the parameters of the priors. The online guide Prior Choice Recommendations suggests using normal(0, 10), but the range of reaction-time values in this study goes from 200 to 750 ms.

1. How do I formalise this information in the prior, given that I want a very weakly informative prior?
2. If I understood correctly, I need to define priors on the parameters of the outcome variable’s distribution (here, an ex-Gaussian). How do I define the values for the different parameters of the outcome distribution?
3. Another measure is accuracy, for which the model is the same, but I am using a different family (`family = bernoulli(link = "logit")`). How does the prior parameter estimation change in this case?

Thank you very much!

Here is the model I am using:

```r
RTs1.bmodel = brm(RTs ~ conditionStimuli * sequenceTrials + (conditionStimuli * sequenceTrials | Num_part)
, data = data_RTs_go
, family = exgaussian(link = "identity")
, warmup = 500
, iter = 2000
, chains = 3
, inits = "random"
, cores = 2
, seed = 123
, save_pars = save_pars(all = TRUE)
, prior = prior1_rt
, sample_prior = "only"  #for prior predictive checks
)
```
```stan
data {
  int<lower=1> N;  // total number of observations
  vector[N] Y;  // response variable
  int<lower=1> K;  // number of population-level effects
  matrix[N, K] X;  // population-level design matrix
  // data for group-level effects of ID 1
  int<lower=1> N_1;  // number of grouping levels
  int<lower=1> M_1;  // number of coefficients per level
  int<lower=1> J_1[N];  // grouping indicator per observation
  // group-level predictor values
  vector[N] Z_1_1;
  vector[N] Z_1_2;
  vector[N] Z_1_3;
  vector[N] Z_1_4;
  vector[N] Z_1_5;
  vector[N] Z_1_6;
  int<lower=1> NC_1;  // number of group-level correlations
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
  int Kc = K - 1;
  matrix[N, Kc] Xc;  // centered version of X without an intercept
  vector[Kc] means_X;  // column means of X before centering
  for (i in 2:K) {
    means_X[i - 1] = mean(X[, i]);
    Xc[, i - 1] = X[, i] - means_X[i - 1];
  }
}
parameters {
  vector<lower=250,upper=750>[Kc] b;  // population-level effects
  real Intercept;  // temporary intercept for centered predictors
  real<lower=0> sigma;  // residual SD
  real<lower=0> beta;  // scale parameter
  vector<lower=0>[M_1] sd_1;  // group-level standard deviations
  matrix[M_1, N_1] z_1;  // standardized group-level effects
  cholesky_factor_corr[M_1] L_1;  // cholesky factor of correlation matrix
}
transformed parameters {
  matrix[N_1, M_1] r_1;  // actual group-level effects
  // using vectors speeds up indexing in loops
  vector[N_1] r_1_1;
  vector[N_1] r_1_2;
  vector[N_1] r_1_3;
  vector[N_1] r_1_4;
  vector[N_1] r_1_5;
  vector[N_1] r_1_6;
  // compute actual group-level effects
  r_1 = scale_r_cor(z_1, sd_1, L_1);
  r_1_1 = r_1[, 1];
  r_1_2 = r_1[, 2];
  r_1_3 = r_1[, 3];
  r_1_4 = r_1[, 4];
  r_1_5 = r_1[, 5];
  r_1_6 = r_1[, 6];
}
model {
  // likelihood including constants
  if (!prior_only) {
    // initialize linear predictor term
    vector[N] mu = Intercept + Xc * b;
    for (n in 1:N) {
      // add more terms to the linear predictor
      mu[n] += r_1_1[J_1[n]] * Z_1_1[n] + r_1_2[J_1[n]] * Z_1_2[n] + r_1_3[J_1[n]] * Z_1_3[n] + r_1_4[J_1[n]] * Z_1_4[n] + r_1_5[J_1[n]] * Z_1_5[n] + r_1_6[J_1[n]] * Z_1_6[n];
    }
    target += exp_mod_normal_lpdf(Y | mu - beta, sigma, inv(beta));
  }
  // priors including constants
  target += normal_lpdf(b | 0, 3)
    - 5 * log_diff_exp(normal_lcdf(750 | 0, 3), normal_lcdf(250 | 0, 3));
  target += normal_lpdf(Intercept | 0, 3);
  target += student_t_lpdf(sigma | 3, 0, 118.6)
    - 1 * student_t_lccdf(0 | 3, 0, 118.6);
  target += gamma_lpdf(beta | 1, 0.1);
  target += normal_lpdf(sd_1 | 0, 3)
    - 6 * normal_lccdf(0 | 0, 3);
  target += std_normal_lpdf(to_vector(z_1));
  target += lkj_corr_cholesky_lpdf(L_1 | 1);
}
generated quantities {
  // actual population-level intercept
  real b_Intercept = Intercept - dot_product(means_X, b);
  // compute group-level correlations
  corr_matrix[M_1] Cor_1 = multiply_lower_tri_self_transpose(L_1);
  vector<lower=-1,upper=1>[NC_1] cor_1;
  // extract upper diagonal of correlation matrix
  for (k in 1:M_1) {
    for (j in 1:(k - 1)) {
      cor_1[choose(k - 1, 2) + j] = Cor_1[j, k];
    }
  }
}
```

Here is an example of the data:

```
Num_part trial_type  Go_type conditionStimuli ITI_ms response RTs correctResponse order_pres sequenceTrials sdt
2        1         Go     Bent           leaves    819        1 301               1          1            NGG   1
3        1         Go     Bent           leaves    771        1 237               1          1             GG   1
4        1         Go     Bent           leaves   1086        1 393               1          1             GG   1
5        1         Go Straight           leaves    652        1 331               1          1             GG   1
7        1         Go     Bent           leaves    919        1 372               1          1            NGG   1
9        1         Go Straight           leaves    802        1 359               1          1            NGG   1
```

The best thing to do here is to generate prior predictive data and iterate on the priors until the prior predictive data starts to match up with what your domain expertise suggests. This procedure is generally called a prior predictive check (PPC).
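A minimal sketch of that iterate-and-check loop in brms, reusing the formula, data, and prior objects from the question (the `ndraws` argument of `pp_check` is assumed to be available in your brms version; older releases call it `nsamples`):

```r
library(brms)

# Fit the model with the likelihood switched off, so draws come from the priors only.
prior_fit <- brm(
  RTs ~ conditionStimuli * sequenceTrials + (conditionStimuli * sequenceTrials | Num_part),
  data = data_RTs_go,
  family = exgaussian(),
  prior = prior1_rt,
  sample_prior = "only"
)

# Plot simulated RT distributions and compare them with domain expertise
# (e.g. essentially all mass between ~200 and ~750 ms, no negative values).
pp_check(prior_fit, ndraws = 100)

# If the simulated data look implausible, adjust prior1_rt and refit.
```

Each pass through this loop tightens (or loosens) the priors until the simulated data respect what you already know about plausible RTs.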

I’m pretty familiar with RT data, but even the single added complexity of your exgaussian likelihood would leave me feeling that I might make mistakes if I tried to set the priors without doing PPCs.

The one caveat on the above is that you’ll want to take care to modify the `init` values to scale with your prior parameter values. For example, if you had a simple normal link and were trying to do a PPC with a prior on the intercept of `Intercept ~ normal(400,100)`, you’d want the inits to be something like `init=800`. If you stuck with the default inits, which are `init = 2`, the sampler would start in a range that is very far from the typical set implied by your prior and could have trouble getting to the typical set during adaptation.
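For example, a sketch of scaling the inits to the prior, using the `normal(400, 100)` intercept example above (the argument is `inits` in older brms releases and `init` in newer ones, so adjust to your version; parameter names must match those in the generated Stan code):

```r
library(brms)

# Start each chain near the prior's typical set instead of the
# default random inits drawn from (-2, 2).
init_fun <- function() list(Intercept = 800)

fit <- brm(
  RTs ~ conditionStimuli * sequenceTrials + (conditionStimuli * sequenceTrials | Num_part),
  data = data_RTs_go,
  family = exgaussian(),
  prior = prior1_rt,
  chains = 3,
  inits = list(init_fun(), init_fun(), init_fun())  # one named list per chain
)
```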


Thank you mike-lawrence!

I planned to run a PPC, but I am not sure how to run it. What I understood so far is that the model has to include `sample_prior = "only"`; after running the model, I check the simulated data (visually, via `conditional_effects` or plots). Is there a more appropriate procedure for running a PPC?

Also, how do I modify the parameters’ values to reflect my prior expectations? For example, I know that the range of RT values is between 250 and 750 ms: do I simply use these values as the parameters of, say, a uniform distribution, or do they need to be transformed/adapted? I don’t understand the meaning of the parameter values used in many publications…
EDIT: if I set a prior `set_prior("uniform(200,750)", class = "b", lb = 200, ub = 750)` I still get impossible values (negative RTs). To which class should I assign a prior to avoid these impossible values?

Regarding the caveat, I see only now that it is an important point (I found this link too: Don’t forget your inits | A. Solomon Kurz).
Would you suggest a general rule to set the inits?

Are there other aspects I need to improve regarding the model?

Thank you!
