I’m still pretty new to brms and Stan (and Bayesian inference generally), but I keep running into the same two problems in different contexts. For background, I’ll stick to the analysis I’m currently working on.
I am trying to estimate the three parameters of an exponentially modified gaussian distribution (exgaussian) for reaction times, based on condition (three levels) and group. The model also has individual intercepts and slopes (for condition) for each of the parameters. The results so far suggest to me that this is a pretty good model for the data, in that the distribution of data simulated from the posterior matches well with the observed distributions (per condition/participant), which is essentially what I’m interested in here.
Reaction times are in seconds (I’ve tried dividing them by the standard deviation but that does not seem to make any difference for these problems) and therefore largely fall between 0 and 1 but with a significant fraction of large outliers (max ~7s).
My approach to setting priors so far is to set priors for the intercept based on guesstimates of where I’m 99% sure the value has to lie. I’m using a student_t(3, …) to allow as much as feasible for the possibility that I’m completely wrong. For the fixed effects I’m using priors centered on 0 because I don’t want to encode the direction of these effects in the priors, and then setting the scale to half of that for the intercept priors. For the ‘random effects’ I am using an exponential prior with a high rate to shrink the sd towards 0, but I’ve relaxed the prior for the beta parameter because posterior checks suggested that it was too strong.
So here’s the model
modex7.bf <- bf(RT ~ cond * group + (1 + cond|a|id),
sigma ~ cond * group + (1 + cond|a|id),
beta ~ cond * group + (1 + cond|a|id),
family = exgaussian(link = 'identity', link_sigma = 'log', link_beta = 'log'))
get_prior(modex7.bf, data = posRT)
modex7.priors <- set_prior('student_t(3,0.5,0.08)', class = 'Intercept') +
# 99%: 0 to 1
set_prior('student_t(3, -2.5, 0.4)', class = 'Intercept', dpar = 'sigma') +
#99%: -5 to 1
set_prior('student_t(3, 2.7, 0.4)', class = 'Intercept', dpar = 'beta') +
# 99%: 1 to 200* (log = 0 to 5.3)
# *at a rate of 200, the exponential contribution to the exgauss is negligible
set_prior('student_t(3, 0, 0.04)', class = 'b')
set_prior('student_t(3, 0, 0.2)', class = 'b', dpar = 'sigma') +
set_prior('student_t(3, 0, 0.2)', class = 'b', dpar = 'beta') +
set_prior('exponential(25)', class = 'sd',) +
set_prior('exponential(25)', class = 'sd', dpar = 'sigma') +
set_prior('exponential(10)', class = 'sd', dpar = 'beta') # relaxed this prior a bit
modex7.fit <- brm(modex7.bf, data = posRT, sample_prior = 'yes', prior = modex7.priors, iter = 4000, cores = 4, seed = mustard)
So my problems:
- I get this warning/error from Stan (a lot) while sampling:
Chain 4: Rejecting initial value:
Chain 4: Log probability evaluates to log(0), i.e. negative infinity.
Chain 4: Stan can't start sampling from this initial value.
I understand that this is because the starting values that Stan is using are far from posterior mass. However, the documentation for brm suggests that you shouldn’t manually set starting values (“Alternatively, inits
can be a list of lists containing the initial values, or a function (or function name) generating initial values. The latter options are mainly implemented for internal testing.”) and 0 doesn’t always make sense as a starting value for the parameter either (e.g., sd of a guassian with identity link).
If I understand correctly, in the absence of initial values, Stan generates them on [-2, 2], but in some cases I really have to pull some strange tricks to make all or even most values in the range at all plausible (e.g., the sigma parameter in the above example).
- I can’t figure out how to set a prior for fixed effects that I believe could plausibly go in both directions (i.e., increase and decrease the parameter) when the parameter itself is bounded (e.g., if sigma in the model above had an identity link instead of a log link).
I strongly suspect that both of these problems can (easily?) be fixed by coding the models directly in Stan, or by using the ‘stanvars’ option in brms, i.e., through some form of reparametrization. However, that would require me to learn Stan syntax, something that is pretty high on my to-do list but not high enough that it provides a solution for this particular case. Are their any alternatives?
Please also provide the following information in addition to your question:
- Operating System: Window 10 Enterprise
- brms Version: 2.12.0