Fair enough. I may be trying to be too precise in that regard. I could go with it as is, since visually the distribution of the simulated datasets is roughly what one would expect for the intercepts.
I do have another question related to setting the priors on the regression coefficients (betas) for this model:
I have a group predictor, as well as continuous predictors that were log-transformed, scaled and centered. I had thought to set regularizing priors of N(0,1) for the group predictor. I was unsure how to choose the prior for the continuous predictors, so I generated a fake dataset of ordinal and continuous variables with varying degrees of correlation. I then fit a simple model of the relationship between the two, with default priors, to get a rough idea of the size of the estimates this fake data produces. This leads me to think I could also use N(0,1) for the continuous predictors. However, my prior predictive plot is rather U-shaped, which I did not expect. Am I approaching the priors for the continuous predictors correctly, or is there another way I should consider this?
Related code for reference:
library(brms)
library(dplyr)  # tibble(), mutate(), %>%
library(SBC)
# set contrasts (sum-coding)
contrasts(df$group) = contr.sum(2)
# develop formula
bf.form = brms::bf(rating ~ group + cont1 + cont2 +
                     group:cont1 + group:cont2)
# list the parameters that can take priors
get_prior(bf.form, data = df, family = cumulative('probit'))
# threshold locations implied by equal proportions across the 7 rating categories
tibble(rating = 1:7) %>%
  mutate(proportion = 1/7) %>%
  mutate(cumulative_proportion = cumsum(proportion)) %>%
  mutate(right_hand_threshold = qnorm(cumulative_proportion))
# priors: thresholds centered on the values above, N(0, 1) on all coefficients
priors = c(
  prior(normal(-1.07, 1), class = Intercept, coef = 1),
  prior(normal(-0.57, 1), class = Intercept, coef = 2),
  prior(normal(-0.18, 1), class = Intercept, coef = 3),
  prior(normal(0.18, 1), class = Intercept, coef = 4),
  prior(normal(0.57, 1), class = Intercept, coef = 5),
  prior(normal(1.07, 1), class = Intercept, coef = 6),
  prior(normal(0, 1), class = b)
)
# simulate data
generator <- SBC_generator_brms(bf.form, data = df,
                                family = cumulative('probit'), init = 0.1, prior = priors,
                                thin = 50, warmup = 10000, refresh = 2000,
                                # generating the log density is useful but a bit
                                # computationally expensive, so it is turned off
                                generate_lp = FALSE)
datasets <- generate_datasets(generator, 100)
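As a rough cross-check that does not depend on brms or SBC, the same kind of prior predictive simulation can be sketched in base R. This is only an approximation of the model above, under several assumptions that are not from the original setup: the predictors are hypothetical (a balanced sum-coded group and two standard-normal covariates), brms's internal centering of the design matrix is ignored, and the ordered thresholds are drawn by rejection sampling from independent normals. The names `simulate_once`, `prior_pred`, `n_sims` and `n_obs` are made up for illustration.

```r
set.seed(123)

n_sims <- 1000  # number of prior draws (arbitrary choice)
n_obs  <- 200   # observations per simulated dataset (arbitrary choice)

# Prior means of the 6 thresholds: equal proportions across 7 categories
thr_mean <- qnorm((1:6) / 7)

# Hypothetical predictors (assumption): sum-coded group, standardized covariates
group <- rep(c(1, -1), length.out = n_obs)
cont1 <- rnorm(n_obs)
cont2 <- rnorm(n_obs)
X <- cbind(group, cont1, cont2, group * cont1, group * cont2)

# Draw thresholds from independent N(thr_mean, 1), keeping only ordered draws
draw_thresholds <- function() {
  repeat {
    thr <- rnorm(6, mean = thr_mean, sd = 1)
    if (!is.unsorted(thr)) return(thr)
  }
}

# One prior predictive dataset, summarised as category proportions
simulate_once <- function(sd_b) {
  thr    <- draw_thresholds()
  beta   <- rnorm(ncol(X), mean = 0, sd = sd_b)  # N(0, sd_b) on every coefficient
  z      <- as.vector(X %*% beta) + rnorm(n_obs) # latent probit variable
  rating <- findInterval(z, thr) + 1             # categories 1..7
  tabulate(rating, nbins = 7) / n_obs
}

# Average category proportions over many prior draws
prior_pred <- function(sd_b) {
  rowMeans(replicate(n_sims, simulate_once(sd_b)))
}

pp_wide   <- prior_pred(1)    # N(0, 1) on the coefficients, as in the post
pp_narrow <- prior_pred(0.5)  # a tighter prior, for comparison only
round(rbind(pp_wide, pp_narrow), 3)
```

Comparing the two rows shows how much of the mass in categories 1 and 7 comes from the prior spread of the linear predictor: with five coefficients each given N(0,1), the latent variable varies far more than the spacing between thresholds, which tends to push simulated ratings toward the extremes. The 0.5 scale is just a comparison point, not a recommendation.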
This is the prior predictive plot of the datasets simulated with generate_datasets() under the priors above: