Hello everyone,
I have been fitting finite mixture models in brms, and I have one remaining question about setting sensible priors and how they are implemented. I've done my best to read the documentation, inspect the output of make_stancode() for my models, and search for other relevant posts, but I'm still not sure that I'm doing this correctly.
In most of the materials I've learned from online, a prior is set on the mixing proportions on the probability scale, using something like beta(4, 4). When I set a prior like this, my models fit properly and I don't have issues with non-converging chains. However, I notice that the estimates for parameters such as theta2 are not bounded between 0 and 1, so I assume this prior is being transformed to the log-odds scale somewhere.
This would make sense to me, except that the default prior brms implements is logistic(0, 1), which is already on the log-odds scale. Since my actual models take a long time to fit, I've put together an example using the iris data set in which I predict not only an outcome variable but also the mixing proportion of the second distribution (theta2). As with my actual models, the beta(5, 5) prior leads to a well-fitting model, but the logistic(0, 0.35) prior, which encodes similar information on the log-odds scale, leads to poorly mixed chains and other problems.
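To check that the two priors really do encode similar information, here is a quick simulation on top of the example below (just a sketch, not part of the models themselves):

```r
# Quick sanity check: simulate from both priors and compare them on the
# probability scale. plogis() is the inverse logit, mapping log-odds draws
# back into (0, 1).
set.seed(1)
p_beta     <- rbeta(1e5, 5, 5)
p_logistic <- plogis(rlogis(1e5, location = 0, scale = 0.35))

# Both are centred on 0.5 with similar 95% intervals, so the information
# content looks comparable; what differs is the scale on which each prior
# is applied.
quantile(p_beta,     c(0.025, 0.5, 0.975))
quantile(p_logistic, c(0.025, 0.5, 0.975))
```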
My questions:
- Is it appropriate to use a beta distribution to set priors on mixing proportions, as I've done below for the beta_prior_model?
- Is the theta2_Intercept in the model summary on the log-odds scale?
- If I want to use the posterior of theta2_Intercept as an informative prior for new data, should I transform it back to the probability scale and/or approximate that posterior with another beta distribution?
Please let me know if I haven’t explained things well, and thanks for your time if you’ve made it this far!
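For the third question, this is the kind of transformation I have in mind (a sketch; I'm assuming the draws column is named b_theta2_Intercept, and the method-of-moments beta approximation is my own idea):

```r
# Sketch: map posterior draws of the theta2 intercept back to the
# probability scale, then approximate them with a beta distribution
# by matching the first two moments (method of moments).
draws <- as_draws_df(beta_prior_model)        # requires the model fit below
p     <- plogis(draws$b_theta2_Intercept)     # log-odds -> probability

m <- mean(p)
v <- var(p)
common <- m * (1 - m) / v - 1                 # shared factor in both shapes
c(shape1 = m * common, shape2 = (1 - m) * common)
```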
library(brms)
iris$Petal_length <- as.numeric(scale(iris$Petal.Length))
iris$Petal_width <- as.numeric(scale(iris$Petal.Width))
iris$Species <- as.factor(iris$Species)
# Setting up a mixture
two_normal_mixture <- mixture(gaussian(), gaussian(), order = TRUE)
# Setting up a model formula
model_formula <- bf(Petal_length ~ Petal_width,
                    theta2 ~ Species)
# Checking priors
get_prior(model_formula,
          data = iris,
          family = two_normal_mixture)
beta_prior_model <- brm(model_formula,
                        data = iris,
                        family = two_normal_mixture,
                        prior = c(
                          prior(normal(-1, 0.4), class = "Intercept", dpar = "mu1"),
                          prior(normal(1, 0.4), class = "Intercept", dpar = "mu2"),
                          prior(beta(5, 5), class = "Intercept", dpar = "theta2"),
                          prior(normal(0, 0.5), class = "b", dpar = "mu1"),
                          prior(normal(0, 0.5), class = "b", dpar = "mu2"),
                          prior(normal(0, 0.5), class = "b", dpar = "theta2"),
                          prior(normal(0, 1), class = "sigma1"),
                          prior(normal(0, 1), class = "sigma2")),
                        cores = 4,
                        warmup = 5000,
                        iter = 10000,
                        control = list(adapt_delta = 0.95),
                        backend = "cmdstanr")
summary(beta_prior_model)
Family: mixture(gaussian, gaussian)
Links: mu1 = identity; sigma1 = identity; mu2 = identity; sigma2 = identity; theta1 = identity; theta2 = identity
Formula: Petal_length ~ Petal_width
theta2 ~ Species
Data: iris (Number of observations: 150)
Draws: 4 chains, each with iter = 5000; warmup = 0; thin = 1;
total post-warmup draws = 20000
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
mu1_Intercept -0.16 0.03 -0.22 -0.10 1.00 9347 7230
mu2_Intercept 0.16 0.03 0.11 0.21 1.00 21235 15456
theta2_Intercept 0.39 0.30 -0.19 0.98 1.00 11268 7346
mu1_Petal_width 0.88 0.03 0.83 0.92 1.00 13547 8542
mu2_Petal_width 1.15 0.03 1.09 1.21 1.00 17965 13337
theta2_Speciesversicolor 0.89 0.46 -0.02 1.77 1.00 11413 8607
theta2_Speciesvirginica -0.71 0.39 -1.48 0.06 1.00 14674 9848
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma1 0.17 0.02 0.14 0.22 1.00 18814 12725
sigma2 0.18 0.02 0.15 0.22 1.00 18822 14294
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
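If I understand the parameterisation correctly, theta1 is the reference component with its linear predictor fixed at 0, so with two components the softmax reduces to the inverse logit, and the summary above can be turned back into proportions like this (again, my assumption):

```r
# Sketch: converting the theta2 estimates above back into mixing proportions.
# Assumes theta1 is the reference with linear predictor 0, so the softmax
# over two components reduces to plogis() (the inverse logit).
est <- c(Intercept = 0.39, versicolor = 0.89, virginica = -0.71)

plogis(est[["Intercept"]])                         # setosa (reference level)
plogis(est[["Intercept"]] + est[["versicolor"]])   # versicolor
plogis(est[["Intercept"]] + est[["virginica"]])    # virginica
```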
logistic_prior_model <- brm(model_formula,
                            data = iris,
                            family = two_normal_mixture,
                            prior = c(
                              prior(normal(-1, 0.4), class = "Intercept", dpar = "mu1"),
                              prior(normal(1, 0.4), class = "Intercept", dpar = "mu2"),
                              prior(logistic(0, 0.35), class = "Intercept", dpar = "theta2"),
                              prior(normal(0, 0.5), class = "b", dpar = "mu1"),
                              prior(normal(0, 0.5), class = "b", dpar = "mu2"),
                              prior(normal(0, 0.5), class = "b", dpar = "theta2"),
                              prior(normal(0, 1), class = "sigma1"),
                              prior(normal(0, 1), class = "sigma2")),
                            cores = 4,
                            warmup = 5000,
                            iter = 10000,
                            control = list(adapt_delta = 0.95),
                            backend = "cmdstanr")
summary(logistic_prior_model)
Family: mixture(gaussian, gaussian)
Links: mu1 = identity; sigma1 = identity; mu2 = identity; sigma2 = identity; theta1 = identity; theta2 = identity
Formula: Petal_length ~ Petal_width
theta2 ~ Species
Data: iris (Number of observations: 150)
Draws: 4 chains, each with iter = 5000; warmup = 0; thin = 1;
total post-warmup draws = 20000
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
mu1_Intercept -0.09 0.12 -0.21 0.15 1.53 7 NA
mu2_Intercept 0.23 0.13 0.12 0.65 1.44 8 NA
theta2_Intercept -0.38 0.81 -2.00 0.82 1.53 7 NA
mu1_Petal_width 0.94 0.10 0.84 1.15 1.53 7 NA
mu2_Petal_width 0.97 0.32 0.25 1.22 1.53 7 NA
theta2_Speciesversicolor 0.74 0.60 -0.63 1.75 1.22 13 30
theta2_Speciesvirginica 0.06 1.10 -1.31 2.35 1.53 7 NA
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma1 0.18 0.02 0.14 0.22 1.05 57 2982
sigma2 0.18 0.03 0.14 0.26 1.12 6459 35
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).