Categorical modeling and priors for interaction terms

Please also provide the following information in addition to your question:

  • Operating System: Windows 10
  • brms Version: 2.12

I am new to brms and working through an example to teach myself its capabilities for categorical outcomes, particularly in repeated-trial experiments. I am starting with a simple example: a 3-category response and several conditional effects with interactions. The model is complex, so I am trying to teach myself brms with realistic expectations.

Using the default priors, I estimated the following model:

Model1 <- brm(Choice ~ Randomization + Phase + Session + Session:Phase + (1 | ID),
              data = temp1, family = categorical())

I am interested in understanding the interaction and in testing different priors. Does anyone have advice on reasonable priors for interactions in these types of models, and on what model checking should be done? All of my predictors are categorical (2 levels each).

Typical model checking appeared reasonable to me under the default priors.

The posterior predictive check was also consistent with a good model.

Hi,

I would recommend setting some priors yourself (start with N(0,1)) and then seeing what happens when you change them. Run with sample_prior = "only" and use pp_check(fit, type = "bars", nsamples = 100), then compare the plots. If you don't have any prior knowledge you'd like to use, you want the result to look nearly uniform on the outcome scale. McElreath discusses this at length in his book (Ch. 8 in the 2nd edition, I believe), using both categorical and continuous interactions. In short, always plot your interactions.
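For example, a minimal sketch, assuming the Model1 formula and temp1 data from your post:

# Minimal sketch: sample_prior = "only" ignores the likelihood, so pp_check
# shows what the priors alone imply on the outcome scale.
fit_prior <- brm(Choice ~ Randomization + Phase + Session + Session:Phase + (1 | ID),
                 data = temp1, family = categorical(),
                 prior = prior(normal(0, 1), class = "b"), # the N(0,1) starting point
                 sample_prior = "only")
pp_check(fit_prior, type = "bars", nsamples = 100)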

@torkar Thank you very much for the suggestion. The modeling tips and the chapter in McElreath’s book are helpful. I am wondering if I should be concerned about the following warning that comes up when I try to specify priors:

Specifying global priors for regression coefficients in categorical models is deprecated and may not work as expected.

The results are very different from the default-prior model, and when I compare the two, I am surprised by how different the estimates are.

Using the following priors

priors <- c(prior(student_t(3, 1, 10), class = "Intercept"),
            prior(normal(0, 0.1), class = "b"))

I tried the same model with

prior(normal(0, 1), class = "b")

and the results were similar. The discrepancy appears to be in the intercept estimates.


[Figure: posterior summaries under the default priors]

[Figure: posterior summaries with the informative prior on "b"]

I left the default priors on "sd", and the prior on the "Intercept" is the same as the default.

Hi,

use p <- get_prior(y ~ 1 + foo, data = df, family = categorical()) to get a list of all priors. Then you can set new priors explicitly using, e.g., p$prior[1] <- "normal(0,1)".

brms is picky when it comes to priors, since Paul really wants to make sure that you understand what you are doing. So you should listen to the warning :)

Change all the default priors and play around so you get to know how this works, then start setting some regularizing priors to see what it looks like on the outcome scale.
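For example, roughly like this (a sketch, reusing the formula and data from the first post; the normal(0, 1) value is just a starting point):

# List every parameter and its current default prior; the brmsprior data
# frame has one row per parameter, with its class, coef, and dpar (mu2, mu3, ...).
p <- get_prior(Choice ~ Randomization + Phase + Session + Session:Phase + (1 | ID),
               data = temp1, family = categorical())

# Set a regularizing prior on the population-level coefficients,
# separately per distributional parameter, instead of one global prior:
p$prior[p$class == "b" & p$coef == ""] <- "normal(0, 1)"

# Refit on the prior only and inspect the outcome scale:
fit <- brm(Choice ~ Randomization + Phase + Session + Session:Phase + (1 | ID),
           data = temp1, family = categorical(),
           prior = p, sample_prior = "only")
pp_check(fit, type = "bars", nsamples = 100)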


I ran into this issue recently and agree with @torkar’s advice. You can find a couple of examples of how I handled it at https://bookdown.org/content/3890/counting-and-classification.html#multinomial.


@Solomon and @torkar, this is all very helpful. I see that I need to set a specific prior for each unique parameter, and the differences from model to model (prior to prior) are making a lot more sense. I am also realizing that I hadn’t intuited that the sample_prior = "only" specification skips the likelihood. It is all making a lot more sense to me now; thank you both!
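For example, parameter-specific priors for my 3-category Choice outcome would look something like this (a sketch; category 1 is the reference, so the coefficients live on mu2 and mu3, and the normal(0, 1) values are just illustrative):

# One prior per distributional parameter, rather than a single global "b" prior
priors <- c(prior(normal(0, 1), class = "b", dpar = "mu2"),
            prior(normal(0, 1), class = "b", dpar = "mu3"))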


I have a quick follow-up question about this practice that I hope is easy to answer. If I have a variable that is coded as an integer, for example:

library(tidyverse)
library(brms)
library(bayesplot)

df <- setNames(data.frame(matrix(ncol = 2, nrow = 150)), c("Choice", "Choice0"))
df <- df %>% mutate(Choice = rep(c(1, 2, 3, 1, 1, 3), 25), Choice0 = rep(c(0, 1, 2, 0, 0, 2), 25))

Then I model in brms with the categorical family, intercept only:

prior1 <- c(prior(normal(0, 5), class = "Intercept", dpar = "mu2"), # normal priors sample a wider range than the default student-t parameters
            prior(normal(0, 5), class = "Intercept", dpar = "mu3"))
Example1 <- brm(Choice ~ 1,
                data = df,
                family = categorical(link = "logit"),
                prior = prior1, # priors defined above
                sample_prior = "only", # skips the likelihood when sampling
                seed = 123)
pp_check(Example1, type = "bars", nsamples = 100)

prior2 <- c(prior(normal(0, 5), class = "Intercept", dpar = "mu1"),
            prior(normal(0, 5), class = "Intercept", dpar = "mu2"))
Example2 <- brm(Choice0 ~ 1,
                data = df,
                family = categorical(link = "logit"),
                prior = prior2, # priors defined above
                sample_prior = "only", # skips the likelihood when sampling
                seed = 123)
pp_check(Example2, type = "bars", nsamples = 100)

The two figures from pp_check are below.

My question is: why do the two posterior checks result in different point estimates? Is it because the intercept parameters differ between the models? I also noticed that the figure’s x-axis defaults to 1, 2, 3 regardless of how the DV is coded in the model.

Example1 = mu2, mu3

Example2 = mu1, mu2

The model results yield equivalent point estimates, as expected:

summary(Example1)

 Family: categorical
  Links: mu2 = logit; mu3 = logit
Formula: Choice ~ 1
   Data: df (Number of observations: 150)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Population-Level Effects:
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
mu2_Intercept     0.09      5.05    -9.93     9.89 1.00     3114     2648
mu3_Intercept     0.04      5.01   -10.13     9.69 1.00     3092     2683

summary(Example2)

 Family: categorical
  Links: mu1 = logit; mu2 = logit
Formula: Choice0 ~ 1
   Data: df (Number of observations: 150)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Population-Level Effects:
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
mu1_Intercept     0.09      5.05    -9.93     9.89 1.00     3114     2648
mu2_Intercept     0.04      5.01   -10.13     9.69 1.00     3092     2683

Interesting. You’ve got me stumped.