Categorical modeling and priors for interaction terms

Please also provide the following information in addition to your question:

  • Operating System: Windows 10
  • brms Version: 2.12

I am new to brms and working through an example to teach myself its capabilities for categorical outcomes, particularly in repeated-trial experiments. I am starting with a simple example: a 3-category response and several conditional effects with interactions. The model is complex, so I am trying to teach myself brms with realistic expectations.

Using the default priors, I estimated the following model:

Model1 <- brm(Choice ~ Randomization + Phase + Session + Session:Phase + (1 | ID),
              data = temp1, family = categorical())

I am interested in understanding the interaction and in testing different priors. Does anyone have advice on reasonable priors for interactions in these types of models, and on what model checking should be done? All of my predictors are categorical (2 levels each).

Typical model checking appeared reasonable to me under the default priors.

The posterior predictive check was also consistent with a good model.

Hi,

I would recommend setting some priors yourself (start with N(0,1)) and then seeing what happens when you change them. Run with sample_prior = "only" and use pp_check(fit, type = "bars", nsamples = 100), then compare the plots. If you don't have any prior knowledge you'd like to use, you want the result to look nearly uniform on the outcome scale. McElreath discusses this at length in his book (Ch. 8 in the 2nd edition, I believe), using both categorical and continuous interactions. In short, always plot your interactions.
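For example, a minimal sketch, assuming the Model1 formula and temp1 data from your post:

# Minimal sketch: sample_prior = "only" ignores the likelihood, so pp_check
# shows what the priors alone imply on the outcome scale.
fit_prior <- brm(Choice ~ Randomization + Phase + Session + Session:Phase + (1 | ID),
                 data = temp1, family = categorical(),
                 prior = prior(normal(0, 1), class = "b"), # the N(0,1) starting point
                 sample_prior = "only")
pp_check(fit_prior, type = "bars", nsamples = 100)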

@torkar Thank you very much for the suggestion. The modeling tips and the chapter in McElreath’s book are helpful. I am wondering if I should be concerned about the following warning that comes up when I try to specify priors:

Specifying global priors for regression coefficients in categorical models is deprecated and may not work as expected.

The results are very different from the default-prior model, and when I compare the two, I am surprised by how different the estimates are.

Using the following priors

priors <- c(prior(student_t(3, 1, 10), class = "Intercept"),
            prior(normal(0, 0.1), class = "b"))

I tried the same model with

prior(normal(0, 1), class = "b")

and the results were similar. The discrepancy appears to be in the intercept estimates.


[Figure: posterior summaries under the default priors]

[Figure: posterior summaries with the informative prior on "b"]

I left the default priors on "sd", and the prior on the "Intercept" is the same as the default.

Hi,

use p <- get_prior(y ~ 1 + foo, data = df, family = categorical()) to get a list of all priors. Then you can set new priors explicitly using, e.g., p$prior[1] <- "normal(0,1)".

brms is picky when it comes to priors, since Paul really wants to make sure that you understand what you are doing. So you should listen to the warning :)

Change all the default priors and play around so you get to know how this works, then start setting some regularizing priors to see what it looks like on the outcome scale.
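For example, roughly like this (a sketch, reusing the formula and data from the first post; the normal(0, 1) value is just a starting point):

# List every parameter and its current default prior; the brmsprior data
# frame has one row per parameter, with its class, coef, and dpar (mu2, mu3, ...).
p <- get_prior(Choice ~ Randomization + Phase + Session + Session:Phase + (1 | ID),
               data = temp1, family = categorical())

# Set a regularizing prior on the population-level coefficients,
# separately per distributional parameter, instead of one global prior:
p$prior[p$class == "b" & p$coef == ""] <- "normal(0, 1)"

# Refit on the prior only and inspect the outcome scale:
fit <- brm(Choice ~ Randomization + Phase + Session + Session:Phase + (1 | ID),
           data = temp1, family = categorical(),
           prior = p, sample_prior = "only")
pp_check(fit, type = "bars", nsamples = 100)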


I ran into this issue recently and agree with @torkar’s advice. You can find a couple of examples of how I handled it at https://bookdown.org/content/3890/counting-and-classification.html#multinomial.


@Solomon and @torkar, this is all very helpful. I see that I need to set a specific prior for each unique parameter, and the differences from model to model (prior to prior) are making a lot more sense. I am also realizing that I hadn’t intuited that the sample_prior = "only" specification skips the likelihood. It is all making a lot more sense to me now; thank you both!
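For example, parameter-specific priors for my 3-category Choice outcome would look something like this (a sketch; category 1 is the reference, so the coefficients live on mu2 and mu3, and the normal(0, 1) values are just illustrative):

# One prior per distributional parameter, rather than a single global "b" prior
priors <- c(prior(normal(0, 1), class = "b", dpar = "mu2"),
            prior(normal(0, 1), class = "b", dpar = "mu3"))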


I have a quick follow-up question about this practice that I hope is easy to answer. If I have a variable that is coded as an integer, for example:

library(tidyverse)
library(brms)
library(bayesplot)

df <- setNames(data.frame(matrix(ncol = 2, nrow = 150)), c("Choice", "Choice0"))
df <- df %>% mutate(Choice = rep(c(1, 2, 3, 1, 1, 3), 25), Choice0 = rep(c(0, 1, 2, 0, 0, 2), 25))

Then I model in brms with the categorical family, intercept only:

prior1 <- c(prior(normal(0, 5), class = "Intercept", dpar = "mu2"), # normal priors sample a wider range than the default student-t parameters
            prior(normal(0, 5), class = "Intercept", dpar = "mu3"))
Example1 <- brm(Choice ~ 1,
                data = df,
                family = categorical(link = "logit"),
                prior = prior1, # priors defined above
                sample_prior = "only", # skips the likelihood when sampling
                seed = 123)
pp_check(Example1, type = "bars", nsamples = 100)

prior2 <- c(prior(normal(0, 5), class = "Intercept", dpar = "mu1"),
            prior(normal(0, 5), class = "Intercept", dpar = "mu2"))
Example2 <- brm(Choice0 ~ 1,
                data = df,
                family = categorical(link = "logit"),
                prior = prior2, # priors defined above
                sample_prior = "only", # skips the likelihood when sampling
                seed = 123)
pp_check(Example2, type = "bars", nsamples = 100)

The two figures from pp_check are below.

My question is: why do the two posterior checks result in different point estimates? Is it because the intercept parameters differ between the models? I also noticed that the figure’s x-axis defaults to 1, 2, 3 regardless of how the DV is coded in the model.

Example1 = mu2, mu3

Example2 = mu1, mu2

The model results yield equivalent point estimates, as expected:

summary(Example1)

 Family: categorical
  Links: mu2 = logit; mu3 = logit
Formula: Choice ~ 1
   Data: df (Number of observations: 150)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Population-Level Effects:
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
mu2_Intercept     0.09      5.05    -9.93     9.89 1.00     3114     2648
mu3_Intercept     0.04      5.01   -10.13     9.69 1.00     3092     2683

summary(Example2)

 Family: categorical
  Links: mu1 = logit; mu2 = logit
Formula: Choice0 ~ 1
   Data: df (Number of observations: 150)
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Population-Level Effects:
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
mu1_Intercept     0.09      5.05    -9.93     9.89 1.00     3114     2648
mu2_Intercept     0.04      5.01   -10.13     9.69 1.00     3092     2683

Interesting. You’ve got me stumped.