Prior and Model Selection for Multinomial Logistic Regression

Nana · April 4, 2024, 6:20pm

Dear Stan Community,

For the analysis of a long-term study, I would like to fit a multinomial logistic regression to predict membership in different progression groups (e.g. previously healthy and now ill, or previously healthy and still healthy) from several predictors. So far, I have only had experience with frequentist models in this context and have used packages such as mlogit. Now I would really like to try my hand at Bayesian approaches, and I find the brms package very clear and descriptive.

However, my knowledge of model specification is very limited. Does anyone have tips or sources for the brms code for a multinomial logistic regression (is family = categorical sufficient?), and how do I choose sensible uninformative to weakly informative priors for the betas in this model?

I would be grateful for any answer and any advice!

avehtari · April 5, 2024, 8:26am

Can you tell us more about your data, such as the number of observations, the number of target categories, the number of predictors, and any multilevel/hierarchical structure? Do you plan to include interaction or smooth terms? This information would help us judge how carefully you need to think about the priors: with many observations and few categories and predictors, the data will dominate anyway.

A couple of examples of multinomial modeling with brms:

@Solomon's chapter "22 Nominal Predicted Variable" in Doing Bayesian Data Analysis in brms and the tidyverse
@andrewheiss' "The ultimate practical guide to multilevel multinomial conjoint analysis with R"

Nana · April 5, 2024, 12:26pm

Dear avehtari,
thank you very much for your response!

My data comprises approx. N = 1200 persons. The surveys have been conducted in 4 waves so far. There is also missing data.

The number of categories should be 4 to reflect the different trajectories:
Healthy - Healthy, Healthy - Sick, Sick - Healthy and Sick - Sick.
The reference category should probably be Healthy - Healthy.
As predictors, I would like to use several numerical variables (approx. 3) in the form of questionnaire sum scores as well as age (numerical, continuous) and gender (factor) as demographic variables.
I have not yet considered a hierarchical structure or nesting, as I am basically only interested in the comparison between the first and the last survey.

Thank you in advance for your advice!

avehtari · April 7, 2024, 5:08pm

It sounds like you have enough data and few enough predictors that you can start with weak priors, use prior-likelihood sensitivity analysis (e.g. with the priorsense package, which works with brms models) to check whether the data are sufficiently informative, and then think harder about the priors if needed. You can post your model checking results here for further comments.
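
For readers following along, the workflow above might look roughly like the sketch below. This is not code from the thread: the simulated data, all variable names, the category labels (HH, HS, SH, SS), and the normal(0, 1) and normal(0, 2.5) prior scales are assumptions chosen for illustration, and predictors are assumed to be standardized. It also assumes reasonably recent versions of brms and priorsense.

```r
library(brms)
library(priorsense)

set.seed(1)
n <- 1200

# Simulated stand-in data; every column name here is hypothetical.
d <- data.frame(
  trajectory = factor(
    sample(c("HH", "HS", "SH", "SS"), n, replace = TRUE),
    levels = c("HH", "HS", "SH", "SS")
  ),
  score1 = rnorm(n),  # questionnaire sum scores, assumed standardized
  score2 = rnorm(n),
  score3 = rnorm(n),
  age    = rnorm(n),  # assumed standardized
  gender = factor(sample(c("f", "m"), n, replace = TRUE))
)

f <- bf(trajectory ~ score1 + score2 + score3 + age + gender)

# List the parameters that can receive priors. With refcat = "HH" there is
# one linear predictor per non-reference category: muHS, muSH, muSS.
get_prior(f, data = d, family = categorical(refcat = "HH"))

# Weak priors on the log-odds scale (scales are illustrative assumptions).
priors <- c(
  set_prior("normal(0, 1)",   class = "b",         dpar = "muHS"),
  set_prior("normal(0, 1)",   class = "b",         dpar = "muSH"),
  set_prior("normal(0, 1)",   class = "b",         dpar = "muSS"),
  set_prior("normal(0, 2.5)", class = "Intercept", dpar = "muHS"),
  set_prior("normal(0, 2.5)", class = "Intercept", dpar = "muSH"),
  set_prior("normal(0, 2.5)", class = "Intercept", dpar = "muSS")
)

fit <- brm(
  f,
  data   = d,
  family = categorical(refcat = "HH"),
  prior  = priors,
  chains = 4, cores = 4, seed = 1
)

# Prior-likelihood sensitivity check via power-scaling.
powerscale_sensitivity(fit)
```

If the sensitivity diagnostics flag prior-data conflict or a weak likelihood for some coefficients, that would be the point to think harder about the priors for those terms.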

Nana · April 8, 2024, 4:06pm

Thank you very much, I will try that!
