How to model multinomial data with uneven numbers of options between items

Hi, I’m interested in modeling multinomial count data (ideally in brms, otherwise with something custom in rstan). The complication is that my items a) are, I have reason to believe, influenced by the same or very similar factors, but b) have different sets of outcome options.

For example, I have two kinds of items, call them A and B. Items of type A can have outcomes A1, A2, A3, and A4, whereas B items only have B1, B2, and B6 (another category distinct from the A categories) as possible outcomes. I have reason to believe that the data-generating process (that is, whether an A item is realized as A1, A2, etc.) is influenced by the same predictors for both A and B items, so I would like to model them with the same set of underlying variables. The same subjects also gave responses for both A and B items. But since A and B have different numbers of categories, not all of which overlap, I’m not sure how best to fit them in a single model. Any advice or tips? Alternatively, if this is a trivial problem or isn’t relevant here, any suggestions on where to look for answers? Thanks!

I find your question a bit hard to answer with the information you provided, but you could look at some kind of polytomous IRT model (e.g. the Polytomous Rasch Model). I imagine these models are what you are looking for.

Since you stated that you have two item types, A and B, I would probably start with a Rating Scale Model (assuming the outcome categories are ordered) with a different set of threshold parameters for each item type. I do not know whether brms can handle different numbers of outcome categories, but it can certainly be done in Stan itself. If programmed in Stan, the model probably requires some custom indexing of the parameter vector, since Stan has no ragged data structures. This can be a bit of a pain sometimes, but shouldn’t be too bad if you only have 2 item types with fixed threshold parameters.
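
To make the custom indexing idea concrete, here is a minimal sketch in plain Python rather than Stan. It assumes all thresholds live in one flat vector with a per-type size, which is the usual workaround when ragged arrays are unavailable. The names `thresholds_flat`, `sizes`, `thresholds_for`, and `category_probs` are made up for illustration, and the threshold values are arbitrary.

```python
import math

# Hypothetical layout: thresholds for every item type in one flat list,
# as one would store them in a language without ragged arrays.
thresholds_flat = [-1.0, 0.0, 1.0,  # type 0 (A): 3 thresholds -> 4 categories
                   -0.5, 0.5]       # type 1 (B): 2 thresholds -> 3 categories
sizes = [3, 2]                      # number of thresholds per item type


def segment_starts(sizes):
    """Return the start offset of each item type's segment in the flat list."""
    starts, pos = [], 0
    for size in sizes:
        starts.append(pos)
        pos += size
    return starts


def thresholds_for(item_type, flat, sizes):
    """Slice out the thresholds that belong to one item type."""
    start = segment_starts(sizes)[item_type]
    return flat[start:start + sizes[item_type]]


def category_probs(theta, taus):
    """Adjacent-category (rating scale style) probabilities.

    The log-numerator of category k is sum_{j<=k} (theta - tau_j),
    with the first category fixed at 0.
    """
    log_num = [0.0]
    for tau in taus:
        log_num.append(log_num[-1] + theta - tau)
    top = max(log_num)  # subtract the max for numerical stability
    weights = [math.exp(v - top) for v in log_num]
    total = sum(weights)
    return [w / total for w in weights]


probs_a = category_probs(0.3, thresholds_for(0, thresholds_flat, sizes))
probs_b = category_probs(0.3, thresholds_for(1, thresholds_flat, sizes))
```

Here `probs_a` has four entries and `probs_b` has three, even though both come from the same flat parameter vector and the same person parameter `theta`. In Stan the same bookkeeping would be done with start and size arrays passed in as data.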

Hi Paul,
Thanks for your response. To be more specific, the categories are not ordered, and there are maybe 12 types of item (A, B, C, …). So I think what you’re saying about ragged categories makes sense, but potentially in an unordered context? Is there a name for that kind of model?

You can take a look at the Nominal Response Model.
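
For reference, a sketch of how the category probabilities could look when each item type has its own outcome set. This is the standard nominal response form restricted to the categories available for the item; the restriction to a type-specific set and the notation $C_{t(j)}$, $a_{jc}$, $b_{jc}$ are my assumptions, not something taken from a particular paper.

```latex
% Person i, item j of type t(j), with available category set C_{t(j)}
P(Y_{ij} = c \mid \theta_i)
  = \frac{\exp(a_{jc}\,\theta_i + b_{jc})}
         {\sum_{c' \in C_{t(j)}} \exp(a_{jc'}\,\theta_i + b_{jc'})},
  \qquad c \in C_{t(j)}
```

One category per item needs its $a_{jc}$ and $b_{jc}$ fixed (e.g. to zero) for identification. The denominator only runs over the categories that exist for that item type, which is where the unequal numbers of outcomes enter.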

Great, thanks!

Actually, @p-gw, I read @paul.buerkner’s paper on Bayesian IRT (which seems to be the nearest Bayesian resource on Nominal Response Modeling, as you suggested), but unfortunately it didn’t cover what to do with ragged categories, i.e. nominal data from a range of item types with unequal numbers of outcomes. Sorry to bother you, but would you and/or Paul be able to give a small example of how to do this? Thanks!

You are asking for conditional logit models I think. There is a (workaroundy) implementation of this in brms, but it requires lots of manual specification:

Also, rstanarm has a stan_clogit function that might be of help.
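
To illustrate why a conditional logit fits the ragged case, here is a minimal sketch of its log-likelihood in plain Python. It is not the brms or rstanarm implementation. It assumes each observation carries its own list of available alternatives, each described by a feature vector, with one coefficient vector `beta` shared across all item types. The function names and data layout are hypothetical.

```python
import math


def choice_log_prob(beta, alternatives, chosen):
    """Log-probability of picking `chosen` among this observation's alternatives.

    `alternatives` is a list of feature vectors, one per available outcome,
    so its length may differ from one observation to the next.
    """
    utilities = [sum(b * x for b, x in zip(beta, feats)) for feats in alternatives]
    top = max(utilities)  # log-sum-exp with the max subtracted for stability
    log_denom = top + math.log(sum(math.exp(u - top) for u in utilities))
    return utilities[chosen] - log_denom


def log_likelihood(beta, observations):
    """Sum of log-probabilities over (alternatives, chosen_index) pairs."""
    return sum(choice_log_prob(beta, alts, chosen) for alts, chosen in observations)


# Toy data (invented): one A-type item with 4 outcomes, one B-type item with 3.
observations = [
    ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]], 2),
    ([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]], 0),
]
ll_at_zero = log_likelihood([0.0, 0.0], observations)
```

With all coefficients at zero every available outcome is equally likely, so `ll_at_zero` equals `log(1/4) + log(1/3)`. The softmax is normalized within each observation's own choice set, so unequal numbers of outcomes need no padding.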
