I have data consisting of a number of ordinal survey questions, all of which measure the same basic concept. I would like to build a hierarchical ordinal logistic regression model that borrows strength across the different survey questions. I noticed that Stan has an ordered_logistic distribution, which might make this a lot easier than writing out the likelihood by hand. However, there are a couple of complications:
Not all survey questions have the same response options, and sometimes not even the same number of them. My plan is to allow the “intercepts” (cutpoints) to be completely question-specific, with a different number of cutpoints for each question. However, I’m not sure how to specify the data structure for these parameters in Stan. I’d like to have some sort of “ragged” list of ordered vectors. Is anything like this possible? If not, is there a work-around you could suggest?
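To make the “ragged” idea concrete, here is a rough sketch (in Python rather than Stan, just to show the indexing) of one work-around I can imagine: storing every question's cutpoints in a single flat vector and keeping per-question start positions. The question sizes, cutpoint values, and the helper name `cutpoints_for` are all made up for illustration.

```python
# Hypothetical example: three questions with 3, 5, and 4 response options,
# so they need 2, 4, and 3 cutpoints respectively.
n_options = [3, 5, 4]
n_cuts = [k - 1 for k in n_options]

# All cutpoints live in one flat list; each question's block is ascending.
# The values are invented purely for illustration.
flat_cuts = [-1.0, 1.0,              # question 1
             -2.0, -0.5, 0.5, 2.0,   # question 2
             -1.5, 0.0, 1.5]         # question 3

# start[q] is the position in flat_cuts of question q's first cutpoint.
start = []
pos = 0
for n in n_cuts:
    start.append(pos)
    pos += n


def cutpoints_for(q):
    """Return the cutpoints of question q (0-based) as a slice of flat_cuts."""
    return flat_cuts[start[q]:start[q] + n_cuts[q]]


for q in range(len(n_options)):
    print(q, cutpoints_for(q))
```

In Stan, I assume the slicing step would map onto `segment()` applied to one long parameter vector, but the ordering constraint would then have to be enforced within each block separately, which is the part I am unsure about.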
My data are in a collapsed format, with one row for every response option for each question, and a “count” column giving the number of responses for that option. Is there a way to get ordered_logistic to accept data in this format, or would I have to expand the dataset to have one row for each respondent, for each question?
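As a sanity check on the collapsed format, the small Python sketch below (not Stan) compares the log-likelihood computed from (response, count) rows, weighting each row's log-probability by its count, against the log-likelihood of the fully expanded data. The counts, the linear predictor `eta`, and the cutpoints are invented, and `ordered_logistic_logpmf` is my own helper written to follow the usual ordered-logit parameterization, P(Y <= j) = inv_logit(c_j - eta).

```python
import math


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def ordered_logistic_logpmf(y, eta, cuts):
    """Log-probability of category y (1-based) given eta and ascending cutpoints."""
    k = len(cuts) + 1
    # P(Y <= j) = inv_logit(cuts[j-1] - eta), with P(Y <= 0) = 0 and P(Y <= k) = 1.
    upper = 1.0 if y == k else inv_logit(cuts[y - 1] - eta)
    lower = 0.0 if y == 1 else inv_logit(cuts[y - 2] - eta)
    return math.log(upper - lower)


# Hypothetical collapsed data for one question: (response option, count).
rows = [(1, 4), (2, 7), (3, 2)]
eta = 0.3            # made-up linear predictor
cuts = [-1.0, 1.0]   # made-up cutpoints for a 3-option question

# Collapsed: weight each row's log-probability by its count.
collapsed_ll = sum(count * ordered_logistic_logpmf(y, eta, cuts)
                   for y, count in rows)

# Expanded: one entry per respondent.
expanded = [y for y, count in rows for _ in range(count)]
expanded_ll = sum(ordered_logistic_logpmf(y, eta, cuts) for y in expanded)

print(collapsed_ll, expanded_ll)
```

The two totals agree up to floating-point error, so if this carries over, I would guess that something like adding `count[n] * ordered_logistic_lpmf(y[n] | eta, c)` to `target` would avoid expanding the dataset, though I have not verified that in Stan.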