I’m having trouble understanding how to format my data and brms formulas for multiple Likert items. I’ve been following the Bürkner & Vuorre (2019) tutorial to fit a cumulative model. I have 70 responses to 10 questions, currently in a long format. I would like to use model selection to find the best-supported model of which factors predict y, an ordinal response. Each candidate model uses a different set of questions as predictors, each of them a Likert item, for example:
hypothesis 1) y ~ 1 + a + b
hypothesis 2) y ~ 1 + a + b + c
hypothesis 3) y ~ 1 + a + c
where a, b, and c are all Likert responses. First, is this model incorrect, or is it equivalent to a pooled regression in which all respondents share the same intercept?
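To make the first question concrete, this is roughly how I would fit and compare the three candidates. It is only a sketch: the data frame `d` is simulated as a stand-in for my data (one row per respondent, hypothetical column names `y`, `a`, `b`, `c`, every item scored 1 to 5), the probit link follows the tutorial, and entering the Likert predictors as plain numeric terms is an assumption I am not sure about.

```r
library(brms)

# Simulated stand-in for my data (hypothetical): 70 respondents, one row each.
set.seed(1)
d <- data.frame(
  y = factor(sample(1:5, 70, replace = TRUE), levels = 1:5, ordered = TRUE),
  a = sample(1:5, 70, replace = TRUE),
  b = sample(1:5, 70, replace = TRUE),
  c = sample(1:5, 70, replace = TRUE)
)

# One cumulative (ordinal) model per hypothesis, all fitted to the same rows.
# There are no group-level terms here, so every respondent shares the same
# set of thresholds.
fit1 <- brm(y ~ 1 + a + b,     data = d, family = cumulative("probit"))
fit2 <- brm(y ~ 1 + a + b + c, data = d, family = cumulative("probit"))
fit3 <- brm(y ~ 1 + a + c,     data = d, family = cumulative("probit"))

# Compare the candidates by approximate leave-one-out cross-validation.
loo(fit1, fit2, fit3)
```

Because all three models are fitted to the same 70 rows of `y`, their LOO estimates should be directly comparable.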
Second, I realized that since there are multiple Likert items, the tutorial suggests these data should be converted to long format, with the variable names in one column and the values in another. I’m wondering whether I need a different dataframe for each hypothesis, so that each one includes only the Likert responses used in that hypothesis (e.g. a dataframe with 140 rows containing only responses a + b for hypothesis 1, and one with 210 rows for hypothesis 2). That way, I would model:
hypothesis 1) y ~ 1 + (1|person) + (1|likert responses dataframe a + b)
hypothesis 2) y ~ 1 + (1|person) + (1|likert responses dataframe a + b + c)
hypothesis 3) y ~ 1 + (1|person) + (1|likert responses dataframe a + c)
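For reference, this is how I would build the per-hypothesis long dataframes. Again a sketch under assumptions: `d` is simulated, and the column names `person`, `item`, and `response` are placeholders I made up, not names from the tutorial.

```r
library(tidyr)

# Simulated stand-in for my wide data (hypothetical): 70 respondents.
set.seed(1)
d <- data.frame(
  person = 1:70,
  y = sample(1:5, 70, replace = TRUE),
  a = sample(1:5, 70, replace = TRUE),
  b = sample(1:5, 70, replace = TRUE),
  c = sample(1:5, 70, replace = TRUE)
)

# Stack only the items used in a given hypothesis: item names go into
# `item`, their values into `response`; `person` and `y` are repeated
# on every row for that respondent.
long_ab <- pivot_longer(d, cols = all_of(c("a", "b")),
                        names_to = "item", values_to = "response")
long_abc <- pivot_longer(d, cols = all_of(c("a", "b", "c")),
                         names_to = "item", values_to = "response")

nrow(long_ab)   # 70 respondents x 2 items = 140 rows
nrow(long_abc)  # 70 respondents x 3 items = 210 rows
```

One thing I notice here is that the three dataframes have different numbers of rows and each repeats `y` once per item, which is part of why I am unsure whether models fitted this way can be compared.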
Any help would be much appreciated.