Dear Members of the Stan Forum,
I am currently using Stan via brms to check the stability of the parameters of a rather complex multilevel model with respect to many covariates. In my data matrix, I have several texts that were each rated by various raters on a rating scale (e.g., 1 = bad text, 4 = excellent text). The rating process is modeled by a parametric regression model whose parameters correspond to properties such as the “strictness” of individual raters. I now want to check whether these parameters, such as the “strictness”, are stable across different groups of texts (e.g., different languages, different types of texts, etc.). From a technical perspective, I am using item response models as described, for instance, here: Bayesian Item Response Modeling in R with brms and Stan | Journal of Statistical Software
A complicating issue is that not all of my texts are rated by all raters, so there are missing data. However, we can assume that these are missing completely at random.
I have about 21,000 data rows (texts) in long format. My model syntax in brms looks like this:
baseline_formula <- bf(
  rating ~ var1 + var2 + var3 + var4 + var5 + var6 + (1 | textId) +
    (0 + var1 + var2 + var3 + var4 + var5 + var6 | rater),
  disc ~ 1 + (1 | rater)
)

prior_ord_2pl <-
  prior("constant(1)", class = "sd", group = "textId") +
  prior("normal(0, 3)", class = "sd", group = "rater") +
  prior("normal(0, 1)", class = "sd", group = "rater", dpar = "disc")

gpcm_fit <- brm(
  formula = baseline_formula,
  data = selected_df_Ges,
  family = brmsfamily("cumulative", "logit"),
  prior = prior_ord_2pl
)
Here, var1 to var6 are essentially factors denoting different text categories (e.g., language, topic, sex of the author, …), textId is the ID of the individual texts, and rater denotes the identity of the individual raters.
This model shows convergence problems with some Rhat values > 1.05, and brms recommends defining stronger priors. I now have two questions:
- What are best practices for addressing such a problem? I am aware that the selection of suitable priors is a delicate matter, and that my choice might affect the outcome of the analysis.
- Are there alternative approaches for handling this type of problem? In classical machine learning, one might apply Lasso regression to determine which covariates actually matter for the stability of the model, as in this approach: https://link.springer.com/article/10.3758/s13428-019-01224-2. While the presence of missing data can be problematic for regularization methods, there might be clever Bayesian analogues available here.
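To make the second question concrete, here is a sketch of the direction I had in mind (this is only my guess, not something I have validated: the regularized horseshoe prior as a Bayesian analogue of the Lasso, applied on top of the model above; object names are those from my code):

```r
library(brms)

# Sketch only: shrink the population-level text-category effects
# (var1-var6) with a horseshoe prior, a Bayesian analogue of the Lasso,
# while keeping my original priors on the group-level SDs.
prior_shrink <-
  prior(horseshoe(1), class = "b") +
  prior("constant(1)", class = "sd", group = "textId") +
  prior("normal(0, 3)", class = "sd", group = "rater") +
  prior("normal(0, 1)", class = "sd", group = "rater", dpar = "disc")

shrink_fit <- brm(
  formula = baseline_formula,          # same formula as above
  data = selected_df_Ges,
  family = brmsfamily("cumulative", "logit"),
  prior = prior_shrink,
  control = list(adapt_delta = 0.95)   # also often suggested for convergence issues
)
```

Would something along these lines be sensible here, or are there better-suited Bayesian regularization approaches for this kind of model?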