I’m currently trying to formulate a multilevel model with 46 ordinal predictors and 93 regression terms overall. The large number of predictors calls for a shrinkage prior, and I am considering the R2D2M2 prior for this purpose (from this article by @javier.aguilar and @paul.buerkner). The problem is that I initially intended to model the predictors as monotonic effects (as described in this article by Bürkner and @Emmanuel_Charpentier).
After reading the article on the R2D2M2 prior, I noticed that predictors must be standardized using the sample mean and variance. I suspect the two modeling techniques are incompatible, since it doesn’t seem to make sense to “standardize” a monotonic effect: its scale depends on parameters that are estimated with the model. On the other hand, if these effects are not standardized, then Var(\mu) \neq \sigma^2 \tau^2.
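To make the concern concrete, here is a sketch of the variance argument. It assumes (my reading of the article, not a quotation) that each coefficient gets a prior of the form b_i \sim Normal(0, (\sigma^2 / \sigma_{x_i}^2) \phi_i \tau^2) with \sum_i \phi_i = 1, that the predictors are centered, and that the coefficients are independent a priori. Group-level terms are ignored for simplicity.

$$
\operatorname{Var}(x_i b_i) = \operatorname{E}[x_i^2] \operatorname{Var}(b_i) = \sigma_{x_i}^2 \cdot \frac{\sigma^2}{\sigma_{x_i}^2} \phi_i \tau^2 = \sigma^2 \phi_i \tau^2,
\qquad
\operatorname{Var}(\mu) = \sum_i \sigma^2 \phi_i \tau^2 = \sigma^2 \tau^2 .
$$

With a monotonic term, x_i is replaced by \mathrm{mo}(x_i, \zeta_i), so the corresponding factor becomes

$$
\operatorname{Var}\big(b_i \, \mathrm{mo}(x_i, \zeta_i)\big) = \operatorname{E}\big[\mathrm{mo}(x_i, \zeta_i)^2\big] \operatorname{Var}(b_i),
$$

and \operatorname{E}[\mathrm{mo}(x_i, \zeta_i)^2] depends on the unknown simplex \zeta_i, so it cannot be fixed to 1 by rescaling the data before fitting.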
Here’s a simplified reprex of the model I have in mind with only 3 predictors using brms :
# Simulate data
N <- 100 # number of observations
J <- 20 # number of levels
y <- rnorm(N) # Outcome variable
jj <- sample(1:J, N, replace = TRUE) # Grouping index variable
# Ordinal predictors
x1 <- sample(1:3, N, replace = TRUE)
x2 <- sample(1:3, N, replace = TRUE)
x3 <- sample(1:3, N, replace = TRUE)
data <- data.frame(y, x1, x2, x3, jj)
# Generate stancode with brms
library(brms)
formula <- bf(y ~ 1 + mo(x1) + mo(x2) + mo(x3) + (1 + mo(x1) + mo(x2) + mo(x3) | jj))
bprior <- prior(R2D2(), class = b)
make_stancode(formula, data, prior = bprior)
Notice that the stancode generated by brms does not center or scale the monotonic variables. This means that the global variance parameter \tau^2 no longer has a fixed relationship to the explained variance: that relationship will vary with the mean and standard deviation of each monotonic effect, which in turn depend on how each predictor is distributed (e.g. uniform, asymmetric, concentrated around a specific category, etc.) AND on the shape (simplex) parameter.
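A minimal sketch of this dependence, in base R. The helper `mo_transform` is hypothetical and only mimics what I understand the brms-generated Stan function to do (predictor recoded to 0..D, transform equal to D times the cumulative sum of the simplex); the two simplexes and the category probabilities are arbitrary illustrative choices.

```r
# Hypothetical helper mimicking the monotonic transform:
# mo(x, zeta) = D * sum(zeta[1:x]) for x in 0..D, with mo(0, zeta) = 0
mo_transform <- function(x, zeta) {
  D <- length(zeta)
  cum <- c(0, cumsum(zeta))  # cumulative simplex, starting at 0 for x = 0
  D * cum[x + 1]
}

set.seed(1)
N <- 1e4

# Same three categories, two different distributions over them
x_unif <- sample(0:2, N, replace = TRUE)
x_skew <- sample(0:2, N, replace = TRUE, prob = c(0.8, 0.15, 0.05))

# Two illustrative simplexes (shapes of the monotonic effect)
zeta_equal <- c(0.5, 0.5)  # equally spaced categories
zeta_front <- c(0.9, 0.1)  # most of the effect between the first two categories

# Variance of the transformed predictor under each combination
var_table <- data.frame(
  predictor = rep(c("uniform", "skewed"), each = 2),
  simplex   = rep(c("equal", "front-loaded"), times = 2),
  variance  = c(
    var(mo_transform(x_unif, zeta_equal)),
    var(mo_transform(x_unif, zeta_front)),
    var(mo_transform(x_skew, zeta_equal)),
    var(mo_transform(x_skew, zeta_front))
  )
)
print(var_table)
```

The four variances differ, so no single rescaling of the raw predictor can standardize the monotonic term for every value of the simplex.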
In short, I’m wondering :
- Are the estimates from this model interpretable?
- If not, can monotonic predictors and the R2D2M2 prior (or even the R2D2 prior) be used together?
- If not, what else could be done?
My first idea is to drop the monotonic transformations and simply treat the ordinal predictors as continuous (standardized) predictors. I could then recover the implied effect of each category by applying a linear transformation to the posterior draws.
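A rough sketch of that back-transformation, assuming the predictor was standardized as (x - mean(x)) / sd(x) before fitting. The function `implied_effects` is a hypothetical helper, and `b_draws` below is a random stand-in for real posterior draws, used only so the example runs without fitting a model.

```r
# Hypothetical helper: turn draws of the slope on the standardized scale
# into the implied effect of each category relative to the lowest category.
implied_effects <- function(b_draws, x) {
  levels_x <- seq(min(x), max(x))
  # Distance of each category from the lowest one, in standardized units
  # (the mean cancels out because only differences matter)
  z_shift <- (levels_x - min(levels_x)) / sd(x)
  eff <- outer(b_draws, z_shift)  # rows = draws, columns = categories
  colnames(eff) <- paste0("cat_", levels_x)
  eff
}

set.seed(1)
N <- 100
x1 <- sample(rep(1:3, length.out = N))  # shuffled, so all three categories appear

# Stand-in for posterior draws of the standardized slope (NOT real results);
# with a fitted brms model these would come from the posterior draws instead.
b_draws <- rnorm(4000, mean = 0.3, sd = 0.1)

effects_x1 <- implied_effects(b_draws, x1)

# Posterior summaries of each category's effect relative to category 1
apply(effects_x1, 2, quantile, probs = c(0.025, 0.5, 0.975))
```

By construction the implied effects are equally spaced across categories, which is exactly the restriction that the monotonic formulation was meant to relax.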