I’m confused about the behavior of rstanarm when you specify a formula where a predictor is of class "ordered" "factor". Here’s an example:
library(rstanarm)

# Simulate data
n <- 100
y <- rnorm(n)
x_cont <- sample(1:5, n, replace = TRUE)
x_fac <- factor(sample(1:5, n, replace = TRUE))
x_ord <- ordered(sample(1:5, n, replace = TRUE))
data <- data.frame(y = y, x_cont = x_cont, x_fac = x_fac, x_ord = x_ord)
# Model
fit <- stan_lm(y ~ x_cont + x_fac + x_ord, data, prior = R2(0.5, "mode"))
print(fit, digits = 2)
stan_lm
 family:       gaussian [identity]
 formula:      y ~ x_cont + x_fac + x_ord
 observations: 100
 predictors:   10
------
            Median MAD_SD
(Intercept)  0.20   0.30
x_cont      -0.02   0.06
x_fac2      -0.28   0.31
x_fac3       0.18   0.31
x_fac4      -0.14   0.33
x_fac5      -0.53   0.30
x_ord.L      0.03   0.25
x_ord.Q     -0.25   0.22
x_ord.C      0.09   0.20
x_ord^4      0.38   0.20

Auxiliary parameter(s):
              Median MAD_SD
R2            0.14   0.05
log-fit_ratio 0.03   0.07
sigma         0.96   0.07
------
Looking at the plot from posterior_vs_prior(fit, pars = "beta"), it looks like the coefficients of the unordered factor (x_fac2 to x_fac5) are given wider prior distributions than those of the ordered factor (x_ord.L to x_ord^4), and I can’t figure out why.
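
For context, the coefficient names in the output reflect two different codings: x_fac2 to x_fac5 come from R's default treatment (dummy) contrasts for unordered factors, while x_ord.L, x_ord.Q, x_ord.C and x_ord^4 come from the default polynomial contrasts for ordered factors. Below is a minimal base-R sketch of the two codings, assuming the default `options("contrasts")`; the names `lv`, `c_fac` and `c_ord` are only for illustration. It shows how the design-matrix columns differ and does not by itself establish why the prior widths differ in the plot.

```r
# Illustration only: default contrast matrices for a 5-level factor.
# Assumes options("contrasts") is at its default value,
# c("contr.treatment", "contr.poly").
lv <- 1:5

c_fac <- contrasts(factor(lv))   # treatment coding: 0/1 dummy columns
c_ord <- contrasts(ordered(lv))  # polynomial coding: columns .L, .Q, .C, ^4

print(c_fac)
print(round(c_ord, 2))

# The polynomial columns are centered (orthogonal to the intercept),
# while each treatment column just flags a single level.
print(round(colSums(c_ord), 10))
print(colSums(c_fac))

# The polynomial columns are also orthonormal: t(c_ord) %*% c_ord
# is the identity matrix up to rounding error.
print(round(crossprod(c_ord), 10))
```

So the four x_ord coefficients are linear, quadratic, cubic and quartic trends across the ordered levels, not level-versus-baseline differences like the x_fac coefficients, which means the two sets of coefficients are not on directly comparable scales.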