I have a dataset dat with the following structure:
> str(dat)
'data.frame':	450 obs. of  4 variables:
 $ Subject  : Factor w/ 23 levels "S1","S2","S3",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Condition: Factor w/ 20 levels "A","B","C",...
 $ Value    : num  0.679 0.5819 0.2531 0.0469 1.2375 ...
 $ X        : num  0.62 0.62 ...
The variable Condition is a repeated-measures (or within-subject) factor, and X is a between-subjects quantitative variable such as age (i.e., X varies across subjects, but does not vary across the levels of Condition).
I would like to obtain the posterior distribution of the effect of X at each level of the Condition factor. So, I'm thinking of fitting the following Bayesian model:
library(rstanarm)
options(mc.cores = 4)
fm <- stan_lmer(Value ~ 1 + X + (1 | Subject) + (1 + X | Condition), data = dat, chains = 4)
Then I can pull out the posterior distribution of the X effect at each Condition level from the output of the above model. However, as X does not vary across the levels of Condition, I'm not sure whether the (1 + X | Condition) part of this multilevel model is appropriate. For example, in conventional statistics, the following linear mixed-effects model does not seem to make sense to me:
library(lme4)
fm2 <- lmer(Value ~ X + (1 | Subject) + (1 + X | Condition), data = dat)
Sometimes the fm2 model fails to converge. Even when it converges, the problem shows up in the output of summary(fm2) as a correlation of -1 between the random intercept and the random slope of X in the (1 + X | Condition) term.
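To make the extraction step concrete, here is a minimal sketch of how I would combine the posterior draws into one slope per Condition level. It rests on assumptions: that as.matrix() on the fitted rstanarm model returns one column per parameter, that the population-level slope column is named X, and that the per-condition deviations are named like b[X Condition:A]. The helper condition_slopes() and the toy matrix are hypothetical and only stand in for real draws.

```r
# Hypothetical helper: per-level slope = population slope + level deviation.
# `draws` is a numeric matrix with one row per posterior draw and named columns.
condition_slopes <- function(draws, term = "X", group = "Condition") {
  prefix <- paste0("b[", term, " ", group, ":")
  cols <- colnames(draws)
  # Keep only the deviation columns for this term and grouping factor.
  dev_cols <- cols[startsWith(cols, prefix) & endsWith(cols, "]")]
  if (length(dev_cols) == 0) {
    stop("no columns found matching ", prefix, "<level>]")
  }
  # Strip the prefix and the closing bracket to recover the level names.
  level_names <- substr(dev_cols, nchar(prefix) + 1, nchar(dev_cols) - 1)
  # The slope vector is recycled down each deviation column (draw by draw).
  slopes <- draws[, term] + draws[, dev_cols, drop = FALSE]
  colnames(slopes) <- level_names
  slopes
}

# Toy stand-in for as.matrix(fm): 5 fake draws, 2 condition levels (assumed names).
set.seed(1)
toy <- cbind(rnorm(5), rnorm(5), rnorm(5), rnorm(5))
colnames(toy) <- c("X", "b[X Condition:A]", "b[X Condition:B]",
                   "b[(Intercept) Condition:A]")
slopes <- condition_slopes(toy)
```

With a real fit, I would expect to call condition_slopes(as.matrix(fm)) and then summarise each column, for example with apply(slopes, 2, quantile, probs = c(0.025, 0.5, 0.975)).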
So, here are my questions:
- With the following Bayesian model,
fm0 <- stan_lmer(Value ~ 1 + X + (1 | Subject) + (1 | Condition), data = dat, chains = 4)
is there a way to obtain the posterior distribution of the effect of X at each Condition level?
- Is the following Bayesian model meaningful even though the corresponding lmer() model is not?
fm <- stan_lmer(Value ~ 1 + X + (1 | Subject) + (1 + X | Condition), data = dat, chains = 4)
- When X is a between-subjects factor such as gender (instead of a quantitative variable), would the answers to the first two questions above remain the same?