Hi There
This is my first post on the Stan forums. I am using Stan in R via Rstan. I am having trouble with getting a Bayesian mixed-effects model to yield stationary and well-mixed chains. I have created my own data so I know what parameters should be retrieved by the model. Unfortunately because the effective number of parameters is so low and the Rhat so high the parameter estimates are complete nonsense.
The data is designed so there are 60 subjects, split into three groups (g1, g2, g3) of 20 subjects each. Each subject is exposed to 3 conditions (cond1, cond2, cond3). I designed the data so there is no difference among the groups, but there are differences among the conditions, with cond1 scoring 100 on average, cond2 scoring 75 on average, and cond3 scoring 125.
Here is the data. I will supply all of this as R code so it can be replicated
df <- data.frame(id = factor(rep(1:60, 3)),
group = factor(rep(c("g1", "g2", "g3"), each = 20, length.out = 180)),
condition = factor(rep(c("cond1", "cond2", "cond3"), each = 60)),
score = c(ceiling(rnorm(60, 100, 15)), ceiling(rnorm(60, 75, 15)), ceiling(rnorm(60, 125, 15))))
Here are the descriptives
library(dplyr)
df %>% group_by(group, condition) %>% summarise(m = mean(score), sd = sd(score))
# group condition m sd
# <fct> <fct> <dbl> <dbl>
# 1 g1 cond1 108 12.4
# 2 g1 cond2 79.4 13.1
# 3 g1 cond3 128 11.5
# 4 g2 cond1 105 15.5
# 5 g2 cond2 71.6 10.6
# 6 g2 cond3 127 17.7
# 7 g3 cond1 106 13.3
# 8 g3 cond2 75.8 17.6
# 9 g3 cond3 124 14.5
Everything looks to be correct, the differences between conditions are preserved nicely across groups.
Now for the the model. The model I am running has a grand mean, a parameter for group, a parameter for condition, a parameter for the group x condition interaction, and a subject parameter.
Here is the data list
##### Step 1: put data into a list
mixList <- list(N = nrow(df),
nSubj = nlevels(df$id),
nGroup = nlevels(df$group),
nCond = nlevels(df$condition),
nGxC = nlevels(df$group)*nlevels(df$condition),
sIndex = as.integer(df$id),
gIndex = as.integer(df$group),
cIndex = as.integer(df$condition),
score = df$score)
Now to build the model in rstan, saving the string as a .stan file using the cat() function
###### Step 2: build model
cat("
data{
int<lower=1> N;
int<lower=1> nSubj;
int<lower=1> nGroup;
int<lower=1> nCond;
int<lower=1,upper=nSubj> sIndex[N];
int<lower=1,upper=nGroup> gIndex[N];
int<lower=1,upper=nCond> cIndex[N];
real score[N];
}
parameters{
real a0;
vector[nGroup] bGroup;
vector[nCond] bCond;
vector[nSubj] bSubj;
matrix[nGroup,nCond] bGxC;
real<lower=0> sigma_s;
real<lower=0> sigma_g;
real<lower=0> sigma_c;
real<lower=0> sigma_gc;
real<lower=0> sigma;
}
model{
vector[N] mu;
bCond ~ normal(100, sigma_c);
bGroup ~ normal(100, sigma_g);
bSubj ~ normal(0, sigma_s);
sigma ~ cauchy(0,2)T[0,];
for (i in 1:N){
mu[i] = a0 + bGroup[gIndex[i]] + bCond[cIndex[i]] + bSubj[sIndex[i]] + bGxC[gIndex[i],cIndex[i]];
}
score ~ normal(mu, sigma);
}
", file = "mix.stan")
Next is to generate the chains in rstan
##### Step 3: generate the chains
mix <- stan(file = "mix.stan",
data = mixList,
iter = 2e3,
warmup = 1e3,
cores = 1,
chains = 1)
Now let’s take a look a what we have
###### Step 4: Diagnostics
print(mix, pars = c("a0", "bGroup", "bCond", "bGxC", "sigma"), probs = c(.025,.975))
# mean se_mean sd 2.5% 97.5% n_eff Rhat
# a0 -1917.21 776.69 2222.64 -5305.69 1918.58 8 1.02
# bGroup[1] 2368.36 2083.48 3819.06 -2784.04 9680.78 3 1.54
# bGroup[2] 7994.87 446.06 1506.31 4511.22 10611.46 11 1.00
# bGroup[3] 7020.78 2464.68 4376.83 81.18 14699.90 3 1.91
# bCond[1] -3887.06 906.99 1883.45 -7681.24 -247.48 4 1.60
# bCond[2] 4588.50 676.28 1941.92 -594.56 7266.09 8 1.10
# bCond[3] 73.91 1970.28 3584.74 -5386.96 5585.99 3 2.13
# bGxC[1,1] 3544.02 799.91 1819.18 -1067.27 6327.68 5 1.26
# bGxC[1,2] -4960.08 1942.57 3137.33 -10078.84 317.07 3 2.66
# bGxC[1,3] -396.35 418.34 1276.44 -2865.39 2543.45 9 1.42
# bGxC[2,1] -2085.90 1231.36 2439.58 -5769.81 3689.38 4 1.46
# bGxC[2,2] -10594.89 1206.58 2560.42 -14767.50 -5074.33 5 1.02
# bGxC[2,3] -6024.75 2417.43 4407.09 -12002.87 4651.14 3 1.71
# bGxC[3,1] -1111.81 1273.66 2853.08 -4843.38 5572.87 5 1.48
# bGxC[3,2] -9616.85 2314.56 4020.02 -15775.40 -4262.64 3 2.98
# bGxC[3,3] -5054.27 828.77 2245.68 -8666.01 -321.74 7 1.00
# sigma 13.81 0.14 0.74 12.36 15.17 27 1.00
The low number of effective samples and high Rhats tell me I am doing something terribly wrong here, but what? Is it not specifying a prior on bGxC? How does one specify a prior on a matrix?