Hi all.
I am working with multi-variate outcomes, and being very new to STAN I’ve started with the manuals example model in section 9.15. Note I’m sticking with the less efficient version for now as I find it easier to understand. Anyhow, I would like to extend this model to a multi-level model however I’m at a loss on how to go about this. I was hoping someone could help. I have made some toy data to model as an example:
First simulate students clustered in classes with multiple outcomes
library(simstudy)
set.seed(45)
gen.class <- defData(varname = "c0", dist = "normal", formula = 0, variance = 2, id = "idClass")
gen.class <- defData(gen.class, varname = "nStudents", dist = "noZeroPoisson", formula = 20)
dtClass <- genData(8, gen.class)
gen.student <- defDataAdd(varname = "Gender", dist = "binary", formula = 0.5)
gen.student <- defDataAdd(gen.student, varname = "age", dist = "uniform", formula = "9.5; 10.5")
gen.student <- defDataAdd(gen.student, varname = "y1", dist = "normal",
formula = "50 - Gender + c0", variance = 2)
gen.student <- defDataAdd(gen.student, varname = "y2", dist = "normal",
formula = "50 + age + c0", variance = 2)
gen.student <- defDataAdd(gen.student, varname = "y3", dist = "normal",
formula = "50 - (age*Gender) + c0", variance = 2)
students <- genCluster(dtClass, cLevelVar = "idClass", numIndsVar = "nStudents",
level1ID = "idChild")
students <- addColumns(gen.student, students)
setDF(students)
dim(students)
Define STAN multivariate model:
model <- '
data {
int<lower=1> K; // num of outcomes
int<lower=1> J; // num of explanatory variables
int<lower=0> N; // num of obs
vector[J] x[N]; // x is array of size N containing vectors of J elements
vector[K] y[N]; // y is array of size N containing vectors of K elements
}
parameters {
matrix[K, J] beta; // K x J matrix
cov_matrix[K] Sigma; //
}
model {
vector[K] mu[N];
for (n in 1:N){
mu[n] = beta * x[n];
}
y ~ multi_normal(mu, Sigma);
}'
Set-up data for STAN and run model
library(rstan)
stan_dat <- list(J = 3, K = 3,
N = dim(students)[1],
x = as.matrix(data.frame(x0 = 1,
x1 = students$Gender,
x2 = students$age)),
y = students[, c("y1", "y2", "y3")] )
stan1 <- stan(model_code = model, data = stan_dat, chains = 2, cores=2 , iter=1000,
pars = c("beta"))
##### Examine results
summary(stan1)
res <- summary(stan1)
muc <- rstan::extract(stan1, par="beta[2,2]", permuted=FALSE, inc_warmup=TRUE)
mdf <- melt(muc)
ggplot(mdf,aes(x=iterations,y=value,color=chains)) + geom_line() + ylab(mdf$parameters[1])
I have made an attempt or two to extend the model but got confused and abandoned ship. (I am used to using indexes in JAGS, however the matrix format of the equation in the STAN example is adding an extra level of confusion for me!). I would appreciate any help folks might be able to offer.