Multi-level multivariate normal model

Hi all.
I am working with multivariate outcomes and, being very new to Stan, I’ve started with the manual’s example model in section 9.15. Note that I’m sticking with the less efficient version for now, as I find it easier to understand. I would like to extend this model to a multi-level model, but I’m at a loss as to how to go about it and was hoping someone could help. I have made some toy data to model as an example.

First, simulate students clustered in classes, with multiple outcomes per student:

library(simstudy)
library(data.table)  # for setDF()

set.seed(45)

gen.class <- defData(varname = "c0", dist = "normal", formula = 0, variance = 2, id = "idClass")
gen.class <- defData(gen.class, varname = "nStudents", dist = "noZeroPoisson", formula = 20)
dtClass <- genData(8, gen.class)

gen.student <- defDataAdd(varname = "Gender", dist = "binary", formula = 0.5)
gen.student <- defDataAdd(gen.student, varname = "age", dist = "uniform", formula = "9.5; 10.5")
gen.student <- defDataAdd(gen.student, varname = "y1", dist = "normal", 
                          formula = "50 - Gender + c0", variance = 2)
gen.student <- defDataAdd(gen.student, varname = "y2", dist = "normal", 
                          formula = "50 + age + c0", variance = 2)
gen.student <- defDataAdd(gen.student, varname = "y3", dist = "normal", 
                          formula = "50 - (age*Gender) + c0", variance = 2)
students <- genCluster(dtClass, cLevelVar = "idClass", numIndsVar = "nStudents", 
                        level1ID = "idChild")

students <- addColumns(gen.student, students)
setDF(students)
dim(students)
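
As a quick sanity check on the toy data, the class-level means of the outcomes should roughly track the class effect c0:

# quick check: class means of y1-y3 should move with c0
aggregate(cbind(y1, y2, y3, c0) ~ idClass, data = students, FUN = mean)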

Define the Stan multivariate model:

model <- '
data {
    int<lower=1> K;     // num of outcomes
    int<lower=1> J;     // num of explanatory variables
    int<lower=0> N;     // num of obs
    vector[J] x[N];     // x is array of size N containing vectors of J elements
    vector[K] y[N];     // y is array of size N containing vectors of K elements
}
parameters {
    matrix[K, J] beta;     // K x J matrix
    cov_matrix[K] Sigma;    // K x K covariance matrix of the outcomes
}

model {
    vector[K] mu[N];

    for (n in 1:N){
        mu[n] = beta * x[n];
    }
    y ~ multi_normal(mu, Sigma);
}'

Set up the data for Stan and run the model:

library(rstan)
stan_dat <- list(J = 3, K = 3,
                 N = dim(students)[1],
                 x = as.matrix(data.frame(x0 = 1,
                                          x1 = students$Gender,
                                          x2 = students$age)),
                 y = as.matrix(students[, c("y1", "y2", "y3")]))  # pass a numeric matrix, not a data.frame

stan1 <- stan(model_code = model, data = stan_dat, chains = 2, cores=2 , iter=1000,
              pars = c("beta"))

##### Examine results
library(reshape2)  # for melt()
library(ggplot2)

summary(stan1)

muc <- rstan::extract(stan1, par = "beta[2,2]", permuted = FALSE, inc_warmup = TRUE)
mdf <- melt(muc)
ggplot(mdf, aes(x = iterations, y = value, color = chains)) +
  geom_line() + ylab(mdf$parameters[1])
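
Side note: rstan’s built-in trace plot should give much the same picture:

# optional alternative to the manual melt/ggplot step above
rstan::traceplot(stan1, pars = "beta[2,2]", inc_warmup = TRUE)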

I have made an attempt or two to extend the model but got confused and abandoned ship. (I am used to using indices in JAGS; the matrix form of the equation in the Stan example adds an extra level of confusion for me!) I would appreciate any help folks might be able to offer.
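
If I understand it correctly, the matrix form is just shorthand for the element-wise loops I would write in JAGS, i.e. something like this sketch (the matrix version just being more efficient):

// sketch only: element-wise equivalent of  mu[n] = beta * x[n]
for (n in 1:N)
  for (k in 1:K) {
    mu[n][k] = 0;
    for (j in 1:J)
      mu[n][k] += beta[k, j] * x[n][j];
  }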

I seem to have made some progress in implementing partial pooling:

model <- '
data {
    int<lower=1> K;     // num of outcomes
    int<lower=1> J;     // num of explanatory variables
    int<lower=0> N;     // num of obs

    int nCL;            // num of groups
    int CL[N];          // group of each individual

    vector[J] x[N];     // x is array of size N containing vectors of J elements
    vector[K] y[N];     // y is array of size N containing vectors of K elements
}
parameters {
    matrix[K, J] beta[ nCL ];     // array of nCL matrices, one K x J matrix per group
    cov_matrix[K] Sigma;          // K x K covariance matrix of the outcomes
}

transformed parameters {
    vector[K] mu[N];

    for (n in 1:N){
        mu[n] = beta[ CL[n] ] * x[n];
    }
}

model {
    y ~ multi_normal(mu, Sigma);
}'

This model compiles and runs successfully, although it gives me more parameters than I necessarily want: I’m now getting “outcomes x explanatory_vars x groups” beta terms, when what I want is a beta for each outcome x explanatory variable, pooled across groups. I’m unsure whether I’ve gone wrong or just need to add more to the model.

You did not specify any (hierarchical) priors in your model, and thus your betas won’t be partially pooled. Such a prior is usually most efficient when using the so-called “non-centered” parameterization. You can read more about it in the Stan manual.
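
For example, keeping your data and transformed parameters blocks as they are, the parameters and model blocks could look something like the sketch below (mu_beta and tau_beta are just names I made up, and the prior scales are arbitrary):

parameters {
    matrix[K, J] beta[nCL];      // group-level slope matrices, as in your model
    matrix[K, J] mu_beta;        // population-level slopes
    real<lower=0> tau_beta;      // between-group sd of the slopes
    cov_matrix[K] Sigma;
}

model {
    to_vector(mu_beta) ~ normal(0, 5);
    tau_beta ~ cauchy(0, 2.5);
    for (l in 1:nCL)
        to_vector(beta[l]) ~ normal(to_vector(mu_beta), tau_beta);   // partial pooling towards mu_beta
    y ~ multi_normal(mu, Sigma);
}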

If you are calling Stan from R, you can also let the brms package generate the Stan code of such a model for you. You can read more about multivariate models in brms by running vignette("brms_multivariate").
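
For instance, something along these lines (untested, and the formula is only a rough guess at your toy data’s structure) lets you inspect the code brms would generate:

library(brms)
# three outcomes with residual correlations, fixed effects for Gender and age,
# and a varying intercept per class
bform <- bf(mvbind(y1, y2, y3) ~ Gender + age + (1 | idClass)) + set_rescor(TRUE)
make_stancode(bform, data = students)  # prints the generated Stan code
# fit <- brm(bform, data = students, chains = 2, cores = 2)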


Thanks Paul. Yes, I suspected I needed to specify priors as well. Thanks for the tip about brms: it turns out I had an older version without that vignette and so had missed it, but I’ve updated now!

So I finally got back to this. I think I’ve now specified a multivariate linear mixed model:

model <- '
data {
    int<lower=1> K;     // num of outcomes
    int<lower=1> J;     // num of explanatory variables
    int<lower=0> N;     // num of obs

    int nCL;            // num of groups
    int CL[N];          // group of each individual

    vector[J] x[N];     // x is array of size N containing vectors of J elements
    vector[K] y[N];     // y is array of size N containing vectors of K elements
}
parameters {
    matrix[K, J] beta;      // K x J matrix of slopes, shared across groups
    vector[K] alpha[nCL];   // varying intercepts, one vector of K per group
    cov_matrix[K] Sigma;    // K x K residual covariance of the outcomes
    real a0[nCL];           // means of the varying intercepts
    real b0;                // mean of the slopes
    real<lower=0> sigma_cl;     // sd of the varying intercepts
    real<lower=0> sigma_slope;  // sd of the slopes
}

transformed parameters {
    vector[K] mu[N];

    for (n in 1:N){
        mu[n] = alpha[ CL[n] ] + beta * x[n];
    }
}

model {
    // Priors
    for (l in 1:nCL)
        alpha[l] ~ normal( a0[l], sigma_cl );
    to_vector(beta) ~ normal(b0, sigma_slope);   // stated once, outside the group loop
    a0 ~ normal(0, 1);
    b0 ~ normal(0, 1);
    sigma_cl ~ cauchy(0, 10);
    sigma_slope ~ cauchy(0, 10);

    // Likelihood
    y ~ multi_normal(mu, Sigma);
}'

I remain confused about priors, however, both in terms of when to use a weak or a strong prior and also regarding the non-centred parameterisation. I think I understand the reasons why it is better, but as often with these things I’m having trouble applying it to my model. Any tips appreciated. Thanks all.
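
Edit: for what it’s worth, my current (untested) guess at a non-centred version of the varying intercepts is below; alpha_raw is just a name I made up, and everything else is meant to stay as above. Happy to be told whether this is on the right track.

model_nc <- '
data {
    int<lower=1> K;     // num of outcomes
    int<lower=1> J;     // num of explanatory variables
    int<lower=0> N;     // num of obs

    int nCL;            // num of groups
    int CL[N];          // group of each individual

    vector[J] x[N];     // x is array of size N containing vectors of J elements
    vector[K] y[N];     // y is array of size N containing vectors of K elements
}
parameters {
    matrix[K, J] beta;          // K x J matrix of slopes, shared across groups
    vector[K] alpha_raw[nCL];   // standard-normal building blocks for the intercepts
    cov_matrix[K] Sigma;
    real a0[nCL];
    real b0;
    real<lower=0> sigma_cl;
    real<lower=0> sigma_slope;
}

transformed parameters {
    vector[K] alpha[nCL];
    vector[K] mu[N];

    for (l in 1:nCL)
        alpha[l] = a0[l] + sigma_cl * alpha_raw[l];   // implies alpha[l] ~ normal(a0[l], sigma_cl)
    for (n in 1:N)
        mu[n] = alpha[ CL[n] ] + beta * x[n];
}

model {
    // Priors
    for (l in 1:nCL)
        alpha_raw[l] ~ normal(0, 1);    // non-centred: prior on the raw scale
    to_vector(beta) ~ normal(b0, sigma_slope);
    a0 ~ normal(0, 1);
    b0 ~ normal(0, 1);
    sigma_cl ~ cauchy(0, 10);
    sigma_slope ~ cauchy(0, 10);

    // Likelihood
    y ~ multi_normal(mu, Sigma);
}'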