# Modeling correlation between parameters

I’m new to stan and have a question that is probably very simple but I haven’t been able to work out on my own.

I have data with repeated observations of units across multiple measures and want to measure unit-specific means for each of those measures. Some of these variables are correlated and data on each comes at irregular intervals. When I get new information from one variable I want to use it to update the entire matrix of unit-specific means.

This first set of code simulates two correlated variables that are similar to what I’m working with.

``````library(MASS)
library(rstan)

a <- -.1
b <- .05
sigma_mu_1 <- .2
sigma_1 <- 1.5
sigma_mu_2 <- .1
sigma_2 <- .75
rho = .5

##Make correlation matrix
mu <- c(a,b)
sigmas <- c(sigma_mu_1, sigma_mu_2)
Rho <- matrix(c(1, rho, rho, 1), nrow = 2)
Sigma <- diag(sigmas) %*% Rho %*% diag(sigmas)

n_units <- 10

set.seed(1000)

##sample unit effects
mu_matrix <- mvrnorm(n_units, mu, Sigma)

mu_1 <- mu_matrix[,1]
mu_2 <- mu_matrix[,2]

n_obs <- 100

##Generate data
team_id <- rep(1:n_units, each = n_obs)
mean_1 <- mu_1[team_id]
mean_2 <- mu_2[team_id]
y_1 <- rnorm(n_units * n_obs, mean_1, sigma_1)
y_2 <- rnorm(n_units * n_obs, mean_2, sigma_2)

dat <- list(
N_1 = n_obs * n_units,
N_2 = n_obs * n_units,
I = n_units,
ii = team_id,
jj = team_id,
y_1 = y_1,
y_2 = y_2
)

``````

This stan code estimates the effects separately and everything samples fine but all of my attempts to build in correlation between mu_1 and mu_2 have gone poorly. What is the appropriate way to model this correlation?

``````data{
int<lower = 0> N_1;
int<lower = 0> N_2;
int<lower = 1> I;
int<lower = 0, upper = I> ii[N_1];
int<lower = 0, upper = I> jj[N_2];
real y_1[N_1];
real y_2[N_2];
}
parameters{
vector[I] mu_1_raw;
vector[I] mu_2_raw;
real alpha_1;
real alpha_2;
real<lower = 0> sigma_1;
real<lower = 0> sigma_2;
real<lower = 0> sigma_mu_1;
real<lower = 0> sigma_mu_2;
}
transformed parameters{
vector[I] mu_1 = mu_1_raw * sigma_mu_1;
vector[I] mu_2 = mu_2_raw * sigma_mu_2;
}
model{
alpha_1 ~ normal(0, .1);
alpha_2 ~ normal(0, .1);
sigma_1 ~ normal(0,2);
sigma_2 ~ normal(0,2);
sigma_mu_1 ~ std_normal();
sigma_mu_2 ~ std_normal();
mu_1_raw ~ std_normal();
mu_2_raw ~ std_normal();

y_1 ~ normal(alpha_1 + mu_1[ii], sigma_1);
y_2 ~ normal(alpha_2 + mu_2[jj], sigma_2);
}
``````

You want to use `multi_normal` (or `multi_normal_cholesky`), which in turn necessitates a bit of reshaping of your parameters. You have your model set up with non-centered parameterization, but it’s easier to follow if I first show the centered version, with variables named a bit more informatively (if less succinctly):

``````parameters{

vector mean_of_unit_means;
vector<lower=0> sd_of_unit_means ;
corr_matrix cor_mat ;

matrix[I,2] unit_means ;

vector<lower=0> measurement_noise ;

}
transformed parameters{
//combine the sds & correlation matrix to a covariance matrix
matrix[2,2] cov_mat = quad_form_diag( cor_mat, sd_of_unit_means) ;
}
model{
mean_of_unit_means ~ normal(0,.1) ;
measurement_noise  ~ normal(0,2);
sd_of_unit_means ~ std_normal();
cor_mat ~ lkj_corr(3) ; // note: modify this to match your prior expectations for the correlation

for(i in 1:I){
unit_means[i,] ~ multi_normal(
mean_of_unit_means
, cov_mat
) ;
}

y_1 ~ normal(unit_means[ii,1], measurement_noise);
y_2 ~ normal(unit_means[jj,2], measurement_noise);
}
``````

Now it turns out to be possible to avoid the loop in the model block by simply changing the shape of the `unit_means` parameter, in which case it also makes sense to compute the covariance matrix in-place so it’s not saved:

``````parameters{

vector mean_of_unit_means;
vector<lower=0> sd_of_unit_means ;
corr_matrix cor_mat ;

vector unit_means[I] ;

vector<lower=0> measurement_noise ;

}
model{
mean_of_unit_means ~ normal(0,.1) ;
measurement_noise  ~ normal(0,2);
sd_of_unit_means ~ std_normal();
cor_mat ~ lkj_corr(3) ; // note: modify this to match your prior expectations for the correlation

unit_means ~ multi_normal(
mean_of_unit_means
) ;

y_1 ~ normal(unit_means[ii], measurement_noise);
y_2 ~ normal(unit_means[jj], measurement_noise);
}
``````

Now, for the non-centered version, the manual suggests that using the Cholesky factor representation of the correlation matrix for better sampling performance. I think it should be possible to similarly use the Cholesky factor representation in the centered model above, so if you get divergences with the non-centered but not the centered version and you find the centered version slow, try adding the Cholesky stuff below to the model above. Report back if you try this, as I’d be curious to hear how it goes.

Anyway, here’s the non-centered with Cholesky representation:

``````parameters{

row_vector mean_of_unit_means;
vector<lower=0> sd_of_unit_means ;
cholesky_factor_corr chol_cor_mat ;

matrix[I,2] unit_means_z ;

vector<lower=0> measurement_noise ;

}
transformed parameters{
matrix[I,2] unit_means = ( //using some idiosyncratic whitespace here for (imo) better clarity
rep_matrix(mean_of_unit_means,I)
+ (
unit_means_z
*diag_post_multiply(chol_cor_mat,sd_of_unit_means)
)
);
}
model{
mean_of_unit_means ~ normal(0,.1) ;
measurement_noise  ~ normal(0,2);
sd_of_unit_means ~ std_normal();
chol_cor_mat ~ lkj_corr_cholesky(3) ; // note: modify this to match your prior expectations for the correlation

to_vector(unit_means_z) ~ std_normal() ;

y_1 ~ normal(unit_means[ii,1], measurement_noise);
y_2 ~ normal(unit_means[jj,2], measurement_noise);
}
``````

Thank you for your help! I will let you know how things go.