Hierarchical Multivariate Normal model

Hello,
I am trying to model some data that has a hierarchical structure. I have data for 2 random variables that is indexed by 43 “country level” groups. I want to fit a multivariate normal model and in particular want to estimate hierarchical structures on my mean vectors and correlation and covariance matrices. Meaning I want 43 mu, Omega and Sigma, one for each group/country as well as an overall mu, Omega and Sigma.

My data is here (mvn_sample_data.csv (78.3 KB)). My Stan code is as follows:

data{
  int N; // number of observations
  int n_country; // number of countries
  int n_random_vars; // number of random variables being included, 2 here
  int country_id[N]; // country id vector
  vector[n_random_vars] joint_data[N]; //data
}
parameters{
  vector[n_random_vars] mu[n_country];
  corr_matrix[n_random_vars] Omega [n_country]; 
  vector<lower = 0>[n_random_vars] sigma [n_country]; 
}
transformed parameters {
  cov_matrix[n_random_vars] Sigma[n_country]; 
  for(i in 1:N){
    Sigma[country_id[i]] = quad_form_diag(Omega[country_id[i]], sigma[country_id[i]]);
  }
}
model{
  for(i in 1:N){
    joint_data[i] ~ multi_normal(mu[country_id[i]],Sigma[country_id[i]]);
    mu[country_id[i]] ~ normal(0,1);
    sigma[country_id[i]] ~ cauchy(0, 5);
    Omega[country_id[i]] ~ lkj_corr(1);
  }
}

The problems that I am facing are as follows:

  1. My current parametrization is slow to run, how can I take the mu, sigma and Omega out of the for loop as this might speed things up?
  2. I don’t know how to move past these numeric priors to prior distributions that would yield an overall mu, Omega and Sigma, potentially something like this
model{
  for(i in 1:N){
    joint_data[i] ~ multi_normal(mu[country_id[i]],Sigma[country_id[i]]);
  }
  mu ~ multi_normal(mu_prior, tau_mu_prior); //where mu_prior is a vector of zeros and ta_mu_prior is a matrix of 1's
  sigma ~ multi_normal(sigma_prior, tau_sigma_prior); //where sigma_prior is a vector of zeros and ta_sigma_prior is a matrix of 1's
Omega ~ lkj_corr(Omega_prior);
}

The above block is just an attempt and does not work. I declare mu_prior, tau_mu_prior, sigma_prior, and tau_sigma_prior in the data block. I additionally have not been able to find ways to impose a hierarchical structure on Omega_prior as it has to be an integer greater than or equal to 1.

My R code to run the model is below:

# read in csv
year_1_df <- read.csv("mvn_sample_data.csv", header = T)

# prepping data for Stan
joint_data <- as.matrix(cbind(year_1_df$lbl,
                              year_1_df$fd))
N <- dim(joint_data)[1]
n_country <- length(unique(year_1_df$country_id))
n_random_vars <- dim(joint_data)[2]
country_id <- year_1_df$country_id

#fitting the model
fit <- stan(*model name here*, 
            data = list(joint_data, N, country_id, n_random_vars), 
            iter = 1000, 
            chains = 3)

My questions are as follows:

  1. How do I enable prior distributions on mu, Omega and Sigma in order to impose a hierarchical structure on them (i.e I want 43 mu, Omega and Sigma as well as overall mu, Omega and Sigma)?
  2. In particular how do I enable priors for Omega?
  3. Any tips to speed up my code would be useful too.
    Any help would be much appreciated.

Somebody else may be able to give you more complete and explicit suggestions, but as a start, I find it very helpful in both speeding up model convergence as well as in figuring out how I want to do hierarchical prior specification to use the transformed parameters block to create vectors and then use sampling statements on the whole vector rather than inside of a for loop.

So just using your observation-level mean as an example:


transformed parameters {
real mu_i[N];
for (i in 1:N) {
  mu_i[i] = mu[country_id[i]]
}

Then your model statement for mu can just be:

model {
mu_i ~ normal(0, 1);
}

Then it might be easier to see how to break apart the country-level means and specify priors on those, etc.

In case you haven’t already seen it, there is pretty good documentation about vectorization in the Stan User’s Guide in both the Regression section under section 1.1.1 Matrix Notation and Vectorization as well as more detailed information in the vectorization section of the efficiency tuning chapter.

This is a great start, thank you!

1 Like

I have changed the model around a little bit to enable the hierarchical parametrization. My new Stan model is as follows:

data{
  int N; // number of observations
  int n_country; // number of countries
  int n_random_vars; // number of random variables being included, 2 here
  int country_id[N]; // country id vector
  vector[n_random_vars] joint_data[N]; //data
  vector[n_random_vars] mu_prior; // vector created in R and parsed as data
  matrix[n_random_vars,n_random_vars] tau_mu_prior; // matrix created in R and parsed as data
  vector[n_random_vars] sigma_prior; // vector created in R and parsed as data
  matrix[n_random_vars,n_random_vars] tau_sigma_prior; // matrix created in R and parsed as data
  int Omega_prior[n_country]; // vector created in R and parsed as data
}

parameters{
  vector[n_random_vars] mu[n_country];
  corr_matrix[n_random_vars] Omega [n_country]; 
  vector<lower = 0>[n_random_vars] sigma [n_country]; 
}
transformed parameters {
  cov_matrix[n_random_vars] Sigma[n_country]; 
  for(i in 1:N){
    Sigma[country_id[i]] = quad_form_diag(Omega[country_id[i]], sigma[country_id[i]]);
  }
}
model{
  for(i in 1:N){
    target += multi_normal_lpdf(joint_data | mu[country_id[i]], Sigma[country_id[i]]);
  }
  mu ~ multi_normal(mu_prior,tau_mu_prior);
  sigma ~ multi_normal(sigma_prior,tau_sigma_prior);
  for(j in 1:43){
    Omega[j] ~ lkj_corr(Omega_prior[j]);
  }
}

when I run the “stanc” command I get no errors. However the HMC is unable to sample owing to the following error message.

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: multi_normal_lpdf: LDLT_Factor of covariance parameter is not positive definite.  last conditional variance is 0.  (in 'model25b5ba204ee_multi_mean_sigma_priors' at line 37)

Chain 1: 
Chain 1: Initialization between (-2, 2) failed after 100 attempts. 
Chain 1:  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."
error occurred during calling the sampler; sampling not done

I am not sure how to ensure that the matrices tau_mu_prior and tau_sigma_prior are positive semidefinite, I tried using <lower = 0.1> as a constraint, as well as the cov_matrix definition. None of these fix the error.

My R code is here

mu_prior <- sigma_prior <- c(0,0)
tau_mu_prior <- tau_sigma_prior <- matrix(c(1,1,1,1), nrow = 2, ncol = 2)
Omega_prior <- rep(2, times = 43)
#-----------------#
fit <- stan(*Stan model here*, 
              data = list(N,
                          n_country,
                          n_random_vars,
                          country_id,
                          joint_data,
                          mu_prior,
                          tau_mu_prior,
                          sigma_prior,
                          tau_sigma_prior,
                          Omega_prior),
              iter = 1000,
              chains = 3)

Any help with understanding this problem/error would be appreciated.

I am having the same challenge with optimising hierachical multivariate normal code.

It seems that the optimisation steps from the stan documentation will be helpful in the model you’re after (with some minor adjustments to the data input), see:

https://mc-stan.org/docs/2_19/stan-users-guide/multivariate-hierarchical-priors-section.html

Though, the issue with this page of documentation is the final model block does not actually compile for me:

Expression is ill formed.
 error in 'MVN_hierarchical.stan' at line 22, column 58
  -------------------------------------------------
    20:   vector<lower=0>[K] tau;     // prior scale
    21:   for (k in 1:K) tau[k] = 2.5 * tan(tau_unif[k]);
    22:   beta = u * gamma + (diag_pre_multiply(tau,L_Omega) * z)';
                                                                                                          ^
    23: }
  -------------------------------------------------

EDIT: The model from the docs does compile, I was just confused from the documentation linked above – I had to re-define u as a matrix. The docs note there are no changes to the data, which I assumed was the data from the original model declaration, it meant no changes since the Cholesky factorisation section.

Thanks James, I will look into this!

@advaitr8 If you haven’t tried already, you may find it helpful to reparameterize the covariance matrices using the Cholesky decomposition.

There are background mathematical details here and an applied example here in the section called “Optimization through Cholesky Factorization”

Also I am not sure but think that this line will give you trouble because you haven’t specified the index on joint_data:

  for(i in 1:N){
    target += multi_normal_lpdf(joint_data | mu[country_id[i]], Sigma[country_id[i]]);
  }

Again, you could probably save some time by vectorizing that chunk and not incrementing inside the for loop.