Non-centered parameterization for multivariate model with heterogeneity

I’m writing my first models in Stan and today I started with a multivariate model with 2 outcomes. To capture heterogeneity across individuals (my obervations are grouped in multiple individuals), next to the intercept alpha0 for each equation, I want to add an individual-specific intercept alpha to capture heterogeneity across individuals.

Here is my model code:

data { 
  int <lower = 1> N;                    // Number of observations
  int<lower = 1> J;                      // number of outcomes 
  int<lower = 0> K;                     // number of covariates 
  vector[J] Y[N];                         // vector of J outcome variables
  vector[K] X[N];                        // vector of K covariates
  int <lower = 1> id[N];              // identifier of individuals
  int <lower = 1> H;                   // number of individuals
}

parameters { 
  vector[J] alpha0;                                   // intercept for each equation
  vector[J] alpha[H];                                // individual-specific intercept for each equation
  vector<lower = 0>[J] sigma_alpha; 
  
  cholesky_factor_corr[J] L_Omega;
  vector<lower=0>[J] L_sigma;
  
  matrix[J, K] beta;
} 

model {
  vector[J] mu[N];
  
  alpha[J] ~ normal(0, sigma_alpha);

  for (n in 1:N) 
    mu[n] = alpha0 + alpha[id[n]] + beta * X[n]; 

  to_vector(beta) ~ normal(0, 2);
  L_Omega ~ lkj_corr_cholesky(1); 
  L_sigma ~ student_t(3, 0, 2);
  
  Y ~ multi_normal_cholesky(mu, diag_pre_multiply(L_sigma, L_Omega));
}

Now I realized my data requires me to apply the non-cenetred parameterization. I’m struggling with the parameters now because I’m not confident which paramaters need to be individual-specific then.

Here is my code for a multivariate model capturing heterogeneity across individuals with a non-centered parameterization for the individual-specific intercept alpha.

data { 
  int <lower = 1> N;                     // Number of observations
  int<lower = 0> J;                      // number of outcomes 
  int<lower = 0> K;                      // number of covariates 
  vector[J] Y[N];                        // vector of J outcome variables
  vector[K] X[N];                        // vector of K covariates
  int <lower = 1> id[N];                 // identifier of individuals
  int <lower = 1> H;                     // number of individuals
}

parameters { 
  real alpha0;
  vector[H] alphai_raw;
  real<lower = 0> sigma_alphai;
  
  cholesky_factor_corr[J] L_Omega;
  vector<lower=0>[J] L_sigma;
  
  matrix[J, K] beta;
} 

transformed parameters {
vector[H] alphai;
alphai = alpha0 + sigma_alphai*alphai_raw;
}

model {
  vector[J] mu[N];
  
  alphai_raw ~ cauchy(0,.2);
  alpha0 ~ cauchy(0, 1);

  for (n in 1:N) 
    mu[n] = alphai[id[n]] + beta * X[n]; 

  to_vector(beta) ~ normal(0, 2);
  L_Omega ~ lkj_corr_cholesky(1); 
  L_sigma ~ student_t(3, 0, 2);
  
  Y ~ multi_normal_cholesky(mu, diag_pre_multiply(L_sigma, L_Omega));
}
  • I’m wondering if it’s right that I can get rid of the non-individual-specific intercept alpha0 when defining mu[n] in the model section?
  • The way I have coded it I get one intercept for each individual for the two equations together, but how can I get one intercept for each individual for each equation?

Moin Jochen!

First thing I noticed is alphai_raw ~ cauchy(0,.2);, which should be alphai_raw ~ std_normal();. Non-centered normals are of this form:

y \sim N(\mu,\sigma) \hspace{3mm} \leftrightarrow \hspace{3mm} y = \mu + \sigma\tilde{y}, \hspace{1mm} \text{with} \hspace{2mm} \tilde{y} \sim N(0,1)

…so the _raw or _tilde always have std_normal() priors.

Well, you are not getting rid of it. It is part of (each) alphai. So yeah, what you did is correct.

You need to model each alphai as a two dimensional (or more general J dimensional) vector (this is what you did in the first model) and apply a 2 (or J) dimensional normal prior for it (non-centered).

So the second model should look somthing like this:

parameters { 
  vector[J] alpha0;
  vector[J] alphai_raw[H];
  cholesky_factor_corr[J] L_Omega_alphai;
  vector<lower=0>[J] sigma_alpha;
  
  cholesky_factor_corr[J] L_Omega;
  vector<lower=0>[J] L_sigma; // I would call this just "sigma" the L_ usually signifies the cholesky factor, but these are normal standard deviations
  
  matrix[J, K] beta;
} 

transformed parameters {
  vector[J] alphai[H];
  for (h in 1:H){
    alphai[h] = alpha0 + diag_pre_multiply(sigma_alpha, L_Omega_alphai) * alphai_raw[h]
  }
}

model {
  vector[J] mu[N];
  
  alphai_raw ~ std_normal();
  alpha0 ~ cauchy(0, 1);

  for (n in 1:N) 
    mu[n] = alphai[id[n]] + beta * X[n]; 

  to_vector(beta) ~ normal(0, 2);
  L_Omega ~ lkj_corr_cholesky(1); 
  L_sigma ~ student_t(3, 0, 2);
  
  Y ~ multi_normal_cholesky(mu, diag_pre_multiply(L_sigma, L_Omega));
}

I haven’t tested this code, but it should roughly look like this…

Cheers,
Max

1 Like

Thanks, Max!

I have tried it and R tells me there are “No matches for: vector[ ] ~ std_normal()”. This refers to the line where we define alphai_raw ~ std_normal();.

How can I fix this?

Whoops.

In the model block it should be:

for (h in 1:H) alphai_raw[h] ~ std_normal();

Sorry!
Max

1 Like

Yes, that makes sense. I will report back if all went well once the estimation has finished.

1 Like