Centered vs NCP Question -> Mixed parameterisation & Correlations

Background: I’m trying to figure out how to do a mixed parameterisation (as per @betanalpha 's excellent course & case studies) but where effects are correlated across responses.

Simpler problem: I’m wondering what actually defines the centered vs non-centered parameterisations - is it the existence of the parameter, or the positioning of the prior?

e.g.

theta = mu + tau * theta_tilde;
mu ~ normal(0, 5);   
tau ~ normal(0, 5); 
theta_tilde ~ normal(0, 1); 

vs

theta = mu + tau * theta_tilde;
mu ~ normal(0, 5);
tau ~ normal(0, 5); 
theta ~ normal(mu, tau); 

Is #2 enough to define it as centered, or is the existence of an unconstrained theta_tilde enough to define it as non-centered?

If #2 is centered, I could correlate levels across responses with a mixed parameterisation, something like this (not tested, so please excuse what may amount to a whole lot of rubbish)!

data {
   ...
   int<lower=1> M; // Number of responses;
   int<lower=1> K; // Number of levels
   int<lower=0, upper=K> K_cp; // Number of levels for centered parameterisation
   int<lower=1, upper=K> cp_idx[K_cp];  Index for cp levels
   int<lower=0, upper=K> K_ncp;  // Number of levels for non-centered parameterisation
   int<lower=1, upper=K> ncp_idx[K_ncp]; // Index for ncp levels
}
parameters {
   ...
   cholesky_factor_corr[M] L_u; // Correlation matrix amongst responses
   matrix[M, K] z_u; // Matrix of levels, standardised for the ncp & non-standardised for the cp levels.
   vector<lower=0>[M] tau;
}
transformed parameters {
  ...
  matrix[M, K] theta; 
  theta[1:M, ncp_idx] = diag_pre_multiply(tau, L_u) * z_u[1:M, ncp_idx];
  theta[1:M, cp_idx] = L_u * z_u[1:M, cp_idx];
}

model {
   // Let's assume mu = 0;
   tau ~ normal(0, 5);
   L_u ~ lkj_corr_cholesky(1.0);
   to_vector(z_u[1:M, ncp_idx]) ~ std_normal();
   to_vector(theta[1:M, cp_idx]) ~ normal(0, tau);
   ...
}

Thoughts/comments welcome, but please be gentle!

1 Like

OK, so in some tests on non-correlated heirarchies, this performs like a true mixed parameterisation, so I think I have my answer - it’s the location of the prior, not the existence of the intermediate variable, & I should be able to implement this in a correlated context. Fingers crossed!

data {
   int<lower=1> N; // Number of observations
   vector[N] y; // Observations
   int<lower=1> K; // Number of levels
   int<lower=0, upper=K> indiv_idx[N]; // Context
   int<lower=0, upper=K> K_cp; // Number of levels for centered parameterisation
   int<lower=1, upper=K> cp_idx[K_cp];  Index for cp levels
   int<lower=0, upper=K> K_ncp;  // Number of levels for non-centered parameterisation
   int<lower=1, upper=K> ncp_idx[K_ncp]; // Index for ncp levels
}

parameters {
  real mu;
  real<lower=0> tau;
  real<lower=0> sigma;
  vector[K] theta_tilde;
}
transformed parameters {
  vector[K] theta;
  theta[ncp_idx] = mu + tau * theta_tilde[ncp_idx];
  theta[cp_idx] = theta_tilde[cp_idx];
}

model {
  mu ~ normal(0, 5);
  tau ~ normal(0, 5);
  sigma ~ normal(0, 1);
  target += normal_lpdf(theta_tilde[ncp_idx] | 0, 1);
  target += normal_lpdf(theta_tilde[cp_idx] | mu, tau);
  y ~ normal(theta[indiv_idx], sigma);
}

Just to followup – what defines the centered and non-centered parameterizations is the variable in the parameters block.

In the nominal centered parameterization we work directly with the latent variable that captures an individual behavior drawn from the assumed hierarchical population model.

parameters {
  real theta;
  real mu;
  real<lower=0> tau;
  ...
}

model {
  theta ~ normal(mu, tau);
  ...
  // `theta` used elsewhere in the model
}

In the non-centered parameterization we work instead with a standardized residual – a variable that captures how much an individual behavior deviates from the center of the population model relative to the population scale. The absolute individual behavior can then be reconstructed from this residual and the population configuration.

parameters {
  real theta_tilde;
  real mu;
  real<lower=0> tau;
  ...
}

transformed parameters {
   // Individual behavior is deterministically reconstructed from standardized residual
   real theta = mu + tau * theta_tilde;
}

model {
  // residual model is standard normal independent of configuration of population model
  theta_tilde ~ normal(0, 1);  
  ...
  // `theta` reconstructed above used elsewhere in the model
}

Although these two parameterizations define the same model they emphasize different aspects, which can lead to different posterior geometries, and hence computational performance, in different circumstances. The centered parameterization does better when the likelihood function most strongly constrains theta directly, so that theta is only weakly coupled to mu and tau, while the non-centered parameterization does better when the population model most strongly constrains theta, so that theta is strongly coupled to mu and tau.

When we have a population of multiple individuals they can be parameterized independently of each other, depending on the behavior of the individual component likelihood functions that inform each fo those individual behaviors.

When the individual behavior is captured by multiple variables modeled with a multivariate normal population model the situation is similar conceptually, but the organization becomes a bit more complicated. Your implementation of a mixed parameterization for a multivariate normal hierarchical model does look correct upon visual inspection, although I haven’t tested it!

1 Like