Divergence after non-centered parameterization

Hello Stan folks,

I’m trying to run the following hierarchical model with the attached stan model. However, I still encounter divergence problem after non-centered parameterization (I hope I did the non-centered parameterization right).
Would you please take a look on my code and possibly help me on this? Many thanks.

The model in my mind…
\beta_{zpSR} \sim b_0 + b_1\times Omni + b_2\times Omni^2
\mu_{phyDen} \sim X\times\beta + X_{zpSR} \times \beta_{zpSR}
phyDen \sim N(\mu_{phyDen}, \sigma_{phyDen})

The stan model I wrote…

data {
  int<lower=1> N; // the number of observations
  int<lower=1> K1; // number of column of the model matrix (5)
  matrix[N, K1] X; // N (~1200 obs; ) by K1 (5 predictors) model matrix 
  vector[N] phyDen; // response variable, Y
  vector[N] zpSR; // 
  int<lower=1> K2; // number of column of the model matrix determining the effects of zpSR (3)
  matrix[N, K2] X2; // the model matrix determining the effects of zpSR
parameters {
  vector[K1] gamma;
  vector<lower=0>[K1] tau;
  vector[K1] beta_raw[N];
  vector[K2] gamma_beta_zpSR;
  vector<lower=0>[K2] tau_beta_zpSR;
  vector[K2] b_raw[N];
  real<lower=0> sigma_phyDen; //standard deviation of the individual observations
transformed parameters{
  vector[K1] beta[N]; 
  vector[K2] b[N];  
  vector[N] mu_phyD;
  vector[N] beta_zpSR;
  for(i in 1:N){
    beta[i] = gamma + tau .* beta_raw[i];
    b[i] = gamma_beta_zpSR + tau_beta_zpSR .* b_raw[i];
  for (i in 1:N){
    beta_zpSR[i] = X2[i]*b[i];
    mu_phyD[i] = X[i]*beta[i] + zpSR[i]*beta_zpSR[i];
model {
  gamma ~ normal(0,5); //weakly informative priors on the regression coefficients
  tau ~ cauchy(0,2.5); //weakly informative priors
  gamma_beta_zpSR ~ normal(0,5); //weakly informative priors on the regression coefficients
  tau_beta_zpSR ~ cauchy(0,2.5); //weakly informative priors
  sigma_phyDen ~ gamma(2,0.1); //weakly informative priors
  for (i in 1:N){
    beta_raw[i] ~ normal(0, 1); 
    b_raw[i] ~ normal(0, 1);
  phyDen ~ normal(mu_phyD, sigma_phyDen);
generated quantities {
  vector[N] log_lik;
  vector[N] y_pred;
  for (n in 1:N){
    log_lik[n] = normal_lpdf(phyDen[n] | mu_phyD[n], sigma_phyDen);
  for (n in 1:N){
    y_pred[n] = normal_rng(mu_phyD[n], sigma_phyDen);    

The data is here if necessary…

For non centering you should have the same number of parameters as your original model, but the way you scale them using the standard deviation is the right idea

1 Like


Given my understanding of hierarchical models, I think it would be more interesting to estimate \beta_{spSR} rather than computing it directly like you did. I would reformulate your model as follow:

\mu_{\beta_{zpSR}} = b_0 + b_1 \times Omni + b_2 \times Omni2\\ \beta_{zpSR} \sim N(\mu_{\beta_{zpSR}}, \sigma_{\beta_{zpSR}})\\ \mu_{phyDen} = X \times \beta + X_{zpSR} \times \beta_{zpSR}\\ phyDen∼N(\mu_{phyDen}, \sigma_{phyDen})\\

EDIT : Typo

1 Like