# Poisson Hierarchical Model Non-centered Parametrization

Hello everyone,

I am trying to fit the model in the picture on simulated data. I generated the data from the same model, except that the covariance matrix was an identity matrix and the beta bars were standard normal random variables. I have tried both the centered and non-centered parameterizations and encountered the following problems.

Firstly, there were some divergent samples in the centered model. Before switching to the non-centered version, I tried to handle the problem by increasing the parameter of the LKJ distribution and decreasing the step size. It worked, and the posterior beta distributions were almost centered around the generated beta values. But it was not the same for the beta bars: their posterior distributions were far away from the generated beta bars. I wonder how I can handle this problem.

Secondly, I tried the non-centered parameterization. It worked, but the run time was ten times higher than the centered parameterization for the same number of iterations. Is that normal? I thought the run time would decrease.
Also, the chains did not mix, and almost all iterations exceeded max_treedepth, even for max_treedepth = 13.

Below are the data simulation code and the centered and non-centered models, in that order. I would be very happy if anyone could help me.

### sim

``````stan
data {
  /* Dimensions */
  int<lower=0> M; // number of variables
  int<lower=0> N; // number of observations
  /* Design matrix */
  matrix[N, M] X;
  /* Offset (renamed: `offset` itself is a reserved word in recent Stan) */
  vector<lower=0>[N] offsets;
  /* Hyperparameters */
  vector<lower=0>[M] sigma; // diagonal of the covariance matrix; set to a vector of ones
}
generated quantities {
  vector[M] betabar; // mean vector of the multivariate normal distribution
  vector[M] beta; // regression coefficients

  real<lower=0> a; // prior for relative risk
  vector<lower=0>[N] wi; // relative risk

  vector[N] xi; // mean of the Poisson distribution

  array[N] int<lower=0> y_rep; // response variable

  for (i in 1 : M) {
    betabar[i] = normal_rng(0, 1);
  }
  beta = multi_normal_rng(betabar, diag_matrix(sigma));

  a = gamma_rng(1, 0.01);
  for (j in 1 : N) {
    wi[j] = gamma_rng(a, a);
  }

  xi = exp(X * beta + log(offsets) + log(wi));

  for (z in 1 : N) {
    y_rep[z] = poisson_rng(xi[z]);
  }
}
``````
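For a quick sanity check outside Stan, the same generative process can be sketched in NumPy. The dimensions `N` and `M` and the seed below are made up for illustration; note that NumPy's `gamma` takes a *scale* parameter, while Stan's `gamma_rng` takes a *rate*, so the scale is the reciprocal of the rate.

```python
import numpy as np

rng = np.random.default_rng(42)
N, M = 100, 3                                # made-up dimensions
X = rng.standard_normal((N, M))              # stand-in design matrix
offsets = np.ones(N)                         # exposure / offset term
sigma = np.ones(M)                           # identity covariance, as in the post

betabar = rng.standard_normal(M)             # standard normal hyper-means
beta = rng.multivariate_normal(betabar, np.diag(sigma))

# Stan's gamma_rng(1, 0.01) has rate 0.01, i.e. NumPy scale 1/0.01
a = rng.gamma(shape=1.0, scale=1.0 / 0.01)
# wi ~ Gamma(a, a) has mean 1, so it acts as a multiplicative relative risk
wi = rng.gamma(shape=a, scale=1.0 / a, size=N)

xi = np.exp(X @ beta + np.log(offsets) + np.log(wi))
y_rep = rng.poisson(xi)
```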

### NB2 centered

``````stan
data {
  /* Dimensions */
  int<lower=0> M; // number of variables
  int<lower=0> N; // number of observations
  /* Design matrix */
  matrix[N, M] X;
  /* Offset (renamed: `offset` itself is a reserved word in recent Stan) */
  vector<lower=0>[N] offsets;
  /* Outcome */
  array[N] int<lower=0> y;
  /* Hyperparameters */
  vector<lower=0>[M] sigma; // diagonal values of the covariance matrix; set to a vector of 10s
}
parameters {
  vector[M] beta; // coefficients in the linear predictor
  real<lower=0> a; // prior for relative risk
  vector<lower=0>[N] wi; // relative risk
  vector[M] betabar; // mean vector of the multivariate normal distribution
  cholesky_factor_corr[M] Lcorr; // Cholesky factor (L_u matrix for R)
}
transformed parameters {
  vector[N] xi; // mean vector of the Poisson distribution
  corr_matrix[M] R; // correlation matrix
  cov_matrix[M] Sigma; // covariance matrix

  R = multiply_lower_tri_self_transpose(Lcorr);
  Sigma = quad_form_diag(R, sigma); // was never assigned in the original
  // beta, not betabar: with betabar in the likelihood, beta is informed only by its prior
  xi = exp(X * beta + log(offsets) + log(wi));
}
model {
  target += normal_lpdf(betabar | 0, 1);
  target += lkj_corr_cholesky_lpdf(Lcorr | 4);
  target += multi_normal_lpdf(beta | betabar, Sigma);
  target += gamma_lpdf(a | 1, 0.01);

  target += gamma_lpdf(wi | a, a);
  target += poisson_lpmf(y | xi);
}
``````

### Non-centered

``````stan
data {
  /* Dimensions */
  int<lower=0> M; // number of variables
  int<lower=0> N; // number of observations
  /* Design matrix */
  matrix[N, M] X;
  /* Offset (renamed: `offset` itself is a reserved word in recent Stan) */
  vector<lower=0>[N] offsets;
  /* Outcome */
  array[N] int<lower=0> y;
  /* Hyperparameters */
  vector<lower=0>[M] sigma; // diagonal values of the covariance matrix; set to a vector of 10s
}
parameters {
  vector[M] delta; // std normal vector, scaled by the Cholesky factor of the covariance
  vector[M] delta2; // std normal vector playing the role of betabar
  real<lower=0> a;

  vector<lower=0>[N] wi; // relative risk
  cholesky_factor_corr[M] Lcorr; // Cholesky factor (L_u matrix for R)
}
transformed parameters {
  corr_matrix[M] R; // correlation matrix
  vector[M] beta; // regression coefficients
  vector[N] xi; // mean of the Poisson

  R = multiply_lower_tri_self_transpose(Lcorr);

  // betabar ~ normal(0, 1), so betabar = delta2 with no shift or scale; the
  // multivariate part scales delta by the Cholesky factor of the covariance,
  // diag_pre_multiply(sigma, Lcorr), NOT by the full covariance
  // quad_form_diag(R, sigma) as in the original
  beta = delta2 + diag_pre_multiply(sigma, Lcorr) * delta;
  xi = exp(X * beta + log(offsets) + log(wi));
}
model {
  target += normal_lpdf(delta2 | 0, 1);
  target += lkj_corr_cholesky_lpdf(Lcorr | 2);
  target += normal_lpdf(delta | 0, 1);
  target += gamma_lpdf(a | 1, 0.01);

  target += gamma_lpdf(wi | a, a);
  target += poisson_lpmf(y | xi);
}
``````
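One detail worth stressing: a non-centered parameterization must scale the standard normal vector by the *Cholesky factor* of the covariance, `diag_pre_multiply(sigma, Lcorr)`, not by the full covariance `quad_form_diag(R, sigma)`. A small NumPy check (with a made-up 3×3 correlation matrix standing in for `Lcorr`'s product and zero means) confirms that the Cholesky scaling reproduces the intended covariance:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = np.full(3, 10.0)                # prior scales, as in the data block
R = np.array([[1.0, 0.3, 0.1],          # made-up correlation matrix
              [0.3, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
L = np.linalg.cholesky(R)               # plays the role of Lcorr

# non-centered draw: beta = betabar + diag(sigma) @ L @ z, with z ~ N(0, I);
# betabar = 0 here since only the covariance is being checked
z = rng.standard_normal((3, 200_000))
beta = (np.diag(sigma) @ L) @ z

# the implied covariance is diag(sigma) @ R @ diag(sigma),
# i.e. Stan's quad_form_diag(R, sigma)
Sigma = np.diag(sigma) @ R @ np.diag(sigma)
```

Scaling by the full covariance instead would produce draws with covariance `Sigma @ Sigma.T`, which is one way to get the mismatched posteriors and treedepth problems described above.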

Hey there!

I took the liberty of formatting your code a bit. I haven’t had time to go into much detail (yet), but the thing that stood out is that you seem to want to fit a multilevel model, but you don’t have groups. Is that correct? (Are you trying to fit a varying effect for each individual?)

@Max_Mantei Hey, thanks for the answer. I also posted this in the modeling section, and somebody helped me understand what I was really doing. I am very new to hierarchical modeling, so I did not know how to specify a hierarchical model; it turned out that what I was doing was basically fitting a regression model. I changed my code and built a multilevel model with one group-level and one observation-level variable. I have to extend it to a non-nested hierarchical model, since I have 3 different group levels and 1 observation level. Can you suggest any case study like that? Also, say I want to use multivariate priors: can I use them for group-level intercepts, or just for observation-level variables? Lastly, what is the reason behind using multivariate priors? Thank you again, I really appreciate it.

Hey there! I don’t have time for a more elaborate answer right now, but your questions seem fundamental enough that I suggest checking out the book Statistical Rethinking, or the lecture series of its author: Richard McElreath - YouTube

Not sure if I get that right, but in hierarchical models you use multivariate normal priors when you model more than one “variable” per group, i.e. an intercept AND a slope. The reasoning behind the multivariate prior is that those parameters are expected to be correlated. For example, for a line to fit a cloud of points roughly, when you increase the intercept you probably have to decrease the slope… That’s at least the intuition ;)
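That intuition can be checked numerically. In a toy example (not from this thread) with positive predictor values, the sampling covariance of the OLS (intercept, slope) pair is proportional to `inv(X'X)`, whose off-diagonal has the sign of `-mean(x)`, so intercept and slope estimates are negatively correlated:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x = rng.uniform(0.0, 10.0, n)           # positive predictor values
X = np.column_stack([np.ones(n), x])    # design matrix: intercept + slope

# sampling covariance of the OLS (intercept, slope) estimates is
# proportional to inv(X'X); convert its off-diagonal to a correlation
XtX_inv = np.linalg.inv(X.T @ X)
corr = XtX_inv[0, 1] / np.sqrt(XtX_inv[0, 0] * XtX_inv[1, 1])
```

Here `corr` comes out negative: pushing the intercept up forces the best-fitting slope down, which is exactly the correlation a multivariate prior can capture.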

The book (or lecture) will go into much more detail and hopefully you’ll find what you need!

Cheers,
Max
