# Poisson Hierarchical Model Non-centered Parametrization

Hello everyone,

I am trying to fit the model in the picture to simulated data. I used the same model to generate the data, except that the covariance matrix was an identity matrix and the beta bars were standard normal random variables. I have tried both the centered and the non-centered parameterization, and I encountered the following problems.

Firstly, there were some divergent transitions in the centered model. Before switching to the non-centered version, I tried to handle the problem by increasing the shape parameter of the LKJ distribution and decreasing the step size. It worked, and the posterior distributions of the betas were almost centered around the generated beta values. But it was not the same for the beta bars: their posterior distributions are far away from the generated beta bars. I wonder how I can handle this problem.

Secondly, I tried the non-centered parameterization. It worked, but the run time was ten times higher than the centered parameterization for the same number of iterations. Is that normal? I thought the run time would decrease.
Also, the chains did not mix, and almost all iterations exceeded max_treedepth, even with max_treedepth = 13.

I have shared the data-simulation code and the centered and non-centered models, in that order. I would be very happy if anyone could help me.

### sim

```stan
data {
  /* Dimensions */
  int<lower=0> M; // number of variables
  int<lower=0> N; // number of observations
  /* Design matrix */
  matrix[N, M] X;
  /* Offset */
  vector<lower=0>[N] offset;
  /* Hyperparameters */
  vector<lower=0>[M] sigma; // diagonal of the covariance matrix; set to a vector of ones
}
generated quantities {
  vector[M] betabar; // mean vector of the multivariate normal distribution
  vector[M] beta; // regression coefficients

  real<lower=0> a; // prior for relative risk
  vector<lower=0>[N] wi; // relative risk

  vector[N] xi; // mean of the Poisson distribution

  array[N] int<lower=0> y_rep; // response variable

  for (i in 1:M) {
    betabar[i] = normal_rng(0, 1);
  }
  beta = multi_normal_rng(betabar, diag_matrix(sigma));

  a = gamma_rng(1, 0.01);
  for (j in 1:N) {
    wi[j] = gamma_rng(a, a);
  }

  xi = exp(X * beta + log(offset) + log(wi));

  for (z in 1:N) {
    y_rep[z] = poisson_rng(xi[z]);
  }
}
```

### NB2 centered

```stan
data {
  /* Dimensions */
  int<lower=0> M; // number of variables
  int<lower=0> N; // number of observations
  /* Design matrix */
  matrix[N, M] X;
  /* Offset */
  vector<lower=0>[N] offset;
  /* Outcome */
  array[N] int<lower=0> y;
  /* Hyperparameters */
  vector<lower=0>[M] sigma; // diagonal of the covariance matrix; set to a vector of 10s
}
parameters {
  vector[M] beta; // coefficients in the linear predictor
  real<lower=0> a; // prior for relative risk
  vector<lower=0>[N] wi; // relative risk
  vector[M] betabar; // mean vector of the multivariate normal distribution
  cholesky_factor_corr[M] Lcorr; // Cholesky factor (L_u matrix for R)
}
transformed parameters {
  vector[N] xi; // mean vector of the Poisson distribution
  corr_matrix[M] R; // correlation matrix
  cov_matrix[M] Sigma; // covariance matrix

  xi = exp(X * beta + log(offset) + log(wi)); // linear predictor uses beta, not betabar
  R = multiply_lower_tri_self_transpose(Lcorr);
  Sigma = quad_form_diag(R, sigma); // Sigma must be defined before it is used in the model block
}
model {
  target += normal_lpdf(betabar | 0, 1);
  target += lkj_corr_cholesky_lpdf(Lcorr | 4);
  target += multi_normal_lpdf(beta | betabar, Sigma);
  target += gamma_lpdf(a | 1, 0.01);

  target += gamma_lpdf(wi | a, a);
  target += poisson_lpmf(y | xi);
}
```

### Non-centered

```stan
data {
  /* Dimensions */
  int<lower=0> M; // number of variables
  int<lower=0> N; // number of observations
  /* Design matrix */
  matrix[N, M] X;
  /* Offset */
  vector<lower=0>[N] offset;
  /* Outcome */
  array[N] int<lower=0> y;
  /* Hyperparameters */
  vector<lower=0>[M] sigma; // diagonal of the covariance matrix; set to a vector of 10s
}
parameters {
  vector[M] delta; // std-normal variable, scaled by the Cholesky factor of the covariance
  vector[M] delta2; // std-normal variable playing the role of betabar
  real<lower=0> a;

  vector<lower=0>[N] wi; // relative risk
  cholesky_factor_corr[M] Lcorr; // Cholesky factor (L_u matrix for R)
}
transformed parameters {
  corr_matrix[M] R; // correlation matrix
  vector[M] beta; // regression coefficients
  vector[N] xi; // mean of the Poisson distribution

  R = multiply_lower_tri_self_transpose(Lcorr);

  // Non-centered transform: scale delta by the Cholesky factor of the
  // covariance, diag_pre_multiply(sigma, Lcorr), NOT by the covariance
  // matrix itself. Since betabar ~ normal(0, 1) in the centered model,
  // delta2 enters unscaled.
  beta = delta2 + diag_pre_multiply(sigma, Lcorr) * delta;
  xi = exp(X * beta + log(offset) + log(wi));
}
model {
  target += normal_lpdf(delta2 | 0, 1);
  target += lkj_corr_cholesky_lpdf(Lcorr | 2);
  target += normal_lpdf(delta | 0, 1);
  target += gamma_lpdf(a | 1, 0.01);

  target += gamma_lpdf(wi | a, a);
  target += poisson_lpmf(y | xi);
}
```
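As an aside, the centered model can often be sped up considerably without a full non-centered rewrite: `diag_pre_multiply(sigma, Lcorr)` is exactly the Cholesky factor of `quad_form_diag(R, sigma)`, so `multi_normal_cholesky_lpdf` can be used directly and the dense covariance matrix never needs to be built or factorized. A minimal sketch (untested; only the parts that change from the centered model above are shown):

```stan
parameters {
  vector[M] beta;
  vector[M] betabar;
  cholesky_factor_corr[M] Lcorr;
  // ... a and wi as before ...
}
model {
  target += normal_lpdf(betabar | 0, 1);
  target += lkj_corr_cholesky_lpdf(Lcorr | 4);
  // Same density as multi_normal_lpdf(beta | betabar, quad_form_diag(R, sigma)),
  // but parameterized by the Cholesky factor, avoiding an O(M^3) factorization
  // of a dense covariance matrix on every leapfrog step.
  target += multi_normal_cholesky_lpdf(beta | betabar, diag_pre_multiply(sigma, Lcorr));
  // ... remaining priors and likelihood as before ...
}
```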

Hey there!

I took the liberty of formatting your code a bit. I didn't have time to go into much detail (yet), but the thing that stood out is that it appears you want to fit a multilevel model, but you don't have groups. Is that correct? (Are you trying to fit a varying effect for each individual?)

@Max_Mantei Hey, thanks for the answer. I also posted this in the modeling section, and somebody helped me understand what I was really doing. I am very new to hierarchical modeling, so I did not know how to build a hierarchical model. It turned out that what I was doing was basically fitting a plain regression model. I changed my code and built a multilevel model with one group-level and one observation-level variable. I have to extend it to a non-nested hierarchical model, since I have 3 different group levels and 1 observation level. Can you suggest any case study like that? Also, say I want to use multivariate priors: can I use them for group-level intercepts, or just for observation-level variables? Lastly, what is the reasoning behind using multivariate priors? Thank you again, I really appreciate it.

Hey there! I don't have time for a more elaborate answer right now, but your questions seem fundamental enough that I suggest checking out the book Statistical Rethinking, or the lecture series of its author: Richard McElreath - YouTube

Not sure if I got that right, but in hierarchical models you use a multivariate normal prior when there is more than one "variable" per group, i.e. an intercept AND a slope. The reasoning behind the multivariate prior is that those parameters are expected to be correlated. For example, for a line to fit a cloud of points roughly, when you increase the intercept you probably have to decrease the slope… That's at least the intuition ;)
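To make that concrete, here is a minimal sketch of a correlated intercept-and-slope prior in Stan (hypothetical names; `J` groups, non-centered via the Cholesky factor; not a drop-in for the models above):

```stan
parameters {
  matrix[2, J] z;                  // standard-normal innovations
  vector[2] mu;                    // population-level intercept and slope
  vector<lower=0>[2] tau;          // group-level standard deviations
  cholesky_factor_corr[2] L_Omega; // correlation between intercept and slope
}
transformed parameters {
  // column j holds (intercept_j, slope_j), jointly multivariate normal
  matrix[2, J] ab = rep_matrix(mu, J) + diag_pre_multiply(tau, L_Omega) * z;
}
model {
  to_vector(z) ~ std_normal();
  mu ~ normal(0, 1);
  tau ~ exponential(1);
  L_Omega ~ lkj_corr_cholesky(2);
  // ... likelihood using ab[1, g[n]] + ab[2, g[n]] * x[n] for observation n in group g[n] ...
}
```

The off-diagonal of `L_Omega * L_Omega'` is exactly the intercept-slope correlation described above; with only intercepts varying there is nothing to correlate, which is why the multivariate prior typically appears once slopes vary by group too.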

The book (or lecture) will go into much more detail and hopefully youâ€™ll find what you need!

Cheers,
Max
