 # Non-centered parameterization of a hierarchical negative binomial model in Stan

I’m trying to estimate the effect of a predictor on count data. I also want to account for time-dependent trends in the counts by including three autoregressive terms. I pass these lagged (t-k) counts in as data rather than constructing them inside the Stan model.

These data are grouped, so I want to incorporate varying parameter estimates and intercepts for each group.

When I run this model, I get many divergent transitions unless I run it for many iterations (~20,000). I think the likelihood surface is too complex and is causing the “funnel” problem. The standard recommendation is then to implement a non-centered parameterization. However, all the documentation/examples I can find for non-centered parameterizations are for normally distributed data, and I can’t quite figure out how to implement it for a negative binomial model.

Does this reparameterization even make sense? Or maybe there’s a better way to get this to fit more efficiently?

``````stan
data {
  int<lower=0> N;               // number of observations
  int count[N];                 // outcome variable
  int Ngroups;                  // number of groups
  int groupID[N];               // group ID
  real predictor[N];            // predictor variable
  real ARcount1[N];             // count at t-1
  real ARcount2[N];             // count at t-2
  real ARcount3[N];             // count at t-3
}

parameters {
  real Intercept;               // global intercept
  real AR_term1[Ngroups];       // parameter for 1st autoregressive term
  real AR_term2[Ngroups];       // parameter for 2nd autoregressive term
  real AR_term3[Ngroups];       // parameter for 3rd autoregressive term
  real<lower=0> sigma_gr;       // sigma for estimating the group intercepts
  real slope1[Ngroups];         // regression parameters on the predictor
  real ran_intercept[Ngroups];  // intercept for each group
  real<lower=0> phi;            // overdispersion parameter
}

model {
  phi ~ cauchy(0, 2.5);
  sigma_gr ~ normal(0, 1);
  Intercept ~ normal(0, 1);     // priors following Gelman 2008
  slope1 ~ cauchy(0, 2.5);
  AR_term1 ~ cauchy(0, 2.5);
  AR_term2 ~ cauchy(0, 2.5);
  AR_term3 ~ cauchy(0, 2.5);

  for (n in 1:N) {
    count[n] ~ neg_binomial_2(exp(Intercept
                                  + AR_term1[groupID[n]] * ARcount1[n]
                                  + AR_term2[groupID[n]] * ARcount2[n]
                                  + AR_term3[groupID[n]] * ARcount3[n]
                                  + slope1[groupID[n]] * predictor[n]
                                  + ran_intercept[groupID[n]]), phi);
  }

  for (j in 1:Ngroups) {
    ran_intercept[j] ~ normal(0, sigma_gr);   // varying intercept by group
  }
}

generated quantities {   // within-sample predictions for validation
  int<lower=0> count_pred[N];

  for (i in 1:N) {
    count_pred[i] = neg_binomial_2_rng(exp(Intercept
                                           + AR_term1[groupID[i]] * ARcount1[i]
                                           + AR_term2[groupID[i]] * ARcount2[i]
                                           + AR_term3[groupID[i]] * ARcount3[i]
                                           + slope1[groupID[i]] * predictor[i]
                                           + ran_intercept[groupID[i]]), phi);
  }
}
``````

I’m working on a hierarchical example for the manual that is simultaneously more pedagogically straightforward and more performant. The text isn’t quite ready yet, but the Stan code and the R code to simulate and fit it are here. The Stan code shows how to include both the centered and non-centered parameterizations in the same model, which in turn may make it clearer which structures/variables in the model are affected by that choice and, more importantly, which aren’t. Those that aren’t include the observation-level structure, so you should be able to swap the normal out for a negative binomial easily.
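To make the connection concrete: the non-centering applies to the hierarchical normal prior on the group intercepts, not to the negative binomial likelihood. A minimal sketch of that swap, reusing the variable names from the model above:

``````stan
parameters {
  real<lower=0> sigma_gr;
  vector[Ngroups] ran_intercept_raw;   // standard-normal "raw" intercepts
}
transformed parameters {
  // scaling the raw draws implies ran_intercept ~ normal(0, sigma_gr)
  vector[Ngroups] ran_intercept = sigma_gr * ran_intercept_raw;
}
model {
  ran_intercept_raw ~ normal(0, 1);    // sigma_gr no longer appears in this sampling statement
  // likelihood unchanged:
  // count[n] ~ neg_binomial_2(exp(... + ran_intercept[groupID[n]]), phi);
}
``````

Because `sigma_gr` and the raw intercepts are now a priori independent, the funnel-shaped dependence between them disappears from the posterior geometry that the sampler sees.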

You may need to take the log of the autoregressive counts; that’s what I typically see for this type of model. Or use log(x + 1) to avoid log(0).
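Since the lags arrive as data, that transformation can be done once in `transformed data` rather than per iteration. A sketch for the first lag (the other two would be analogous):

``````stan
transformed data {
  real logAR1[N];
  for (n in 1:N)
    logAR1[n] = log1p(ARcount1[n]);   // log(x + 1), so zero counts stay finite
}
``````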

You can also use `neg_binomial_2_log_glm` instead of `neg_binomial_2`.
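The GLM form keeps the linear predictor on the log scale (no explicit `exp()`), which is both faster and more numerically stable. It applies directly only when the coefficients are shared across observations; with group-varying slopes like yours, the group terms would have to be folded into a per-observation intercept vector. A sketch assuming a design matrix `X` with `K` shared columns (names here are illustrative, not from the model above):

``````stan
data {
  int<lower=0> N;
  int<lower=1> K;
  matrix[N, K] X;        // e.g. lagged counts and the predictor as columns
  int count[N];
}
parameters {
  real alpha;
  vector[K] beta;
  real<lower=0> phi;
}
model {
  // built-in log link: count[n] ~ NB2(exp(alpha + X[n] * beta), phi)
  count ~ neg_binomial_2_log_glm(X, alpha, beta, phi);
}
``````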

If this still fails, you may need to reparameterize your model following the prior choice recommendations.