# Hierarchical zero-inflated model: effective sample size decreases with number of individuals

I am running a hurdle model that is integrated into a hidden Markov model. As indicated by Rhat, all model parameters converge well. However, the effective sample sizes of the individual-specific intercepts (alphai, alphaj) and the state-specific intercepts (mu, nu) drop drastically when I add data from more individuals.

Unfortunately I cannot share the full code, so I am posting the parts of the Stan model that might be relevant to this question:

``````
data {
  int<lower = 1> N;               // number of observations
  int<lower = 0> S;               // number of states
  int<lower = 1> H;               // number of individuals
  int<lower = 1> id[N];           // identifier of individuals
  int<lower = 0> K1;              // number of covariates inside delta in Bernoulli part
  int<lower = 0> K2;              // number of covariates inside delta in Lognormal part
  matrix[N, K1] C1;               // matrix of covariates for Bernoulli part
  matrix[N, K2] C2;               // matrix of covariates for Lognormal part
  int<lower = 0, upper = 1> y[N]; // binary decision
  real q[N];                      // hurdle: dependent variable we want to model conditional on y
}

parameters {
  ordered[S] mu;   // state-dependent intercepts in Bernoulli part
  vector[S] nu;    // state-dependent intercepts in Lognormal part
  real alphaj[H];  // individual-specific intercept in Bernoulli part
  real alphai[H];  // individual-specific intercept in Lognormal part
  real<lower = 0> sigma_alphai;
  real<lower = 0> sigma_alphaj;
  real<lower = 0> sigma_q;
  vector[K1] delta1;
  vector[K2] delta2;
}
model {
  // priors
  mu ~ normal(0, 2);
  nu ~ normal(0, 2);

  alphai ~ normal(0, sigma_alphai);
  alphaj ~ normal(0, sigma_alphaj);
  ...
  for (t in 2:N) {
    target += log_sum_exp(gamma);
    for (k in 1:S) {
      gamma_prev[k] = bernoulli_logit_lpmf(y[t] | alphaj[id[t]] + mu[k] + C1[t] * delta1);
      if (y[t] == 1) {
        gamma_prev[k] += lognormal_lpdf(q[t] | alphai[id[t]] + nu[k] + C2[t] * delta2, sigma_q);
      }
  ...
``````

This is the output for the individual-specific intercepts alphai and alphaj in the case of 10 individuals:

And this is the output using data from 200 individuals (first 10 alphai and alphaj shown):

What can I check to identify the cause of this issue? And, even more importantly, what can I do about it? Thanks in advance.

Try a non-centered parameterization on these:

``````alphai ~ normal(0, sigma_alphai);
alphaj ~ normal(0, sigma_alphaj);
``````

I think you can do this just by declaring:

``````real<multiplier=sigma_alphaj> alphaj[H]; // individual-specific intercept in Bernoulli part
real<multiplier=sigma_alphai> alphai[H]; // individual-specific intercept in Lognormal part
``````
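With the `multiplier` approach, the sampling statements `alphai ~ normal(0, sigma_alphai);` and `alphaj ~ normal(0, sigma_alphaj);` stay exactly as they are; Stan samples on the unit scale internally. A minimal sketch of the relevant blocks (using `vector` declarations, which is the idiomatic form; note that the scales must be declared before they are referenced):

``````stan
parameters {
  real<lower = 0> sigma_alphai;  // scales declared first so they can be
  real<lower = 0> sigma_alphaj;  // used in the multiplier transforms below
  vector<multiplier = sigma_alphaj>[H] alphaj; // Bernoulli-part intercepts
  vector<multiplier = sigma_alphai>[H] alphai; // Lognormal-part intercepts
}
model {
  // priors unchanged from the centered version
  alphai ~ normal(0, sigma_alphai);
  alphaj ~ normal(0, sigma_alphaj);
}
``````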

The explanation for what this is and why it can help is here: Diagnosing Biased Inference with Divergences

You didn’t mention divergences though – were you getting any?


Thank you, Ben. I didn’t get any divergences.

Just to be sure: using the NCP the way you are suggesting, is there no need to change the priors on alphaj, alphai, sigma_alphai, and sigma_alphaj, i.e. do they remain the same? I ask because when I applied the NCP in the past, I remember having to use `std_normal` priors, like this:

``````parameters {
...
real<lower = 0> sigma_alphai;
real<lower = 0> sigma_alphaj;
real<multiplier=sigma_alphaj> alphaj[H]; // individual-specific intercept in Bernoulli part
real<multiplier=sigma_alphai> alphai[H]; // individual-specific intercept in Lognormal part
...
}
model {
// priors
...
alphai ~ std_normal();
alphaj ~ std_normal();
sigma_alphai ~ cauchy(0, 1);
sigma_alphaj ~ cauchy(0, 1);
...
}
``````

Yeah, it’s a different way of doing the non-centered parameterization (Updated non-centered parametrization with offset and multiplier). Hope I got the syntax right on it :D
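For reference, the `std_normal` priors belong to the manual non-centered parameterization, where unit-scale raw parameters are declared explicitly and rescaled in `transformed parameters`; a minimal sketch (the `_raw` names are illustrative):

``````stan
parameters {
  real<lower = 0> sigma_alphai;
  real<lower = 0> sigma_alphaj;
  vector[H] alphai_raw; // unit-scale raw parameters
  vector[H] alphaj_raw;
}
transformed parameters {
  // rescaling implies alphai ~ normal(0, sigma_alphai), etc.
  vector[H] alphai = sigma_alphai * alphai_raw;
  vector[H] alphaj = sigma_alphaj * alphaj_raw;
}
model {
  alphai_raw ~ std_normal();
  alphaj_raw ~ std_normal();
}
``````

The `offset`/`multiplier` syntax achieves the same geometry without the extra `_raw` parameters, which is why the original `normal(0, sigma)` statements can stay in place there.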


That’s good to know. I will give it a run like this and report back whether it helped to increase the ESS. Thanks for your support.