# Hierarchical hurdle model in Stan?

Hello! I have N observations, divided unequally between J subjects. Each observation y[i] is sampled from a Poisson distribution, with the mean lambda varying according to the subject. The subject-specific lambdas are sampled from a gamma distribution (this is a simplified made-up example so it doesn’t matter if it’s silly). So far so good - that is what is implemented in the Stan code below.

Now I want to say that for proportion theta of the subjects, the lambdas are fixed at 0 and are thus not sampled from the gamma distribution. Is there any way to do this in Stan? I could use the normal syntax used for hurdle models (using target += log_sum_exp(…)) but I don’t think it would do what I want it to do. It would be like saying that for EACH observation y[i], it has a probability theta of being 0, and a probability of 1-theta of being sampled from the gamma distribution. It would lose the hierarchical structure whereby the lambda is the same for all the observations coming from the same subject. I would be really grateful for any advice!

``````data {
// number of observations
int<lower=0> N;
// number of subjects
int<lower=0, upper=N> J;
// dependent variable
int<lower=0> y[N];
// subject IDs
int<lower=0, upper=J> z[N];
}

parameters {
// parameters of the gamma distribution
real<lower=0> alpha;
real<lower=0> beta;
// subject-specific means
real<lower=0> lambda[J];
}

model {
// priors on the parameters of the gamma distribution
alpha ~ uniform(0, 100);
beta ~ uniform(0, 100);

lambda ~ gamma(alpha, beta);
// probability of observed data
for (n in 1:N)
y[n] ~ poisson(lambda[z[n]]);
}
``````

Some R code that simulates data and fits the model to it (badly, because the mixture of distributions is not captured):

``````# generate subject-specific lambdas for 20 subjects
lambdas <- rgamma(20, 10, 0.6)
# add 10 subjects in the subclass of 0s
lambdas <- c(lambdas, rep(0, 10))
hist(lambdas)
# number of observations per subject
obs_counts <- rpois(30, 20)
y <- c()
z <- c()
for (subject in 1:30) {
y <- c(y, rpois(obs_counts[subject], lambdas[subject]))
z <- c(z, rep(subject, obs_counts[subject]))
}
stan_data <- list(N = length(y), J = 30,
y = y, z = z)
model <- stan("example_for_discourse.stan", data = stan_data)
``````

Now I want to say that for proportion theta of the subjects, the lambdas are fixed at 0 and are thus not sampled from the gamma distribution.

It’s a mixture model and it can be implemented

Just to clarify, if `lambda = 0`, then `y = 0`, so that simplifies things and means it’s not really a mixture but a zero-augmented, hurdle model and you shouldn’t need to use `log_sum_exp`.

For the non-hierarchical hurdle probability, you’d have something like (using `log_sum_exp` but the same could be achieved with `log_mix`):

``````parameters{
real<lower=0, upper=1> phi;
...
}

model{
...
phi ~ beta(1, 1);
for(n in 1:N){
// if y is 0, we know lambda must be 0 with probability phi
if(y[n] == 0) target += log(phi);
// otherwise, y is poisson distributed with probability 1 - phi
else target += log1m(phi) + poisson_lpmf(y[n] | lambda[z[n]]);
}
}
``````

where `phi` is your hurdle probability, constant across subjects and time periods.

For the hierarchical case, you can expand `phi` to vary over subjects. Below, I’m using a non-centered parameterization of `phi` on the log-odds scale (not run, apologies for any mistakes!):

``````parameters{
real phi_logit_z[J];
// overall mean of phi on logit scale
real phi_logit_mu;
// among subject SD of phi on logit scale
real<lower=0> phi_sigma;
...
}

transformed parameters{
// generate logit-scale phi & transfrom to (0, 1) interval per subject
real<lower=0, upper=1> phi[J] = inv_logit(phi_logit_mu + phi_sigma * phi_logit_z);
}

model{
...
// must be a standard normal
phi_logit_z ~ std_normal();
// example, weakly informative priors
phi_logit_mu ~ std_normal();
phi_sigma ~ std_normal();

for(n in 1:N){
if(y[n] == 0) target += log(phi[z[n]]);
else target += log1m(phi[z[n]]) + poisson_lpmf(y[n] | lambda[z[n]]);
}
}
``````

I think I’ve understood your example. Definitely try the above on simulated data first, though.

@aakhmetz @cgoold Thank you so much for taking the time to look at my problem! The thing is, I’m not sure that this model is exactly the model that I’m trying to express. What I was trying to express was to have a discrete subset of subjects where lambda is always 0, and another discrete subset where it is always sampled from this gamma distribution. Whereas with this solution, it’s that every subject has some probability of being sampled from one versus the other distribution. Do you see what I mean? Or am I misunderstanding what your model is doing?

However, it could be that the model I imagined really is impossible to fit in Stan, as we cannot have discrete parameters. However, I’m thinking again about the scientific problem and it may actually be possible to reconceptualise it in the way that is suggested by the model @cgoold wrote. What’s more, I could potentially sample phi from a beta distribution, and set very low priors on alpha and beta to encourage “pseudo-discrete” solutions.

@cgoold I haven’t had time to try to implement and test the solution you suggested but I will do so soon, and will let you know if I come up with anything interesting. Thanks again, it’s really appreciated!

You need to compute the likelihoods per subject which is a bit annoying if your data is not grouped.

``````model {
// priors on the parameters of the gamma distribution
alpha ~ uniform(0, 100);
beta ~ uniform(0, 100);

lambda ~ gamma(alpha, beta);
for (j in 1:J) {
int has_nonzero = 0;
for (n in 1:N) {
if (z[n] == j && y[n] != 0) {
has_nonzero = 1;
break;
}
}
if (has_nonzero) {
// lambda is not zero for this subject
for (n in 1:N) {
if (z[n] == j) {
y[n] ~ poisson(lambda[j]);
}
}
} else {
// could be zero, needs mixture modeling
real likelihood_zero = 0;
real likelihood_nonzero = 0;
for (n in 1:N) {
if (z[n] == j) {
likelihood_zero += poisson_lpmf(y[n] | 0.0);
likelihood_nonzero += poisson_lpmf(y[n] | lambda[j]);
}
}
target += log_mix(phi, likelihood_zero, likelihood_nonzero);
}
}
}
``````

(Of course `poisson_lpmf(y[n] | 0.0)` is always zero but I left it in to illuminate the general formula)

Hello! I have now gone through your code in detail, and have tested it on simulated data. There is one bit where I’m not sure if it is me or you who is misunderstanding something. If y does not equal 0, lambda cannot equal 0. But when y does equal 0, this does not imply that lambda is 0, right? So in the first bit of code you wrote, I think the model has to be reformulated like this:

``````model {
lambda ~ gamma(alpha, beta);
phi ~ beta(1, 1);
for(n in 1:N){
// if y is 0, we know that lambda either belongs to the 0 class (with probability phi)
// or was sampled from a poisson distribution with probability 1 - phi
if(y[n] == 0) target += log_sum_exp(log(phi), log1m(phi) + poisson_lpmf(y[n]|lambda[z[n]]));
// otherwise, y is poisson distributed with probability 1 - phi
else target += log1m(phi) + poisson_lpmf(y[n] | lambda[z[n]]);
}
}
``````

And then the analogous change in the hierarchical model as well. Right?

Hello! Sorry for not replying sooner, I wanted to take the time to think about it properly. The code you wrote was very useful in terms of getting me to think better about the problem. I think that the model that I was trying to formulate genuinely didn’t make sense in the way that I was trying to formulate it. It wasn’t a Stan problem, it was a thinking problem. So I will go with a solution akin to what you suggested. Thank you very much!

@cgoold @nhuurre Thank you so much for your help, this is such a friendly forum!

Ah, yes, sorry! I misunderstood your example there. Apologies.

No worries, I was just double-checking that I hadn’t misunderstood your misunderstanding!