Hierarchical hurdle model in Stan?

Hello! I have N observations, divided unequally between J subjects. Each observation y[i] is sampled from a Poisson distribution, with the mean lambda varying according to the subject. The subject-specific lambdas are sampled from a gamma distribution (this is a simplified made-up example so it doesn’t matter if it’s silly). So far so good - that is what is implemented in the Stan code below.

Now I want to say that for proportion theta of the subjects, the lambdas are fixed at 0 and are thus not sampled from the gamma distribution. Is there any way to do this in Stan? I could use the normal syntax used for hurdle models (using target += log_sum_exp(…)) but I don’t think it would do what I want it to do. It would be like saying that for EACH observation y[i], it has a probability theta of being 0, and a probability of 1-theta of being sampled from the gamma distribution. It would lose the hierarchical structure whereby the lambda is the same for all the observations coming from the same subject. I would be really grateful for any advice!

data {
  // number of observations
  int<lower=0> N;
  // number of subjects
  int<lower=0, upper=N> J;
  // dependent variable
  int<lower=0> y[N];
  // subject IDs
  int<lower=1, upper=J> z[N];
}

parameters {
  // parameters of the gamma distribution
  // (upper bounds match the uniform(0, 100) priors below)
  real<lower=0, upper=100> alpha;
  real<lower=0, upper=100> beta;
  // subject-specific means
  real<lower=0> lambda[J];
}

model {
  // priors on the parameters of the gamma distribution
  alpha ~ uniform(0, 100);
  beta ~ uniform(0, 100);
    
  lambda ~ gamma(alpha, beta); 
  // probability of observed data
  for (n in 1:N) 
    y[n] ~ poisson(lambda[z[n]]);
}

Some R code that simulates data and fits the model to it (badly, because the mixture of distributions is not captured):

library(rstan)
# generate subject-specific lambdas for 20 subjects
lambdas <- rgamma(20, 10, 0.6)
# add 10 subjects in the subclass of 0s
lambdas <- c(lambdas, rep(0, 10))
hist(lambdas)
# number of observations per subject
obs_counts <- rpois(30, 20)
y <- c()
z <- c()
for (subject in 1:30) {
  y <- c(y, rpois(obs_counts[subject], lambdas[subject]))
  z <- c(z, rep(subject, obs_counts[subject]))
}
stan_data <- list(N = length(y), J = 30,
                  y = y, z = z) 
model <- stan("example_for_discourse.stan", data = stan_data)

Now I want to say that for proportion theta of the subjects, the lambdas are fixed at 0 and are thus not sampled from the gamma distribution.

It’s a mixture model, and it can be implemented in Stan by marginalizing over the discrete class membership.

Just to clarify, if lambda = 0, then y = 0, so that simplifies things and means it’s not really a mixture but a zero-augmented, hurdle model and you shouldn’t need to use log_sum_exp.

For the non-hierarchical hurdle probability, you’d have something like this (the same could also be achieved with log_mix):

parameters{
    real<lower=0, upper=1> phi;
    ...
}

model{
    ...
    phi ~ beta(1, 1);
    for(n in 1:N){
         // if y is 0, we know lambda must be 0 with probability phi
         if(y[n] == 0) target += log(phi);
         // otherwise, y is poisson distributed with probability 1 - phi
         else target += log1m(phi) + poisson_lpmf(y[n] | lambda[z[n]]);
    }
}

where phi is your hurdle probability, constant across subjects and time periods.

For the hierarchical case, you can expand phi to vary over subjects. Below, I’m using a non-centered parameterization of phi on the log-odds scale (not run, apologies for any mistakes!):

parameters{
    // standardized subject-level deviations (a vector, so the
    // arithmetic below vectorizes)
    vector[J] phi_logit_z;
    // overall mean of phi on logit scale
    real phi_logit_mu;
    // among-subject SD of phi on logit scale
    real<lower=0> phi_sigma;
    ...
}

transformed parameters{
    // generate logit-scale phi & transform to the (0, 1) interval per subject
    vector<lower=0, upper=1>[J] phi = inv_logit(phi_logit_mu + phi_sigma * phi_logit_z);
}

model{
    ...
    // must be a standard normal
    phi_logit_z ~ std_normal();
    // example, weakly informative priors
    phi_logit_mu ~ std_normal();
    phi_sigma ~ std_normal();

    for(n in 1:N){
         if(y[n] == 0) target += log(phi[z[n]]);
         else target += log1m(phi[z[n]]) + poisson_lpmf(y[n] | lambda[z[n]]);
    }
}

I think I’ve understood your example. Definitely try the above on simulated data first, though.
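In case it helps, here is one way to simulate data from this subject-level hurdle model to test against. This is a Python sketch with made-up hyperparameter values; the Knuth-style `rpois` helper is just there for a dependency-free Poisson draw.

```python
import math
import random

def rpois(lam, rng):
    """Knuth's Poisson sampler; fine for the modest rates used here."""
    if lam == 0:
        return 0
    L, k, p = math.exp(-lam), 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

rng = random.Random(1)
J = 30
phi_logit_mu, phi_sigma = -0.5, 1.0  # made-up hyperparameter values
# subject-level hurdle probabilities via the non-centered parameterization
phi = [1 / (1 + math.exp(-(phi_logit_mu + phi_sigma * rng.gauss(0, 1))))
       for _ in range(J)]
# subject-level Poisson means from a gamma(10, 0.6) distribution
lam = [rng.gammavariate(10, 1 / 0.6) for _ in range(J)]

y, z = [], []
for j in range(J):
    for _ in range(rpois(20, rng)):
        # each observation is zero with probability phi[j],
        # otherwise Poisson with the subject's lambda
        y.append(0 if rng.random() < phi[j] else rpois(lam[j], rng))
        z.append(j + 1)  # Stan-style 1-based subject IDs
```

Fitting the Stan model to `y` and `z` and checking that the `phi` and `lambda` estimates recover these values is a good first test.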

@aakhmetz @cgoold Thank you so much for taking the time to look at my problem! The thing is, I’m not sure that this model is exactly the model that I’m trying to express. What I was trying to express was to have a discrete subset of subjects where lambda is always 0, and another discrete subset where it is always sampled from this gamma distribution. Whereas with this solution, it’s that every subject has some probability of being sampled from one versus the other distribution. Do you see what I mean? Or am I misunderstanding what your model is doing?

It could be that the model I imagined really is impossible to fit in Stan, as we cannot have discrete parameters. Thinking about the scientific problem again, though, it may actually be possible to reconceptualise it in the way suggested by the model @cgoold wrote. What’s more, I could potentially sample phi from a beta distribution and give its two shape parameters very low values (below 1) to encourage “pseudo-discrete” solutions.
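To illustrate that last idea: with both shape parameters of the beta prior below 1, the density is U-shaped and piles mass near 0 and 1. A quick Python check (the shape value 0.2 is made up):

```python
import math

def beta_pdf(x, a, b):
    """Beta(a, b) density, computed via the log-gamma function."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

# with shape parameters below 1, the density near the endpoints
# dwarfs the density in the middle
for x in (0.01, 0.5, 0.99):
    print(x, beta_pdf(x, 0.2, 0.2))
```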

@cgoold I haven’t had time to try to implement and test the solution you suggested but I will do so soon, and will let you know if I come up with anything interesting. Thanks again, it’s really appreciated!

You need to compute the likelihoods per subject, which is a bit annoying if your data is not grouped by subject.

model {
  // (assumes phi is declared in the parameters block,
  // e.g. real<lower=0, upper=1> phi;)
  // priors on the parameters of the gamma distribution
  alpha ~ uniform(0, 100);
  beta ~ uniform(0, 100);
    
  lambda ~ gamma(alpha, beta); 
  for (j in 1:J) {
    int has_nonzero = 0;
    for (n in 1:N) {
      if (z[n] == j && y[n] != 0) {
        has_nonzero = 1;
        break;
      }
    }
    if (has_nonzero) {
      // lambda is not zero for this subject
      for (n in 1:N) {
        if (z[n] == j) {
          y[n] ~ poisson(lambda[j]);
        }
      }
    } else {
      // lambda could be zero for this subject; marginalize over both classes
      // (these variables accumulate per-component log-likelihoods)
      real likelihood_zero = 0;
      real likelihood_nonzero = 0;
      for (n in 1:N) {
        if (z[n] == j) {
          likelihood_zero += poisson_lpmf(y[n] | 0.0);
          likelihood_nonzero += poisson_lpmf(y[n] | lambda[j]);
        }
      }
      target += log_mix(phi, likelihood_zero, likelihood_nonzero);
    }
  }
}

(Of course poisson_lpmf(y[n] | 0.0) is always zero in that branch, since every y[n] there is 0, but I left it in to illuminate the general formula.)
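For what it’s worth, the per-subject marginalization can be checked numerically outside Stan. A Python sketch with made-up values (`poisson_logpmf` and `subject_log_lik` are hypothetical helpers, not part of any Stan interface):

```python
import math

def poisson_logpmf(y, lam):
    """Log Poisson pmf for lam > 0."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)

def subject_log_lik(y_j, lam_j, phi):
    """Marginal log-likelihood of one subject's counts y_j, mixing the
    'lambda = 0' class (probability phi) with the Poisson class."""
    lp_nonzero = math.log1p(-phi) + sum(poisson_logpmf(y, lam_j) for y in y_j)
    if any(y > 0 for y in y_j):
        # a nonzero count rules out the zero class entirely
        return lp_nonzero
    # all-zero subject: the zero class has log-likelihood 0 (probability 1)
    a, b = math.log(phi), lp_nonzero
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))  # log_sum_exp

# all-zero subject with lambda = 2, phi = 0.3:
# marginal probability should be phi + (1 - phi) * exp(-2 * lambda)
print(math.exp(subject_log_lik([0, 0], 2.0, 0.3)))  # ≈ 0.3 + 0.7 * exp(-4)
```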

Hello! I have now gone through your code in detail, and have tested it on simulated data. There is one bit where I’m not sure if it is me or you who is misunderstanding something. If y does not equal 0, lambda cannot equal 0. But when y does equal 0, this does not imply that lambda is 0, right? So in the first bit of code you wrote, I think the model has to be reformulated like this:

model {
    lambda ~ gamma(alpha, beta); 
    phi ~ beta(1, 1);
    for(n in 1:N){
         // if y is 0, we know that lambda either belongs to the 0 class (with probability phi)
         // or was sampled from a poisson distribution with probability 1 - phi
         if(y[n] == 0) target += log_sum_exp(log(phi), log1m(phi) + poisson_lpmf(y[n]|lambda[z[n]]));
         // otherwise, y is poisson distributed with probability 1 - phi
         else target += log1m(phi) + poisson_lpmf(y[n] | lambda[z[n]]);
    }
}

And then the analogous change in the hierarchical model as well. Right?
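One quick way to convince yourself of the corrected y = 0 branch: since poisson_lpmf(0 | lambda) = -lambda, the log_sum_exp term should equal log(phi + (1 - phi) * exp(-lambda)). A small Python check with made-up values:

```python
import math

def log_sum_exp(a, b):
    # numerically stable log(exp(a) + exp(b))
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

phi, lam = 0.25, 3.0  # made-up values
# corrected branch for y == 0; note poisson_lpmf(0 | lam) == -lam
branch = log_sum_exp(math.log(phi), math.log1p(-phi) - lam)
direct = math.log(phi + (1 - phi) * math.exp(-lam))
print(abs(branch - direct))  # agrees to floating-point precision
```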

Hello! Sorry for not replying sooner, I wanted to take the time to think about it properly. The code you wrote was very useful in terms of getting me to think better about the problem. I think that the model that I was trying to formulate genuinely didn’t make sense in the way that I was trying to formulate it. It wasn’t a Stan problem, it was a thinking problem. So I will go with a solution akin to what you suggested. Thank you very much!

@cgoold @nhuurre Thank you so much for your help, this is such a friendly forum!

Ah, yes, sorry! I misunderstood your example there. Apologies.

No worries, I was just double-checking that I hadn’t misunderstood your misunderstanding!