Zero Inflated Gamma model

I am trying to fit a Zero Inflated (or Hurdle) Gamma model, following the example in section 13.7 on the manual. I use mean and variance to parameterize the Gamma distribution because they are more intuitive for interpreting results, but I don’t think this is relevant. Almost the exact model but with the poisson_lpmg instead of gamma_lpdf works fine, what am I missing? Am I not supposed to mix lpmf and Ldp.? Any suggestions would be appreciated!

// zero inflated gamma
data {
  int<lower=1> n_obs;
  real y_mean; 
  real y_vari; 
  real non_zero;
  real<lower=0>  y[n_obs];
}

parameters {
  real t; 
  real m; 
  real v;   
}


transformed parameters {
  real<lower=0, upper=1> theta; // probability of zero
  real<lower=0> mu; // mean 
  real<lower=0> va; // variance
  
  theta = inv_logit(t);
  mu = exp(m);
  va = exp(v);
}

model {
  t ~ normal(logit(non_zero), 1);
  m ~ normal(log(y_mean),1);
  v ~ normal(log(y_vari),1);
  
  // likelihood: Zero_Inflated Gamma
  for (i in 1:n_obs) {
    if (y[i] == 0) {
      target += log_sum_exp(bernoulli_lpmf(1 | theta), 
                            bernoulli_lpmf(0 | theta) +
                            gamma_lpdf(y[i] | mu*mu/va, mu/va));
    } else {
      target += bernoulli_lpmf(0 | theta) +
                gamma_lpdf(y[i] | mu*mu/va, mu/va);
    }
  }
}

I made up some data and run the model but I can not make it work. I get “… chains where the estimated Bayesian Fraction of Missing Information was low”

library(tidyverse)
library(rstan)

# total sample number
n <- 2000
# probability of zero
theta <- 0.3
# gamma pars, these are close to the values in my actual data
mu = exp(-4.5); va = exp(-11)
shape = mu^2/va; rate = mu/va

df <- tibble(y=rgamma(n*(1-theta), shape, rate)) %>%
	bind_rows(tibble(y=rep(0, n*theta)))

p <- ggplot(df, aes(x=y)) +
	geom_histogram()

model_data <- list(
	n_obs = nrow(df), 
	y_mean = df %>% filter(y>0) %>% pull(y) %>% mean(), 
	y_vari = df %>% filter(y>0) %>% pull(y) %>% var(), 
	non_zero = df %>% filter(y==0) %>% nrow()/nrow(df),  
	y=df$y
	)

model_name <- "zig simulation.stan"
model <- stan(file=model_name,
              data = model_data,
              cores = getOption("mc.cores", 1L))

summary(model)$summary

p.s. looking around the internet I found some discussion on this (Zero Inflated Gamma) by Richard McElreath here and here but the code he provides seems to be outdated… and anyway I need to to know what I am doing wrong.

2 Likes

…ok I now realize my folly ಥ_ಥ
This isn’t really a zero inflated model, in the sense that the zeros only come from the bernoulli process, as the gamma does not produce any zeros.

So instead of this

// likelihood: Zero_Inflated Gamma
for (i in 1:n_obs) {
    if (y[i] == 0) {
      target += log_sum_exp(bernoulli_lpmf(1 | theta), 
                            bernoulli_lpmf(0 | theta) +
                            gamma_lpdf(y[i] | mu*mu/va, mu/va));
    } else {
      target += bernoulli_lpmf(0 | theta) +
                gamma_lpdf(y[i] | mu*mu/va, mu/va);
    }
  }

I used this

// likelihood: Gamma hurdle
  for (i in 1:n_obs) {
    if (y[i] == 0) {
      target += bernoulli_lpmf(1 | theta);
    } else {
      target += bernoulli_lpmf(0 | theta) +
                gamma_lpdf(y[i] | mu*mu/va, mu/va);
    }
  }

and that seem to work just fine!!
I think this is essentially fitting two models, bernoulli for zeros and gamma for the rest. Is there an advantage or trade-off to doing it this, or fitting two models separately?

2 Likes

When you do two separate models it’s easier to vectorize… if speed is an issue. Also, you could try gamma_lpdf(y | inverse(va), inverse(va)/mu), which tends to be more stable. Even better if you have something like inverse_va in your parameter block and “sample” it directly.

3 Likes

Hello Max!

I use regularly such parametrization of the gamma distribution, and I’m very interested by your comment.

Could you please explain how you get this formulation and conclusion? Or could you suggest a reference? :)

Thank you!
Lucas

Hi Lucas,

I think I found this on the old Stan Google mailing list. I tried it and it tended to work better for my applications.

But it’s actually pretty easy to derive this formulation from the perspective of a GLM framework, i.e. that the variance function is V(\mu)=\mu^2 for the Gamma GLM. With dispersion parameter \phi the variance is given by \phi V(\mu)=\phi \mu^2. Then plug in a/b for the mean \mu and a/b^2 for the variance, i.e. a/b^2 = \phi (a/b)^2, and solve for a and b. Then you should end up with something like a=\phi^{-1} and b=\phi^{-1} / \mu, where a and b are of course the parameters of the Gamma distribution as implemented in Stan.

I’m not entirely sure if that’s what you’ve been looking for, but I hope it helps.

Cheers, Max

3 Likes

Thank you Max, you are perfectly right, and it was exactly what I was looking for!

Cheers,
Lucas

@Max_Mantei thank you
I will try gamma_lpdf(y | inverse(va), inverse(va)/mu), any tricks to speed up sampling are appreciated!

Hi there,

What manual are you talking about here, I am currently doing some research work on zero-inflated gamma model, would you mind sharing the data from the example you metioned? I really appreciate it.

Xiao

Hello Xiao,

the Gamma GLM can not really be zero-inflated - because it has zero probability density at zero. You are probably looking for a hurdle model (sometimes called two-part model, because you could estimate each part separately). In the Stan manual (on the Stan website) there is a short discussion of a Poisson hurdle model, which is even more complicated, because of the truncation of the Poisson at zero. You won’t have that in a Gamma hurdle model.

@Xwang1020 in my original post I was referring to Stan User’s Guide. There is an example with code for a Zero Inflated Poisson model. I used the previous versions of the manual so my chapter reference might be a bit off.

Thanks a lot !

Do we need to assign a prior distribution for the Variance term?

I believe Stan will just use a uniform distribution for prior if you don’t specify it.