I’m trying to fit a simple gamma hurdle (aka zero inflated gamma) model in Rstan and as a simple use case I have only 1 predictor (binary variable). I tried to model it after the brms implementation (except for not centering). The brms model finishes fine, but my simple coding below does not - it looks like it will take hours and hours to complete. I am curious, is there something fundamentally wrong with my implementation?
TARGET_D is zero or a continuous variable
TREAT is the binary treatment variable
I am new to Stan and trying to learn.
stan_str_=
'
data{
int N;
vector<lower=0, upper=1>[N] TREAT; // 1 == treated, 0 = control
real y [N]; // amount donated
}
parameters{
real int_gamma; // this is the intercept in the gamma part
real beta_gamma; // this is the beta on treatment in the gamma part
real int_logistic; // this is the intercept in the logistic part
real beta_logistic; // this is the beta on treatment in the logit part
real <lower=0> shape; // alpha from gamma dist
}
transformed parameters {
}
model{
//priors (BRMS leave the betas and rate as defaults I guess). They centered the design matrix so I
//didnt use their priors but used normal centered on zero that should contain their mean results
target += normal_lpdf(int_gamma | 0,5);
target += normal_lpdf(shape | 0,5);
target += normal_lpdf(int_logistic | 0, 5);
//build up log lik
//https://seananderson.ca/2014/04/08/gamma-glms/
//adding and not multiplying in the "else" since assuming taking logs (log-lik)
// regarding bernouli, it accepts just the linear predictor (does logit transform itself)
// recall bernoulli likelihood: p(x)=p^x(1−p)^(1−x)
// when y = 0 we normally want log((1−p)) but p in this parametrization is
//prob y ==0 so we want log(p) or equiavelntly compute bernoulli( 1 | 1-p)
//which we do when y=0 (see example above to see they are the same)
for (n in 1:N)
{
if (y[n] == 0)
{
target += bernoulli_logit_lpmf( 1 | int_logistic + beta_logistic * TREAT[n]);
}
else
{
// target += bernoulli_logit_lpmf( 0 | hu[n]) + gamma_lpdf(y | shape , shape / exp(mu[n]));
target += bernoulli_logit_lpmf( 0 | int_logistic + beta_logistic * TREAT[n]) + gamma_lpdf(y | shape , shape / exp(-(int_gamma + beta_gamma * TREAT[n])));
}
}
}
'
fit1 <- stan(model_code = stan_str_,
data = list(N=nrow(kdd),y=kdd$TARGET_D,TREAT=kdd$TREAT),
warmup = 500, iter = 500, chains = 4, cores = 4, thin = 1)
Example of the output showing how long this will take:
Chain 1: Gradient evaluation took 6.1202 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 61202 seconds.
Chain 1: Adjust your expectations accordingly!