Help understanding znb model

Paul_Lantos · July 18, 2019, 11:49pm

This is more of a statistical question than an interface one - I’ve got a very zero inflated count dataset, and I’m using a zinb model in brms. Ordinarily I’d put in offset(pop) in a typical Poisson model, but in this case it barely influences the model estimate (whereas if you graph it there is a STRONG correlation between cases and population):

mod1 ← brm( cases ~ 1 , data=bg , family=zero_inflated_negbinomial )
mod2 ← brm( cases ~ offset(POP2010) , data=bg , family=zero_inflated_negbinomial )

summary(mod1)

Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept 0.88 0.13 0.65 1.15 462 1.01

Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
shape 0.38 0.08 0.28 0.59 189 1.01
zi 0.09 0.07 0.00 0.26 279 1.01

summary(mod2)
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept 0.79 0.12 0.56 1.05 1099 1.00

Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
shape 0.44 0.09 0.32 0.65 919 1.00
zi 0.10 0.07 0.00 0.26 965 1.00

Clearly the output is very similar - so is the offset really accomplishing anything in this model? I thought the intercept could be interpreted like a rate ratio.

Thanks

paul.buerkner · July 26, 2019, 10:20am

I would expect the offset to be included correctly, but you can also post the generated stan code to make sure.

However, the offset thing may not be equivalent to modeling the ratio as is the case for standard poisson/negbinomial model since we have the zero-inflation part additionally. I haven’t thought of how to model ratios in the zero-inflated case though so I can’t tell who if/how to achieve this exactly.

Paul_Lantos · July 27, 2019, 3:57pm

Thanks - this is the Stan code. I don’t see where the offset appears in the code, but I don’t look at the Stan code often (probably would be a good idea for me to look more often tbh)

stancode(fit0_znb)
// generated with brms 2.9.3
functions {

/* zero-inflated negative binomial log-PDF of a single response

Args:
y: the response value
mu: mean parameter of negative binomial distribution
phi: shape parameter of negative binomial distribution
zi: zero-inflation probability
Returns:
a scalar to be added to the log posterior
/
real zero_inflated_neg_binomial_lpmf(int y, real mu, real phi,
real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_lpmf(1 | zi),
bernoulli_lpmf(0 | zi) +
neg_binomial_2_lpmf(0 | mu, phi));
} else {
return bernoulli_lpmf(0 | zi) +
neg_binomial_2_lpmf(y | mu, phi);
}
}
/ zero-inflated negative binomial log-PDF of a single response
logit parameterization of the zero-inflation part
Args:
y: the response value
mu: mean parameter of negative binomial distribution
phi: shape parameter of negative binomial distribution
zi: linear predictor for zero-inflation part
Returns:
a scalar to be added to the log posterior
/
real zero_inflated_neg_binomial_logit_lpmf(int y, real mu,
real phi, real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_logit_lpmf(1 | zi),
bernoulli_logit_lpmf(0 | zi) +
neg_binomial_2_lpmf(0 | mu, phi));
} else {
return bernoulli_logit_lpmf(0 | zi) +
neg_binomial_2_lpmf(y | mu, phi);
}
}
/ zero-inflated negative binomial log-PDF of a single response
log parameterization for the negative binomial part
Args:
y: the response value
eta: linear predictor for negative binomial distribution
phi: shape parameter of negative binomial distribution
zi: zero-inflation probability
Returns:
a scalar to be added to the log posterior
/
real zero_inflated_neg_binomial_log_lpmf(int y, real eta,
real phi, real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_lpmf(1 | zi),
bernoulli_lpmf(0 | zi) +
neg_binomial_2_log_lpmf(0 | eta, phi));
} else {
return bernoulli_lpmf(0 | zi) +
neg_binomial_2_log_lpmf(y | eta, phi);
}
}
/ zero-inflated negative binomial log-PDF of a single response
log parameterization for the negative binomial part
logit parameterization of the zero-inflation part
Args:
y: the response value
eta: linear predictor for negative binomial distribution
phi: shape parameter of negative binomial distribution
zi: linear predictor for zero-inflation part
Returns:
a scalar to be added to the log posterior
*/
real zero_inflated_neg_binomial_log_logit_lpmf(int y, real eta,
real phi, real zi) {
if (y == 0) {
return log_sum_exp(bernoulli_logit_lpmf(1 | zi),
bernoulli_logit_lpmf(0 | zi) +
neg_binomial_2_log_lpmf(0 | eta, phi));
} else {
return bernoulli_logit_lpmf(0 | zi) +
neg_binomial_2_log_lpmf(y | eta, phi);
}
}
// zero_inflated negative binomial log-CCDF and log-CDF functions
real zero_inflated_neg_binomial_lccdf(int y, real mu, real phi, real hu) {
return bernoulli_lpmf(0 | hu) + neg_binomial_2_lccdf(y | mu, phi);
}
real zero_inflated_neg_binomial_lcdf(int y, real mu, real phi, real hu) {
return log1m_exp(zero_inflated_neg_binomial_lccdf(y | mu, phi, hu));
}
}
data {
int<lower=1> N; // number of observations
int Y[N]; // response variable
// data for splines
int Ks; // number of linear effects
matrix[N, Ks] Xs; // design matrix for the linear effects
// data for spline t2(long,lat)
int nb_1; // number of bases
int knots_1[nb_1]; // number of knots
// basis function matrices
matrix[N, knots_1[1]] Zs_1_1;
matrix[N, knots_1[2]] Zs_1_2;
matrix[N, knots_1[3]] Zs_1_3;
vector[N] offset;
int<lower=1> K_zi; // number of population-level effects
matrix[N, K_zi] X_zi; // population-level design matrix
int prior_only; // should the likelihood be ignored?
}
transformed data {
int Kc_zi = K_zi - 1;
matrix[N, Kc_zi] Xc_zi; // centered version of X_zi without an intercept
vector[Kc_zi] means_X_zi; // column means of X_zi before centering
for (i in 2:K_zi) {
means_X_zi[i - 1] = mean(X_zi[, i]);
Xc_zi[, i - 1] = X_zi[, i] - means_X_zi[i - 1];
}
}
parameters {
// temporary intercept for centered predictors
real temp_Intercept;
// spline coefficients
vector[Ks] bs;
// parameters for spline t2(long,lat)
// standarized spline coefficients
vector[knots_1[1]] zs_1_1;
// standard deviations of the coefficients
real<lower=0> sds_1_1;
// standarized spline coefficients
vector[knots_1[2]] zs_1_2;
// standard deviations of the coefficients
real<lower=0> sds_1_2;
// standarized spline coefficients
vector[knots_1[3]] zs_1_3;
// standard deviations of the coefficients
real<lower=0> sds_1_3;
real<lower=0> shape; // shape parameter
vector[Kc_zi] b_zi; // population-level effects
// temporary intercept for centered predictors
real temp_zi_Intercept;
}
transformed parameters {
// actual spline coefficients
vector[knots_1[1]] s_1_1 = sds_1_1 * zs_1_1;
// actual spline coefficients
vector[knots_1[2]] s_1_2 = sds_1_2 * zs_1_2;
// actual spline coefficients
vector[knots_1[3]] s_1_3 = sds_1_3 * zs_1_3;
}
model {
// initialize linear predictor term
vector[N] mu = temp_Intercept + rep_vector(0, N) + Xs * bs + Zs_1_1 * s_1_1 + Zs_1_2 * s_1_2 + Zs_1_3 * s_1_3 + offset;
// initialize linear predictor term
vector[N] zi = temp_zi_Intercept + Xc_zi * b_zi;
// priors including all constants
target += normal_lpdf(temp_Intercept | 0, 1);
target += normal_lpdf(zs_1_1 | 0, 1);
target += normal_lpdf(zs_1_2 | 0, 1);
target += normal_lpdf(zs_1_3 | 0, 1);
target += student_t_lpdf(sds_1_1 | 3, 0, 10)
- 1 * student_t_lccdf(0 | 3, 0, 10);
target += student_t_lpdf(sds_1_2 | 3, 0, 10)
- 1 * student_t_lccdf(0 | 3, 0, 10);
target += student_t_lpdf(sds_1_3 | 3, 0, 10)
- 1 * student_t_lccdf(0 | 3, 0, 10);
target += gamma_lpdf(shape | 0.01, 0.01);
target += normal_lpdf(b_zi | 0, 1);
target += normal_lpdf(temp_zi_Intercept | 0, 1);
// likelihood including all constants
if (!prior_only) {
for (n in 1:N) {
target += zero_inflated_neg_binomial_log_logit_lpmf(Y[n] | mu[n], shape, zi[n]);
}
}
}
generated quantities {
// actual population-level intercept
real b_Intercept = temp_Intercept;
// actual population-level intercept
real b_zi_Intercept = temp_zi_Intercept - dot_product(means_X_zi, b_zi);
}

LucC · July 27, 2019, 4:13pm

In the data block, vector[N] offset is declared. At the start of the model block, it is used to construct the linear predictor. No transformation of offset is done in the Stan code, implying that you need to feed brms the correctly transformed offset (depending on link function, which I assume is the natural logarithm here)

Topic		Replies	Views
How to specify rate() or offset in zero_inflated models using nl Modeling brms	1	357	February 16, 2023
Zero_inflated_poisson with brms Modeling rstan , fitting-issues , brms	4	1577	June 14, 2021
Help interpreting zero inflated poisson model error brms fitting-issues	4	507	February 17, 2021
Model specification of a zero-inflated negative binomial random effect model using brms brms specification	4	850	July 2, 2021
Model fails when using offset term brms	1	451	July 18, 2019

Help understanding znb model

Related topics