# Why are these two similar Zero-inflated Poisson models producing quite different results?

#1

I currently have values generated from a zero-inflated poisson model:

``````df <- data.frame(value=c(rpois(1000, 4), rep(0L, 1000)))  #mixture probability is 0.5
``````

I then summarize these to produce a histogram; each row represents the value and its corresponding count:

``````summarized_df <- df %>% group_by(value) %>% summarize(count=n())
``````

I use a Poisson zero-inflated model for the the probability mass function, , as such:

``````data {
int<lower=1> N; // rows of data
int<lower=0> value[N];
int<lower=0> count[N];
}

parameters {
real<lower=0, upper=1> theta; // probability it is zero
real<lower=0> lambda; // poisson
}

model {
theta ~ beta(0.1, 0.1);
lambda ~ gamma(0.1, 0.1);
for (n in 1:N) {
if (value[n] == 0)  {
target += count[n] * (log_sum_exp(bernoulli_lpmf(1 | theta),
bernoulli_lpmf(0 | theta)
+ poisson_lpmf(value[n] | lambda)));

} else {
target += count[n] * (bernoulli_lpmf(0 | theta)
+ poisson_lpmf(value[n] | lambda));
}
}
}
``````

This produces sensible results:

``````           mean se_mean   sd     2.5%      25%      50%      75%    97.5% n_eff Rhat
theta      0.50    0.00 0.01     0.48     0.49     0.50     0.51     0.52  3860    1
lambda     4.00    0.00 0.07     3.87     3.96     4.00     4.05     4.13  4000    1
``````

However, when I attempt to iterate through the counts instead, the results get all wonky:

``````data {
int<lower=1> N; // rows of data
int<lower=0> value[N];
int<lower=0> count[N];
}

parameters {
real<lower=0, upper=1> theta;
real<lower=0> lambda;
}

model {
theta ~ beta(1, 1);
lambda ~ gamma(1, 1);
for (n in 1:N) {
int C;
C = count[N];
for (idx in 1:C) {
if (value[n] == 0)  {
target += log_sum_exp(bernoulli_lpmf(1 | theta),
bernoulli_lpmf(0 | theta) + poisson_lpmf(value[n] | lambda));
} else {
target += bernoulli_lpmf(0 | theta) + poisson_lpmf(value[n] | lambda);
}
}
}
}

mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
theta    0.11    0.00 0.06   0.02   0.07   0.10   0.15   0.26  2917    1
lambda   5.75    0.01 0.51   4.80   5.39   5.74   6.09   6.76  2705    1
``````

Sorry if I’m missing something completely obvious, but my suspicion was that these models would produce roughly the same results.

#2

In the second model you probably meant:

``````C = count[N];
``````

to be

``````C = count[n];
``````

But this is just replacing multiplication with a for loop, which doesn’t seem ideal?

Also your priors are different in the two models. That might affect things. Hope that helps!