# Is one model using binomial equivalent to the other that uses bernoulli?

I am puzzled by two very different ways of modelling a simple binomial case (which, as we know, has a closed-form solution: a beta posterior under a conjugate beta prior).

Model 1, the straightforward:

```stan
// The binomial version
data {
  int<lower=0> N;
  int<lower=0> n;
}

parameters {
  real<lower=0, upper=1> p;
}

model {
  n ~ binomial(N, p);
}
```

Model 2, the Bernoulli:

```stan
// The bernoulli version
data {
  int<lower=0> N;
  int<lower=0> n;
}

parameters {
  real<lower=0, upper=1> p;
}

model {
  target += bernoulli_lpmf(1 | p) * n;
  target += bernoulli_lpmf(1 | 1 - p) * (N - n);
}
```

I consistently get the same results from both approaches, but I cannot prove that they are always equivalent.

How can I understand or prove this?
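One way to convince yourself numerically (a quick sketch, assuming `scipy` is available; the values of `N`, `n`, and the test points for `p` are arbitrary) is to check that the binomial log-pmf and the Bernoulli-style sum differ only by a constant that does not depend on `p`, so they yield the same posterior:

```python
import numpy as np
from scipy.stats import binom

N, n = 20, 7  # arbitrary example data

def bernoulli_style(p):
    # Mirrors the second Stan model:
    # target += bernoulli_lpmf(1|p)*n + bernoulli_lpmf(1|1-p)*(N-n)
    return n * np.log(p) + (N - n) * np.log(1 - p)

# The difference should be log(choose(N, n)) for every p,
# i.e. a constant (about 11.26 here) independent of p.
for p in (0.1, 0.37, 0.8):
    print(binom.logpmf(n, N, p) - bernoulli_style(p))
```

Since Stan only needs the log density up to an additive constant, the two models define the same posterior for `p`.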

A follow-up on this idea: here are the analogous multinomial models with k categories:

```stan
// The multinomial version
data {
  int k;
  int<lower=0> ns[k];
}

parameters {
  simplex[k] p;
}

model {
  ns ~ multinomial(p);
}
```

and

```stan
// The bernoulli version
data {
  int k;
  int<lower=0> ns[k];
}

parameters {
  simplex[k] p;
}

model {
  for (i in 1:k) {
    target += bernoulli_lpmf(1 | p[i]) * ns[i];
  }
}
```

Again, the models seem to be equivalent.

There is one distinct advantage of the “bernoulli” version: one is not forced to specify all the data. Imagine a situation where the `p` simplex is a k-dimensional function of the actual parameters. In the bernoulli case, if some of the `ns` observations are missing, we simply leave them out of the loop in the `model` block. This is impossible in the “multinomial” case.

These two model descriptions have identical log-likelihoods (up to an additive constant), and so are equivalent.

The binomial log-likelihood is log(choose(N, n)) + n * log(p) + (N - n) * log(1 - p). The first term is constant in p; if you drop it, what remains is exactly the sum of the two Bernoulli terms, since bernoulli_lpmf(1 | p) = log(p) and bernoulli_lpmf(1 | 1 - p) = log(1 - p).

This also holds in the multinomial case. And you can replicate the advantage you mention while still using the multinomial by setting ns[i] to 0 for the missing categories.
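The multinomial case can be checked the same way (a sketch assuming `scipy`; the counts and probabilities are made-up example values). The multinomial log-pmf differs from the Bernoulli-style sum only by the constant log multinomial coefficient, and a zero count contributes nothing to either side:

```python
import numpy as np
from scipy.stats import multinomial

ns = np.array([5, 0, 3, 2])       # note the zero count for category 2
p = np.array([0.4, 0.2, 0.3, 0.1])

# Bernoulli-style sum from the question's second model:
# sum over i of bernoulli_lpmf(1|p[i]) * ns[i] = sum of ns[i] * log(p[i])
bern = np.sum(ns * np.log(p))

# Full multinomial log-pmf; the difference is log(10! / (5! 0! 3! 2!)),
# a constant (about 7.83 here) that does not depend on p.
mult = multinomial.logpmf(ns, n=ns.sum(), p=p)
print(mult - bern)
```

The category with `ns[i] == 0` adds `0 * log(p[i]) = 0` to the sum, which is why zeroing out a count is equivalent to dropping that term from the loop.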
