Hi,
I am trying to implement the loo package with a JAGS Binomial mixture model but, do not understand how to compute the likelihoods or translate STAN statements for JAGS.

Could someone translate these into what I need for JAGS or describe the quantities I need to derive?

It is the same concept for JAGS or any other MCMC sampler. You need each â€śobservationâ€™sâ€ť contribution to the log-likelihood evaluated over all the draws of the parameters. So if your BUGS model had some likelihood such as

y[i] ~ dnorm(mu,tau)

in R, you would need to loop over a statement like this

where mu is a vector that contains all the MCMC draws of the conditional mean and tau is a vector of the same size that contains all the MCMC draws of the conditional precision.

In the end, log_Lik should be a matrix with rows equal to the number of MCMC draws and columns equal to the number of â€śobservationsâ€ť and cells containing the log-likelihood contribution for that iteration by that â€śobservationâ€ť.

It only gets a bit tricky when the â€śobservationsâ€ť are not conditionally independent or there is some multilevel structure or something, so what you are imagining leaving out is not an isolated individual.

This is where the Stan and BUGS languages differ a bit. In BUGS, you read

y[i] ~ dbern(psi)

as something like â€śin a generative model, y[i] is drawn from a Bernoulli distribution with probability psiâ€ť and BUGS tries to work out the full conditional distributions implied by the DAG for that generative process (and falls back to some non-Gibbs sampling stuff if it doesnâ€™t not recognize a full-conditional distribution).

But in Stan

vector[N] psi = inv_logit(eta);
y ~ bernoulli(psi);

or

y ~ bernoulli_logit(eta);

reads as "add the kernel of the log-likelihood of psi onto the accumulated target" which is why I prefer

target += bernoulli_logit(y | eta);

because it is more explicit about what Stan is doing when it goes to draw from this posterior distribution.

So, to your question

the log-likelihood is a real number, not a 0 / 1. This is actually what R does as well, when you call

dbinom(y[i], size = 1, prob = psi[i], log = TRUE)

You can have a problem where the likelihood underflows to zero, which is why it is better to keep everything in log-units by using bernoulli_logit instead of bernoulli. Then it should basically never be the case that the log-likelihood overflows to negative infinity.