I used to use the built-in distributions in Stan, for example writing something like beta ~ normal(mu, sigma) in the model block.
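For instance (a minimal made-up program, just to illustrate that style), a complete model in that notation might be:

data {
  real y;
}
parameters {
  real beta;
}
model {
  // sampling statements: each adds the corresponding log
  // density (up to an additive constant) to the model
  beta ~ normal(0, 1);
  y ~ normal(beta, 1);
}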
And recently I have read some Stan code that directly defines the log-likelihood as follows (a zero-inflated Poisson example):
data {
  int<lower=0> N;
  int<lower=0> y[N];
}
parameters {
  real<lower=0, upper=1> theta;  // zero-inflation probability
  real<lower=0> lambda;          // Poisson rate
}
model {
  for (n in 1:N) {
    if (y[n] == 0)
      // a zero can come from the inflation component or the Poisson
      target += log_sum_exp(bernoulli_lpmf(1 | theta),
                            bernoulli_lpmf(0 | theta)
                              + poisson_lpmf(y[n] | lambda));
    else
      // a positive count can only come from the Poisson component
      target += bernoulli_lpmf(0 | theta) + poisson_lpmf(y[n] | lambda);
  }
}
I am a little bit confused about the function of ‘log_sum_exp’ and ‘target’ here. I am also curious why we can use ‘target’ directly here without defining it first.
Both the sampling statement ~ and the target increment statement target += add terms to the log density. target is a reserved variable that Stan automatically maintains as the accumulator for the model's log density, which is why it can be used without being declared first. The only difference between the two notations is that ~ drops additive constants in the density, whereas target += keeps those constants.
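As a concrete illustration (a minimal sketch, with made-up variables y, mu, and sigma), these two statements define the same posterior and differ only in the additive constant:

data {
  real y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  // the two statements below are interchangeable;
  // use one or the other, not both:
  y ~ normal(mu, sigma);                    // drops additive constants
  // target += normal_lpdf(y | mu, sigma);  // keeps the constants
}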
Additionally, the target += notation lets the user build the log density from expressions that have no built-in sampling-statement form, as in the zero-inflated example above. There, log_sum_exp(a, b) computes log(exp(a) + exp(b)) in a numerically stable way; it marginalizes over the latent indicator of whether a zero came from the inflation component or from the Poisson.
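Written out, the marginalization that the log_sum_exp line implements (this follows directly from the code above, using the facts that bernoulli_lpmf(1 | theta) is log(theta) and bernoulli_lpmf(0 | theta) is log(1 - theta)) is:

$$
p(y_n = 0 \mid \theta, \lambda) = \theta + (1 - \theta)\,\mathrm{Poisson}(0 \mid \lambda),
$$

so on the log scale

$$
\log p(y_n = 0 \mid \theta, \lambda) = \mathrm{log\_sum\_exp}\big(\log \theta,\ \log(1 - \theta) + \log \mathrm{Poisson}(0 \mid \lambda)\big).
$$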
I only noticed this because I was just thinking of the new lupdf functions: with the new unnormalized versions like normal_lupdf it is already (or will soon be?) possible to drop the constants while using target +=. But that's super new, and all the functions used with target in the post above keep the constants.
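For example (a sketch, assuming a Stan version new enough to include the lupdf functions), this increment drops the constants just like the sampling statement does:

data {
  real y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  // normal_lupdf is the unnormalized version of normal_lpdf:
  // it drops additive constants, so this matches
  // y ~ normal(mu, sigma) exactly
  target += normal_lupdf(y | mu, sigma);
}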