Storing log-likelihood in model block


Besides the ‘target +=’ statement in the model block, is there any other way to store the log-likelihood of subsets of the data? For example, could we declare an empty vector and fill it with log-likelihood values? (E.g., my data include several components and I would like to save the summed log-likelihood of each component separately.)

Also, can I use an expression like 'target += sum(normal_lpdf(y | mu, sigma))' to compute the summed log-likelihood of a normal sample when y, mu and sigma are all vectors? If not, is there another expression that does this?


You can compute the components in the transformed parameters block. For example:

data {
  int N;
  vector[N] y;
}
parameters {
  vector[N] x;
}
transformed parameters {
  vector[N] lk; // pointwise log-likelihood, saved with every draw
  for (i in 1:N)
    lk[i] = normal_lpdf(y[i] | x[i], 1);
}
model {
  x ~ normal(0, 1); // prior
  target += lk; // likelihood; adding a vector to target adds the sum of its elements
}

The vectorized normal_lpdf(y | mu, sigma) already returns the summed log-likelihood as a single real, so there is no need for an explicit sum().
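For the case in the question, where the data have several components, a minimal sketch could store one summed log-likelihood per component. The names y1, y2, mu1, mu2, sigma, ll1 and ll2, as well as the priors, are hypothetical and only there to make the example complete:

```stan
data {
  int<lower=0> N1;
  int<lower=0> N2;
  vector[N1] y1;   // first data component (hypothetical)
  vector[N2] y2;   // second data component (hypothetical)
}
parameters {
  real mu1;
  real mu2;
  real<lower=0> sigma;
}
transformed parameters {
  // vectorized normal_lpdf returns one real: the sum over all elements
  real ll1 = normal_lpdf(y1 | mu1, sigma);
  real ll2 = normal_lpdf(y2 | mu2, sigma);
}
model {
  mu1 ~ normal(0, 5);     // illustrative priors, not from the thread
  mu2 ~ normal(0, 5);
  sigma ~ normal(0, 2);   // half-normal because of the lower=0 bound
  target += ll1 + ll2;    // likelihood
}
```

Because ll1 and ll2 are transformed parameters, they are written to the output with every draw, so each component's log-likelihood can be inspected separately afterwards.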

Thanks, it’s great to know that normal_lpdf() is vectorized in Stan. A follow-up question: how can I tell whether a given function is vectorized? The Stan Functions Reference seems to say that the input of normal_lpdf() should be a real rather than a vector, so I assumed it was not vectorized (16.1 Normal distribution | Stan Functions Reference). Am I misunderstanding something?


I think almost all distributions are vectorized.

The signature given in the Functions Reference is

real normal_lpdf(reals y | reals mu, reals sigma)

It’s easy to miss, but the argument type is reals, not real. That means each argument can be either a real or any container of reals, i.e. a vector, a row_vector or an array real[]. (IIRC matrix doesn’t work, though I’m not certain.)
The return type is real, which tells you that the function returns a single number no matter what the input is.
In contrast, the rng has the signature

R normal_rng(reals mu, reals sigma)

You can see that this too accepts reals (i.e. vectorized) input, but the return type is the more mysterious R. R means that the output is a real if all inputs are real, and an array of reals if any of the inputs is an array or a vector.
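As a rough illustration of the two signatures, the sketch below calls normal_lpdf with container arguments and normal_rng with both scalar and container arguments. The variable names and priors are hypothetical, and the array[N] real declaration assumes a recent Stan release (older releases wrote this as real y_rep[N]):

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  vector[N] mu;
  real<lower=0> sigma;
}
model {
  mu ~ normal(0, 1);      // illustrative priors
  sigma ~ normal(0, 1);
  // reals arguments: vector, vector, real; the result is a single real
  target += normal_lpdf(y | mu, sigma);
}
generated quantities {
  // all inputs are scalars, so R is real
  real one_draw = normal_rng(0, 1);
  // one input is a vector, so R is an array of reals
  array[N] real y_rep = normal_rng(mu, sigma);
}
```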

Note that if you’re just looking to access the sum log likelihood, it’s automatically available in the interfaces as the “lp__” parameter.

Edit: oops! Just realized that’s incorrect. I was thinking of the log probability. If you want the likelihood alone (i.e. without the contribution of the priors), you do have to compute and store it as described here.
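To make the distinction concrete, here is a small sketch in which lp__ receives both the prior and the likelihood terms, while a separately stored quantity holds the likelihood alone. The name log_lik_total and the prior are hypothetical, and computing the value in generated quantities is just one option alongside the transformed parameters approach above:

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
}
model {
  mu ~ normal(0, 10);   // prior: contributes to lp__
  y ~ normal(mu, 1);    // likelihood: also contributes to lp__
}
generated quantities {
  // likelihood only, with no prior term; evaluated once per saved draw
  real log_lik_total = normal_lpdf(y | mu, 1);
}
```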