Storing log-likelihood in model chunk

stan_beginer · January 28, 2021, 3:18am

Hi,

Besides the ‘target +=’ statement in the model chunk, is there any other way to store the log-likelihood of subsets of the data? For example, could we define an empty vector to store the log-likelihood? (E.g., my data include several components and I would like to save the summed log-likelihood of each component separately.)

Also, could I use an expression like ‘target += sum(normal_lpdf(y | mu, sigma))’ to calculate the summed log-likelihood of a normal sample where y, mu and sigma are all vectors? If not, is there an expression that does this?

Thx!

nhuurre · January 28, 2021, 8:30am

You can compute the components in the transformed parameters block. For example

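data {
  int<lower=0> N;   // number of observations
  vector[N] y;      // observed values
}
parameters {
  vector[N] x;      // one latent mean per observation
}
transformed parameters {
  vector[N] lk;     // pointwise log-likelihood, saved in the output
  for (i in 1:N)
    lk[i] = normal_lpdf(y[i] | x[i], 1);
}
model {
  x ~ normal(0, 1); // prior
  target += lk;     // likelihood: adds the sum of all elements of lk
}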

The vectorized normal_lpdf(y | mu, sigma) already returns the summed log-likelihood, so there is no need for an explicit sum().
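For the case in the original question (one summed log-likelihood per data component), a minimal sketch could look like the following. It assumes two components with a shared scale; the names N1, N2, y1, y2, mu1, mu2, sigma and ll, and the priors, are hypothetical and only there for illustration.

```stan
data {
  int<lower=0> N1;        // size of component 1 (hypothetical)
  int<lower=0> N2;        // size of component 2 (hypothetical)
  vector[N1] y1;          // observations in component 1
  vector[N2] y2;          // observations in component 2
}
parameters {
  real mu1;
  real mu2;
  real<lower=0> sigma;
}
transformed parameters {
  vector[2] ll;           // one summed log-likelihood per component
  ll[1] = normal_lpdf(y1 | mu1, sigma);  // vectorized call returns a single real
  ll[2] = normal_lpdf(y2 | mu2, sigma);
}
model {
  mu1 ~ normal(0, 1);     // illustrative priors, not from the thread
  mu2 ~ normal(0, 1);
  sigma ~ normal(0, 1);   // half-normal because of the lower bound
  target += sum(ll);      // total likelihood contribution
}
```

Because ll is declared in transformed parameters, both entries are written to the output for every draw, so each component's total can be inspected separately.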

stan_beginer · January 28, 2021, 9:48pm

Thanks, it’s great to know that normal_lpdf() is vectorized in Stan. A follow-up question: how can I tell whether a given function is vectorized? The Stan Functions Reference says the input of normal_lpdf() should be real rather than a vector, so I thought it was not a vectorized function (16.1 Normal distribution | Stan Functions Reference). Am I misunderstanding something?

Thx!

nhuurre · January 28, 2021, 10:06pm

I think almost all distributions are vectorized.

The signature given in the Functions Reference is

real normal_lpdf(reals y | reals mu, reals sigma)

It’s easy to miss, but the input type is reals, not real. That means it can take either a real or any container of reals, i.e. a vector, a row_vector or a real[] array. (IIRC matrix doesn’t work, though…it’s not very clear.)
The output is real, and that tells you it’s going to give a single number no matter what the input is.
In contrast, the rng has the signature

R normal_rng(reals mu, reals sigma)

and you can see that this too can take reals (i.e. vectorized) input, but the output is the even more mysterious R type. R means that the output is a real if all inputs are real, and an array of real if any of the inputs is an array or a vector.
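A small sketch of how the two signatures behave in practice. The model itself, the priors, and the names y_one and y_rep are assumptions for illustration. The array[N] real declaration is the current array syntax; Stan versions from around the time of this thread would write real y_rep[N] instead.

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  mu ~ normal(0, 1);      // illustrative priors
  sigma ~ normal(0, 1);
  // reals in, one real out: the sum over all N elements of y
  target += normal_lpdf(y | mu, sigma);
}
generated quantities {
  // all inputs are real, so R is a single real
  real y_one = normal_rng(mu, sigma);
  // one input is a vector, so R is an array of N reals
  array[N] real y_rep = normal_rng(rep_vector(mu, N), sigma);
}
```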

mike-lawrence · January 28, 2021, 11:28pm

Note that if you’re just looking to access the summed log-likelihood, it’s automatically available in the interfaces as the “lp__” parameter.

Edit: oops! Just realized that’s incorrect. I was thinking of the log probability. If you want the likelihood alone (i.e. without the contribution of the priors), you do have to compute/store it as described here.
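One possible way to store the likelihood alone, sketched here as an alternative to the transformed parameters version above, is to compute it in generated quantities. The names log_lik and sum_log_lik and the priors are assumptions, not something from this thread.

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  mu ~ normal(0, 1);      // priors contribute to lp__ ...
  sigma ~ normal(0, 1);
  y ~ normal(mu, sigma);  // ... and so does the likelihood
}
generated quantities {
  vector[N] log_lik;      // pointwise log-likelihood, priors excluded
  real sum_log_lik;       // total log-likelihood, priors excluded
  for (n in 1:N)
    log_lik[n] = normal_lpdf(y[n] | mu, sigma);
  sum_log_lik = sum(log_lik);
}
```

Here the stored values do not feed back into target, so they are only evaluated once per saved draw. The trade-off is that the likelihood expression is written twice, once in model and once in generated quantities.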
