I am writing a custom function to generate a co-variance matrix for use in GP regression. Element (i,j) of the matrix is the product of the exponentiated quadratic co-variance function (cov_exp_quad) and g(x_i)*g(x_j) where g(.) is the pdf of a univariate probability distribution such as the normal distribution or Student’s-t.

My problem is that I cannot find in Stan a function that just calculates the pdf, only the functions for incrementing the log-likelihood.

Do such functions exist?

Thanks

yes, e.g.- normal_lpdf

Thanks. I must be blind not to have seen it!

Well that calculates the *log* pdf. If you do actually need the pdf (rare in Stan but occasionally necessary) you can exponentiate.

If our documentation is so terrible that that’s not apparent we really need to fix it!

I had noticed that, Jonah, and as a relatively new user of Stan I am still puzzled why using the pdf itself should be so rare. The exponentiation is an overhead I could do without as it is needed in my application on every iteration, which cannot be efficient.

non-lpdf calculation overflows so easily there’s almost no point to having it. It’s an extremely common mistake in naive implementations of pretty much any statistical calculation.

It’s what @sakrejda said (and also underflow, which is even more common than overflow). It’s very arithmetically unstable to deal with densities on anything but the log scale.

What are you doing with densities that you need them on something other than the log scale?

I am attempting GP regression with a variance-covariance function which is

based on the product of densities with a quadratic exponential correlation

function. In the case of the normal density some simplification is feasible

analytically but I would still need to evaluate the normal distribution

function.

Most of the libraries providing pdf are computing first lpdf and then exponentiating in the end (there might be rare exceptions where numerical accuracy can be better if not going through lpdf). In Stan if you call lpdf and exponentiate that in Stan code you will not get any overhead as the exponentiating is done in C++ anyway. So don’t worry about the overhead in your case.

Aki

Thanks Aki: very reassuring.

Jeremy

If `exp()`

is being called with something involving only data, then there’s only the overhead of the call to `exp()`

. If it’s done with parameters, there’s additional overdiff overhead.

But the larger concern is underflow/overflow if you take the probability functions off the log scale.

Check out the “Automatic Relevance Determination” section on GPs in the manual. Is something like that what you’re doing?

And if you really need a log of a sum, best to stay in the log scale anyway and use log_sum_exp: http://andrewgelman.com/2016/06/11/log-sum-of-exponentials/

If you don’t get this working, post the code you have and maybe someone can point out what’s wrong.

Hope that helps!

Thanks bbbales2 for the link to Bob Carpenter’s public service announcement. I am now coding, as you suggest, using log_sum_exp. Let’s see if that works.

Jeremy