Old math issue about `score()` that may be easily resolved

The “Add score function” issue (#461) says that we need nested autodiff, and when I search the repo it apparently is already there. Maybe @stevebronder or @syclik knows whether we can expose this.


I think the problem with a score function wouldn’t be nested autodiff; it would be higher-order autodiff.

A score function is the derivative of a log-likelihood with respect to the parameters, if I understand correctly, so that’s something like:

$$s = f(\theta) = \frac{\partial \, \mathrm{lp}}{\partial \theta}$$

And then if we’re gonna do reverse-mode autodiff on this, we’ll need to be able to compute $\bar{s} \, \nabla f$.

And gradients of $f$ are gonna be second derivatives of the function $\mathrm{lp}$. Or something like this?
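
To make the second-derivative point concrete, here is a small worked case. It is only a sketch under assumptions that are not in the thread: a normal likelihood with known $\sigma$, a scalar parameter $\theta$, and observations $y_1, \dots, y_n$.

```latex
\begin{aligned}
\mathrm{lp}(\theta) &= -\frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \theta)^2 + \text{const} \\
s = f(\theta) &= \frac{\partial \, \mathrm{lp}}{\partial \theta}
  = \frac{1}{\sigma^2} \sum_{i=1}^{n} (y_i - \theta) \\
\nabla f &= \frac{\partial^2 \, \mathrm{lp}}{\partial \theta^2} = -\frac{n}{\sigma^2} \\
\bar{\theta} &= \bar{s} \, \nabla f = -\bar{s} \, \frac{n}{\sigma^2}
\end{aligned}
```

So propagating an adjoint $\bar{s}$ back through the score already requires the second derivative of $\mathrm{lp}$, even in this simplest case.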

Edit: Fixed some problems with eqs

What if we only do this for generated quantities at first, as suggested in https://github.com/stan-dev/stan/issues/605#issuecomment-37674197? It would wrap `hessian()` and then call the log-likelihood for each observation.
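
A rough sketch of what a per-observation score restricted to generated quantities could look like, in Python rather than Stan Math. Everything here is hypothetical: `Dual`, `lift`, `normal_lpdf`, and `per_observation_scores` are toy names invented for illustration, not Stan functions, and the toy forward-mode number stands in for whatever autodiff Stan would actually use. The point is only that when the parameter is a plain number, one derivative of each pointwise log-likelihood is enough to get the score value itself.

```python
import math
from dataclasses import dataclass


@dataclass
class Dual:
    """Toy forward-mode autodiff number: a value and its derivative."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        other = lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __rsub__(self, other):
        # Handles `other - self` when `other` is a plain number.
        other = lift(other)
        return Dual(other.val - self.val, other.dot - self.dot)

    def __mul__(self, other):
        other = lift(other)
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def lift(x):
    """Promote a plain number to a Dual with zero derivative."""
    return x if isinstance(x, Dual) else Dual(float(x))


def normal_lpdf(y, mu, sigma):
    """Normal log density; mu may be a Dual, y and sigma are plain floats."""
    z = (y - mu) * (1.0 / sigma)
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) + -0.5 * (z * z)


def per_observation_scores(ys, mu, sigma):
    """d/d(mu) of the pointwise log-likelihood, one entry per observation."""
    return [normal_lpdf(y, Dual(mu, 1.0), sigma).dot for y in ys]


# Analytically each entry should equal (y - mu) / sigma**2.
scores = per_observation_scores([1.0, 2.0, 4.0], mu=2.0, sigma=2.0)
print(scores)
```

Differentiating *through* these scores (for example, using them in the model block) is the part that would need second derivatives.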

Ooof, the places I know of where score functions are used are in the likelihood, and the $\theta$ are parameters.

For now, if people want score functions they’ll have to derive them themselves, which isn’t great, but doing this would require either higher-order autodiff or limiting where it’s used, and both are pretty rough.
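
For anyone deriving a score by hand in the meantime, a cheap sanity check is to compare it against a finite difference of the log-likelihood. A minimal sketch, assuming a Poisson likelihood with rate `lam` (my choice of example, not from the issue); all function names are hypothetical helpers.

```python
import math


def poisson_lpmf(y, lam):
    """Poisson log probability mass of count y at rate lam."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)


def poisson_score(y, lam):
    """Hand-derived d/d(lam) of poisson_lpmf: y / lam - 1."""
    return y / lam - 1.0


def central_diff(f, x, h=1e-6):
    """Central finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


y, lam = 3, 2.0
analytic = poisson_score(y, lam)
numeric = central_diff(lambda rate: poisson_lpmf(y, rate), lam)
print(analytic, numeric)
```

If the two numbers disagree beyond finite-difference noise, the hand derivation is probably wrong.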

Deceptively complicated issue, this one :D.

Bringing this back up. Does the score function need higher-order autodiff or just the Hessian? We compute Hessians for the prim methods of the algebra solvers, so that would be okay, right?

A little OT, but this “Score” quantity is a new metric to me and has piqued my curiosity: Would I be correct in my understanding that it conveys a gradient that is likelihood-only/prior-agnostic whereas the gradient computed behind-the-scenes by Stan’s AD stuff when sampling reflects the topography implied by the combination of the likelihood and priors?
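
Taking the usual definition (score = gradient of the log-likelihood with respect to the parameters), the distinction asked about can be sketched with a toy model. The assumptions are mine: a normal likelihood with known `sigma` and a normal prior on `mu`, with hypothetical helper names. Since `mu` is unconstrained, there is no Jacobian term, which Stan would otherwise add for constrained parameters.

```python
def likelihood_score(ys, mu, sigma):
    """Gradient of the normal log-likelihood w.r.t. mu (prior-agnostic)."""
    return sum((y - mu) / sigma**2 for y in ys)


def log_prior_grad(mu, prior_mean, prior_sd):
    """Gradient of a normal log-prior on mu."""
    return -(mu - prior_mean) / prior_sd**2


def log_posterior_grad(ys, mu, sigma, prior_mean, prior_sd):
    """Gradient of log-likelihood plus log-prior: what a sampler would follow."""
    return likelihood_score(ys, mu, sigma) + log_prior_grad(mu, prior_mean, prior_sd)


ys = [1.0, 2.0, 4.0]
score = likelihood_score(ys, mu=2.0, sigma=1.0)
post_grad = log_posterior_grad(ys, mu=2.0, sigma=1.0, prior_mean=0.0, prior_sd=2.0)
print(score, post_grad)
```

In this toy case the two gradients differ exactly by the prior's contribution.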