Obtain gradient of every target+=

I just implemented a function in R to compute subject-wise gradient contributions from my Stan models, which is useful for misspecification checks, but it would also be wonderful to get observation-wise gradients. I can’t see an easy way to do this outside of Stan because of the dependencies in the structure. Has anyone thought in this direction before? It would be a neat feature, I think…

Maybe code the model so that it can be configured to compute the log likelihood for only a specific data item?
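
A minimal sketch of that idea, in Python rather than Stan for brevity. The normal model, the `only` argument, and the finite-difference `grad` helper are all assumptions made for illustration and are not part of any Stan interface. It only gives sensible per-observation gradients when the rows are conditionally independent given the parameters.

```python
import math

def normal_lpdf(y, mu, sigma):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2

def log_lik(theta, y, only=None):
    """Log-likelihood; if `only` is an index, return just that row's term."""
    mu, log_sigma = theta
    sigma = math.exp(log_sigma)
    idx = range(len(y)) if only is None else [only]
    return sum(normal_lpdf(y[i], mu, sigma) for i in idx)

def grad(f, theta, eps=1e-6):
    """Central finite-difference gradient (a stand-in for autodiff)."""
    g = []
    for k in range(len(theta)):
        up = list(theta); up[k] += eps
        dn = list(theta); dn[k] -= eps
        g.append((f(up) - f(dn)) / (2 * eps))
    return g

y = [0.3, -1.2, 2.1, 0.8]   # made-up data
theta = [0.5, 0.0]          # (mu, log_sigma), made-up values

# One gradient evaluation per row: the inefficiency is the repeated calls.
scores = [grad(lambda th, i=i: log_lik(th, y, only=i), theta) for i in range(len(y))]
total = grad(lambda th: log_lik(th, y), theta)
```

Here `scores[i]` is the gradient contribution of row `i`, and for this independent-rows model the rows of `scores` add up to `total`.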

Is there a ref for what you are doing?

Hmm, true, that’s a possibility. Inefficient, but perhaps not too bad. I think there are lots of approaches that use the ‘score’ (gradient contribution) for various checks. Individual parameter contributions are one small body of work I have some connection to: https://www.tandfonline.com/doi/full/10.1080/10705511.2019.1667240?af=R

That is an interesting idea. On subject-wise vs. observation-wise gradients, I was involved in a paper where we considered this in the context of linear mixed models (LMMs). Not sure whether it is helpful, but see Sec. 3.1 here:

https://www.jstatsoft.org/article/view/v087c01

Four years on and I’m back to thinking about this. It really would be helpful for fast post-hoc misspecification checks and model enhancements. Has anything changed in the Stan backend that might make this easier to obtain? When there are dependencies in the dataset, obtaining the observation-wise gradient contributions at present actually seems to require that, for every row i, I compute the log-likelihood of the ‘almost full’ dataset (all rows except i) and subtract this from the full log-likelihood. I don’t think treating the rows individually works. Here’s one use case, though I’m more interested in time series.
https://psyarxiv.com/jw8xb/
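
To make the "full minus almost full" computation concrete, here is a rough Python sketch for a toy time series. Everything in it is an assumption for illustration: a latent AR(1) state observed with noise, parameters `(phi, log_q, log_r)` with `|phi| < 1`, a scalar Kalman filter that treats the dropped row as missing, and finite differences in place of autodiff.

```python
import math

def kalman_loglik(theta, y, skip=None):
    """Log-likelihood of y under x_t = phi*x_{t-1} + N(0,q), y_t = x_t + N(0,r).
    If `skip` is an index, that row is treated as missing (predict, no update)."""
    phi, log_q, log_r = theta
    q, r = math.exp(log_q), math.exp(log_r)
    m, p = 0.0, q / (1.0 - phi ** 2)      # stationary prior on the first state
    ll = 0.0
    for t, yt in enumerate(y):
        if t > 0:
            m, p = phi * m, phi ** 2 * p + q   # predict step
        if t == skip:
            continue                           # dropped row contributes nothing
        s = p + r                              # predictive variance of y_t
        ll += -0.5 * (math.log(2 * math.pi * s) + (yt - m) ** 2 / s)
        k = p / s                              # Kalman gain
        m, p = m + k * (yt - m), (1 - k) * p   # update step
    return ll

def grad(f, theta, eps=1e-6):
    """Central finite-difference gradient (a stand-in for autodiff)."""
    g = []
    for k in range(len(theta)):
        up = list(theta); up[k] += eps
        dn = list(theta); dn[k] -= eps
        g.append((f(up) - f(dn)) / (2 * eps))
    return g

def loo_scores(theta, y):
    """Gradient of [full log-lik minus log-lik without row i], for each row i."""
    out = []
    for i in range(len(y)):
        contrib = lambda th, i=i: kalman_loglik(th, y) - kalman_loglik(th, y, skip=i)
        out.append(grad(contrib, theta))
    return out

y = [0.4, 0.9, 0.1, -0.7, -0.3]                # made-up series
theta = [0.6, math.log(0.5), math.log(0.3)]    # made-up parameter values
scores = loo_scores(theta, y)
```

Each difference is the log density of row `i` given all the other rows, so with dependent data these contributions need not add up to the gradient of the full log-likelihood. The cost is one extra likelihood pass per row, per gradient evaluation.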

Not sure, but maybe the BridgeStan package could help?

Doesn’t look like it. Thinking about it, I guess it would have to be an internal Stan method that stores and outputs the array of gradients, one for each target+= call. I would be interested to hear from e.g. @Bob_Carpenter how hard such a thing might be to implement… ‘easy’ might motivate me to look at C++ again ;)
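
Roughly what I have in mind, as a toy Python mock-up. Nothing here is Stan's API: the `Target` class, the `model` function, and the labels are hypothetical, and finite differences stand in for Stan's reverse-mode autodiff.

```python
import math

def normal_lpdf(y, mu, sigma):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2

class Target:
    """Toy accumulator that keeps every increment separately instead of one sum."""
    def __init__(self):
        self.terms = []
    def add(self, label, value):       # plays the role of `target += value`
        self.terms.append((label, value))
    def total(self):
        return sum(v for _, v in self.terms)

def model(theta, y):
    """Hypothetical model: normal prior on mu, normal likelihood for each row."""
    mu, log_sigma = theta
    sigma = math.exp(log_sigma)
    target = Target()
    target.add("prior_mu", normal_lpdf(mu, 0.0, 10.0))
    for i, yi in enumerate(y):
        target.add(f"y[{i}]", normal_lpdf(yi, mu, sigma))
    return target

def per_term_gradients(theta, y, eps=1e-6):
    """Gradient of each recorded increment with respect to theta.
    Assumes the model emits the same terms in the same order for every theta."""
    labels = [lab for lab, _ in model(theta, y).terms]
    grads = {lab: [0.0] * len(theta) for lab in labels}
    for k in range(len(theta)):
        up = list(theta); up[k] += eps
        dn = list(theta); dn[k] -= eps
        for (lab, vu), (_, vd) in zip(model(up, y).terms, model(dn, y).terms):
            grads[lab][k] = (vu - vd) / (2 * eps)
    return grads

y = [0.3, -1.2, 2.1, 0.8]   # made-up data
theta = [0.5, 0.0]          # (mu, log_sigma), made-up values
grads = per_term_gradients(theta, y)
```

The output is one gradient vector per increment, and summing them recovers the gradient of the total. Whether something like this is cheap to do inside the real autodiff machinery is exactly the open question.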
