I implemented an RL (reinforcement learning) model in Stan. The model uses a TD-learning mechanism to compute a choice probability theta for each trial, and the observed choices are then modeled as y ~ bernoulli(theta).
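For concreteness, here is a minimal sketch of the kind of TD-learning choice model I mean, written as a simulator in Python. All names here (alpha, beta, Q) are illustrative assumptions about a generic two-option TD/softmax learner, not my exact Stan code:

```python
import numpy as np

def simulate_td_choices(rewards, alpha=0.3, beta=2.0, seed=0):
    """Simulate binary choices from a two-option TD learner.

    rewards: array of shape (n_trials, 2), reward available for each option.
    alpha:   learning rate; beta: inverse temperature of the softmax.
    Returns (theta, y): per-trial choice probabilities and sampled choices.
    """
    rng = np.random.default_rng(seed)
    n_trials = rewards.shape[0]
    Q = np.zeros(2)                      # action values, start neutral
    theta = np.empty(n_trials)           # P(choose option 1) per trial
    y = np.empty(n_trials, dtype=int)
    for t in range(n_trials):
        # softmax choice probability for option 1
        theta[t] = 1.0 / (1.0 + np.exp(-beta * (Q[1] - Q[0])))
        y[t] = rng.random() < theta[t]   # y ~ bernoulli(theta)
        # TD (delta-rule) update for the chosen option only
        Q[y[t]] += alpha * (rewards[t, y[t]] - Q[y[t]])
    return theta, y
```

In the Stan model, theta is computed by the same forward recursion inside `transformed parameters` (or the model block) and the choices enter the likelihood via `y ~ bernoulli(theta)`.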
MCMC diagnostics look good and the parameter estimates are reasonable. However, when I check the model with loo, I see some serious miscalibration:
My understanding is that the above indicates the model's predictive density is over-dispersed relative to the data. When I then look at the LOO-PIT QQ plot, I see this:
There are many LOO-PIT values of 1. I am not sure what to make of this. Has anyone encountered an anomaly like this before? Does anyone have any ideas about what could cause this?
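For reference, here is a sketch of how I understand a (non-randomized) PIT value to be computed for a single Bernoulli observation: PIT_i = P(y_rep <= y_i) under the predictive distribution, averaged over posterior draws. This is my own illustrative helper, not the implementation in the loo package:

```python
import numpy as np

def bernoulli_pit(y_obs, theta_draws):
    """Non-randomized PIT for one Bernoulli observation.

    y_obs:        the observed choice, 0 or 1.
    theta_draws:  posterior (ideally leave-one-out) draws of theta
                  for this trial, as a 1-D array.
    """
    # Bernoulli(theta) CDF evaluated at the observation:
    # F(0) = 1 - theta, F(1) = 1
    cdf = np.where(y_obs == 0, 1.0 - theta_draws, 1.0)
    return cdf.mean()
```

If my reading of the diagnostic is right, the CDF evaluated at y = 1 is always exactly 1, so I wonder whether the discreteness of the Bernoulli outcome is interacting badly with the PIT here.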
Thank you for your time…