LBA sampling: Stan vs particle MCMC vs annealed importance sampling


#1

Thought this newly-posted pre-print might interest some here.

The LBA (linear ballistic accumulator) model is very useful in the niche context of cognitive-science experiments using speeded decision tasks. Like the diffusion (a.k.a. Wiener) model, it lets you observe response times & response accuracies and make inferences on latent quantities like information-processing efficiency and speed-vs-accuracy bias. Unlike the diffusion model, the LBA can be extended to more than 2 decision options and (if I understand correctly) has a more tractable likelihood. Annis et al. (2017) showed how to implement a couple of simple LBA variants in Stan, including a hierarchical version, but the pre-print linked above claims easier parallelism. Wonder how that claim is going to fare when the Stan MPI stuff is ready?
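For anyone curious what "more tractable likelihood" means here: the LBA's first-passage density has a closed form in normal pdfs/cdfs (Brown & Heathcote's original derivation). A minimal sketch for a single accumulator, assuming the usual parameterization (my labels: start point uniform on [0, A], threshold b, drift normal with mean v and SD s) — not code from the pre-print or from Annis et al.:

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)

def _phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / SQRT_2PI

def _Phi(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def lba_pdf(t, A, b, v, s):
    """First-passage density at time t for one LBA accumulator:
    start point ~ Uniform(0, A), drift ~ Normal(v, s), threshold b."""
    z1 = (b - A - t * v) / (t * s)
    z2 = (b - t * v) / (t * s)
    return (1.0 / A) * (-v * _Phi(z1) + s * _phi(z1)
                        + v * _Phi(z2) - s * _phi(z2))

def lba_cdf(t, A, b, v, s):
    """First-passage cdf; lba_pdf is its derivative in t."""
    z1 = (b - A - t * v) / (t * s)
    z2 = (b - t * v) / (t * s)
    g = lambda z: z * _Phi(z) + _phi(z)  # antiderivative helper: g'(z) = Phi(z)
    return 1.0 + (t * s / A) * (g(z1) - g(z2))
```

That's the whole likelihood kernel — no infinite series like the Wiener first-passage density, which is a big part of why the model extends cleanly past two accumulators. (This sketch ignores the negative-drift truncation that full implementations handle.)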


#2

Yeah, that’s really interesting. I was just working through this recent paper: https://arxiv.org/abs/1806.09996, which references Stan quite a lot (but not the LBA, so it may be tangential to your interest). The researcher has created an R program to extract posterior draws from rstan and compute the DIC (deviance information criterion).


#3

Ready! CmdStan 2.18 was released a couple weeks ago. I don’t know if there’s a timeline for rstan or pystan 2.18.

Did they do any validation, such as simulation-based calibration, that their sampler actually got the right answers, including uncertainty intervals? Or did they at least simulate data and try to fit it? I saw some discussion of simulation on page 20, but they were only looking at means and seem to put a lot of faith in single-chain correlation measures, which we know aren’t a very good way to assess convergence in slowly mixing chains.
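For anyone unfamiliar with simulation-based calibration: draw parameters from the prior, simulate data from them, fit, and record the rank of each prior draw among the posterior draws — over many replications the ranks should be uniform. A toy sketch with a conjugate normal-normal model, where an exact posterior stands in for the sampler (all names and the toy model are mine):

```python
import random

def sbc_ranks(n_sims=2000, n_draws=99, seed=1):
    """SBC ranks for the toy model mu ~ N(0, 1), y | mu ~ N(mu, 1).
    The exact posterior N(y/2, sqrt(1/2)) stands in for a sampler."""
    rng = random.Random(seed)
    ranks = []
    for _ in range(n_sims):
        mu = rng.gauss(0.0, 1.0)    # draw from the prior
        y = rng.gauss(mu, 1.0)      # simulate one observation
        post = [rng.gauss(y / 2.0, 0.5 ** 0.5) for _ in range(n_draws)]
        ranks.append(sum(d < mu for d in post))  # rank in 0..n_draws
    return ranks
```

A correct sampler gives ranks uniform on {0, ..., n_draws}; a biased sampler skews the rank histogram, and over- or under-dispersed posteriors make it hump- or U-shaped. That's the kind of check that catches problems single-chain correlation diagnostics miss.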

As for the basic methods, we’ve evaluated them and are not interested:

  • Like other particle methods in non-conjugate settings, ter Braak’s differential evolution doesn’t scale well with dimension.

  • Importance sampling also fails to scale with dimension, because the draws land farther and farther from the posterior (unless for some reason you already know the posterior!).
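The second bullet is easy to demonstrate numerically: hold a slightly-too-wide proposal fixed and watch the effective sample size of plain importance sampling collapse as dimension grows. A toy sketch (the standard-normal target, N(0, sigma^2) proposal, and Kong-style ESS estimate are my choices for illustration):

```python
import math
import random

def is_ess_fraction(dim, n=5000, sigma=2.0, seed=0):
    """ESS / n for importance sampling a standard-normal target in `dim`
    dimensions, with an independent N(0, sigma^2) proposal per coordinate."""
    rng = random.Random(seed)
    log_w = []
    for _ in range(n):
        x = [rng.gauss(0.0, sigma) for _ in range(dim)]
        # log p(x) - log q(x), summed over independent coordinates
        lw = sum(-0.5 * xi * xi
                 + 0.5 * (xi / sigma) ** 2 + math.log(sigma) for xi in x)
        log_w.append(lw)
    m = max(log_w)                        # stabilize the exponentials
    w = [math.exp(l - m) for l in log_w]
    s, s2 = sum(w), sum(wi * wi for wi in w)
    return (s * s / s2) / n               # ESS estimate as a fraction of n
```

With sigma = 2 the retained fraction decays roughly geometrically per dimension (about 0.66 per coordinate for this target/proposal pair), so by a few dozen dimensions nearly all the weight sits on a handful of draws. And setting sigma = 1.0 — i.e., already knowing the posterior — keeps the fraction at 1 in any dimension, which is exactly the parenthetical’s point.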

I also don’t think our Wiener model code is highly optimized, so there are certainly gains to be made there.

We’re not exactly encouraging people to compute marginal data likelihoods for Bayes factors.