N_eff BDA3 vs. Stan

It looks like RStan doesn’t: rstan/rstan/src/chains.cpp. I didn’t realize
it wasn’t using Stan’s structure. I think we should consolidate code if we
can.

Thanks for your work on chains.hpp, the changes look great.

Getting the interfaces to use the same code is or should be a top priority.

I tried at one point to have PyStan use chains.hpp but it turned out to
be really slow.

The main difficulty is exposing some sort of function in the services
namespace that is easy and computationally convenient for the
interfaces to call. The function signature should involve only std data
structures, no Eigen data structures. You can, however, assume that any
interface interested in calling the chains functions will be able to
arrange the samples into contiguous memory. (The RStan people can
correct me if I’m wrong about this.)
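To make the "contiguous memory" assumption concrete from the interface side, here is a minimal Python sketch. It assumes NumPy, one scalar parameter, and a chain-major layout; the helper name `to_flat_buffer` and the layout are my own illustration, not anything agreed on for the services namespace.

```python
import numpy as np


def to_flat_buffer(draws):
    """Return (flat, n_chains, n_draws) with flat stored chain by chain.

    Hypothetical helper: the chain-major layout is an assumption made for
    this sketch, not a layout defined by Stan.
    """
    # Copies only if the input is not already a C-contiguous float64 array.
    arr = np.ascontiguousarray(draws, dtype=np.float64)
    n_chains, n_draws = arr.shape
    flat = arr.reshape(-1)  # a view, because arr is C-contiguous
    return flat, n_chains, n_draws


# A transposed array is a non-contiguous view, so this forces a real copy.
draws = np.arange(6.0).reshape(3, 2).T  # shape (2, 3)
flat, n_chains, n_draws = to_flat_buffer(draws)

# Draw t of chain c sits at index c * n_draws + t in the flat buffer.
value = flat[1 * n_draws + 2]
```

A buffer like this plus the two sizes is all that a function taking only std data structures (for example a pointer or std::vector of doubles and two integers) would need from the interface.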

Should I make pull requests for rstan/rstan/src/chains.cpp and pystan/pystan/_chains.pyx?

Yeah, although it may require more hacking to get it to work.

Why are there both effective_sample_size and effective_sample_size2? I can read the comment, but I don’t understand it. Is effective_sample_size2 really used?

Hey @avehtari,

I am trying to implement this computation in PyMC3.

I have a question for you. In your tests, what exactly is dimensions? Is it the length of the chains or number of chains or something else?

Thanks in advance!
Sharan

I don’t know which dimensions you are referring to. There is no such variable in the parts of the code I modified.

The “more hacking” seems to refer to the fact that it’s really difficult to build rstan. There aren’t many instructions, and even @bgoodri keeps guessing what I should do to get it built…

The PyStan build was easier (it took only a couple of hours to figure out how to install it and run it in place), but I haven’t been able to figure out how to run the tests or how to compare results with CmdStan (the same random seed doesn’t seem to give the same answer). I’m a novice in Python, so I’m now stuck, and I would appreciate help with loading the values of the chains from CSV files (one CSV file per chain) and running the ess function on those chains. @ariddell, can you help?
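For the CSV part, something along these lines might be enough. It is a minimal sketch using only the Python standard library, and it assumes the usual CmdStan output layout: comment lines starting with `#`, one header row, then one row per draw. The function names `read_stan_csv` and `stack_chains` are made up for this example and are not part of PyStan.

```python
import csv


def read_stan_csv(path):
    """Return {column_name: [float, ...]} for one CmdStan output file."""
    with open(path, newline="") as f:
        # Skip blank lines and the '#' comment lines that CmdStan writes
        # before, between, and after the draws.
        rows = csv.reader(
            line for line in f if line.strip() and not line.startswith("#")
        )
        header = next(rows)
        columns = {name: [] for name in header}
        for row in rows:
            for name, value in zip(header, row):
                columns[name].append(float(value))
    return columns


def stack_chains(paths, name):
    """Return one list of draws per chain for the column called `name`."""
    return [read_stan_csv(path)[name] for path in paths]
```

With the files mentioned in this thread, the call would look like `stack_chains(["blocker.1.csv", "blocker.2.csv"], name)`, where `name` is whichever column of the header you want to check. The resulting list of per-chain draws can then be passed to the ess function under test.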

If everything you need upstream has been merged into develop, try

source("https://raw.githubusercontent.com/stan-dev/rstan/develop/install_StanHeaders.R", echo = TRUE)

and then installing the modified rstan.

Where does this install those headers?

.libPaths()[1]

Now the CmdStan, RStan, and PyStan N_eff computations all use Geyer’s initial monotone sequence, and the results match to at least six digits when using exactly the same chains (blocker.1.csv and blocker.2.csv).
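For anyone reimplementing this elsewhere, here is a rough Python sketch of a multi-chain N_eff estimate with Geyer's initial monotone sequence truncation. It assumes NumPy and draws for a single scalar parameter. It is my own simplified reading of the approach, not a copy of chains.hpp, so do not expect it to reproduce the six-digit agreement mentioned above.

```python
import numpy as np


def autocovariance(x):
    """Biased autocovariance (divides by n) of a 1-D array at lags 0..n-1."""
    n = len(x)
    centered = x - x.mean()
    # Zero-pad to length 2n so the circular FFT product does not wrap around.
    f = np.fft.rfft(centered, 2 * n)
    return np.fft.irfft(f * np.conj(f), 2 * n)[:n] / n


def ess_geyer(draws):
    """Sketch of N_eff for draws of shape (n_chains, n_draws)."""
    draws = np.asarray(draws, dtype=float)
    m, n = draws.shape
    acov = np.array([autocovariance(chain) for chain in draws])
    chain_var = acov[:, 0] * n / (n - 1.0)
    mean_var = chain_var.mean()  # within-chain variance W
    var_plus = mean_var * (n - 1.0) / n
    if m > 1:
        var_plus += draws.mean(axis=1).var(ddof=1)  # between-chain part
    # Combined autocorrelation estimate at every lag.
    rho = 1.0 - (mean_var - acov.mean(axis=0)) / var_plus
    rho[0] = 1.0
    # Sums of adjacent pairs: rho[0] + rho[1], rho[2] + rho[3], ...
    n_pairs = n // 2
    pair_sums = rho[0:2 * n_pairs:2] + rho[1:2 * n_pairs:2]
    tau = -1.0
    smallest = np.inf
    for g in pair_sums:
        if g <= 0.0:  # initial positive sequence: stop at first non-positive pair
            break
        smallest = min(smallest, g)  # initial monotone sequence: never increase
        tau += 2.0 * smallest
    return m * n / tau


# Quick sanity check on simulated chains (not the blocker example).
rng = np.random.default_rng(0)
iid = rng.standard_normal((4, 1000))
noise = rng.standard_normal((4, 1000))
ar1 = np.empty_like(noise)
ar1[:, 0] = noise[:, 0]
for t in range(1, 1000):
    ar1[:, t] = 0.9 * ar1[:, t - 1] + noise[:, t]
ess_iid = ess_geyer(iid)
ess_ar1 = ess_geyer(ar1)
```

For independent draws the estimate should land near the total number of draws, and for the strongly autocorrelated AR(1) chains it should be far smaller.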

Based on this experience, it would be great if this kind of computational code were shared…