RStan ESS calculation from CmdStan files

sakrejda · May 9, 2017, 2:05pm

I ran a simulation study on a cluster, so I have hundreds of CmdStan output files. If I load them with read_stan_csv I only get one chain at a time, so I can’t calculate the multi-chain ESS/Rhat. Is there a way to do these calls using rstan? I got the single-chain Rhat, but that’s not all that meaningful.

bbbales2 · May 9, 2017, 2:08pm

+1, using my own custom buggy scripts now

sakrejda · May 9, 2017, 2:09pm

I don’t recall if the calculation is currently implemented at the interface level. It shouldn’t be hard to get it into the core; I just want to finish my simulation study first!

sakrejda · May 9, 2017, 2:19pm

Speaking of custom buggy scripts, can you share yours? I don’t feel like re-implementing my own custom buggy script right now.

syclik · May 9, 2017, 2:25pm

What’s wrong with stansummary from CmdStan? It also lets you save as CSV.

sakrejda · May 9, 2017, 2:28pm

Oh hey, I forgot about it because it crashes on really big models, but for the current case it’ll work. Thanks!

syclik · May 9, 2017, 2:58pm

We can rewrite it so it doesn’t crash. That was literally a 4-hour weekend coding project that hasn’t been touched since it was written. It was done really poorly, back when I didn’t know how to use Eigen.

sakrejda · May 9, 2017, 3:00pm

I didn’t mean to imply otherwise. Really, it should be a pretty thin wrapper, since we can push most of the meaningful calculations to stan::math.

syclik · May 9, 2017, 3:02pm

=). Didn’t read it that way. I just don’t like things crashing, and I remember writing that code in a haze and being happy it compiled. Last time I looked, it looked really bad, but I didn’t want to spend time fixing it.

bbbales2 · May 9, 2017, 3:04pm

Wow, I didn’t notice that stansummary took in multiple files…

Document reading skills -1

bbbales2 · May 9, 2017, 3:25pm

Lemme add to “custom buggy”: my scripts are also incomplete (only Rhat), slow (Python), and written for a slightly different format (from a custom HMC), haha. Not something I’d wish on the outside world.
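For anyone who only needs the multi-chain number, here is a minimal sketch of the classic (non-split) Gelman-Rubin R-hat for a single parameter, in Python with NumPy. This is not the script mentioned above, and it is not necessarily the exact estimator that RStan or stansummary report; the function name `rhat` and the `(n_chains, n_draws)` input layout are assumptions made for illustration.

```python
import numpy as np


def rhat(chains):
    """Classic (non-split) Gelman-Rubin R-hat for one parameter.

    chains: array-like of shape (n_chains, n_draws), post-warmup draws only.
    """
    chains = np.asarray(chains, dtype=float)
    if chains.ndim != 2 or chains.shape[0] < 2 or chains.shape[1] < 2:
        raise ValueError("need at least 2 chains with at least 2 draws each")
    n = chains.shape[1]

    chain_means = chains.mean(axis=1)
    chain_vars = chains.var(axis=1, ddof=1)

    between = n * chain_means.var(ddof=1)  # B: between-chain variance
    within = chain_vars.mean()             # W: mean within-chain variance

    # Pooled estimate of the posterior variance, then the ratio to W.
    var_plus = (n - 1) / n * within + between / n
    return float(np.sqrt(var_plus / within))
```

Chains that agree should give a value close to 1, while chains stuck in different places push it well above 1.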

sakrejda · May 9, 2017, 3:31pm

I’ve been trying to avoid doing stuff like that too but… deadlines… :)

sakrejda · May 9, 2017, 4:02pm

My life has become a series of “system(cmd)” calls held together with duct tape so I don’t judge. :P

betanalpha · May 9, 2017, 4:34pm

There’s not too much we can do without moving away from the FFT calculation of the effective sample sizes which will always require keeping all of the samples in memory.
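To make the memory point concrete, here is a minimal sketch of an FFT-based autocorrelation and a single-chain ESS, in Python with NumPy. The function names are hypothetical, and the truncation rule (stop summing at the first negative autocorrelation) is a simplification rather than the estimator Stan uses. The transform needs the full chain for a parameter in memory at once.

```python
import numpy as np


def autocorr_fft(x):
    """Autocorrelation of a 1-D chain at lags 0..n-1, computed via FFT."""
    x = np.asarray(x, dtype=float)
    n = x.size
    centered = x - x.mean()
    # Zero-pad to a power of two >= 2n - 1 so the circular convolution
    # computed by the FFT does not wrap around.
    size = 1 << (2 * n - 1).bit_length()
    freq = np.fft.rfft(centered, n=size)
    acov = np.fft.irfft(freq * np.conjugate(freq), n=size)[:n]
    return acov / acov[0]


def ess_single_chain(x):
    """Crude ESS: n / (1 + 2 * sum of leading non-negative autocorrelations)."""
    x = np.asarray(x, dtype=float)
    rho = autocorr_fft(x)
    negative = np.flatnonzero(rho[1:] < 0)
    # Index of the first negative autocorrelation, or the end of the chain.
    cutoff = negative[0] + 1 if negative.size else rho.size
    tau = 1.0 + 2.0 * rho[1:cutoff].sum()
    return x.size / tau
```

A strongly autocorrelated chain should come out with a much smaller ESS than an independent one of the same length.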

sakrejda · May 9, 2017, 5:03pm

We could certainly make the failure more graceful. Since the number of parameters affects whether the code fails, I believe we are currently loading all parameters at the same time, whereas it could go parameter by parameter (for everything except the global R-hat).
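A rough sketch of the parameter-by-parameter idea, in Python: pull a single column out of each CSV file so that only one parameter’s draws are held in memory at a time. `read_one_parameter` is a hypothetical helper, not part of any Stan interface. It assumes the usual CmdStan layout, where comment lines start with `#` and the first non-comment line is the header, and it does not separate out warmup draws if those were saved.

```python
import csv


def read_one_parameter(paths, name):
    """Return one list of draws per file for the column called `name`."""
    chains = []
    for path in paths:
        with open(path, newline="") as handle:
            # Skip blank lines and '#' comments (config, adaptation, timing).
            data_lines = (
                line for line in handle
                if line.strip() and not line.startswith("#")
            )
            rows = csv.reader(data_lines)
            header = next(rows)
            column = header.index(name)  # raises ValueError if name is absent
            chains.append([float(row[column]) for row in rows])
    return chains
```

The trade-off is that every file gets re-read once per parameter, so this swaps memory for I/O time.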

Maverick · May 9, 2017, 7:32pm

read_stan_csv can read multiple files. Why not use it? Did I miss anything?

jonah · May 9, 2017, 7:50pm

Like @Maverick said, read_stan_csv accepts multiple file names, so you can do multiple chains and then use RStan as you usually would with a stanfit object.

sakrejda · May 9, 2017, 9:00pm

Hey, that’s true, I just missed it completely. Oh well.

ariddell · May 11, 2017, 12:13am

It would be great to have something in stan::services for this, or for stansummary in general, so we don’t have 3 different implementations.

I’ll eventually get around to submitting a PR, but I wouldn’t complain if someone beat me to it!
