Shinystan with separate generated quantities

adamConnerSax · December 20, 2021, 6:48pm

I’m trying to build a workflow that runs a model once and then does post-stratification (or anything else in the generated quantities block) in a separate cmdstan run, using standalone generated quantities. I’ve got some related questions:

What’s produced in the csv by the standalone step? Does it replicate the parameter draws or just produce draws for the variables in the generated quantities block?
If only the GQ variables are produced, how does this work with shinystan? That is, how would I explore model parameters and GQ variables in the same shinystan session?
Do I have to do the standalone run once for each csv (chain) produced by my original run or is there some way to specify multiple “fitted_params” files to cmdstan to do it all at once? (I suspect the answer here is “no” since it’s related to my previous question about multiple input data files. If the parser can’t handle lists there, then it can’t handle them here either. Seems like another good case for it, though, since most users of standalone GQ probably want to use all the draws from all the chains…)

In general, any pointers to how people use this feature would be appreciated!

Bob_Carpenter · February 16, 2022, 10:49pm

These are all good questions and sorry they haven’t been answered yet. I think @mitzimorris should know the answers.

ShinyStan is just going to work with whatever parameter draws you give it, so that’ll depend on what standalone generated quantities produces.

adamConnerSax · February 16, 2022, 10:58pm

Thanks!
I’ve sorted it all out, though it was a bit of work. The standalone GQ run produces only the GQ variables, so I had to merge the parameter sampling csv files with the GQ csv files, which, the way I did it anyway, is somewhat memory intensive.
And you do have to run it once per sampled chain.
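As a rough illustration of the one-run-per-chain point, here is a minimal Python sketch that builds one standalone generated quantities command per sampled chain. The executable name, data file, and csv file names are hypothetical, as is the `_gq.csv` naming scheme; the argument layout follows the CmdStan `method=generate_quantities` command-line form, but check it against the documentation for your CmdStan version.

```python
from pathlib import Path


def gq_commands(model_exe, data_file, fitted_csvs):
    """Build one generate_quantities command per fitted-params csv (one per chain)."""
    commands = []
    for fitted in fitted_csvs:
        fitted = Path(fitted)
        # Hypothetical naming scheme: output_1.csv -> output_1_gq.csv
        out = fitted.with_name(fitted.stem + "_gq.csv")
        commands.append([
            str(model_exe),
            "method=generate_quantities",
            f"fitted_params={fitted}",
            "data", f"file={data_file}",
            "output", f"file={out}",
        ])
    return commands


# Hypothetical file names, for illustration only.
for cmd in gq_commands("./my_model", "data.json", ["output_1.csv", "output_2.csv"]):
    print(" ".join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

The commands are only printed here; passing each list to `subprocess.run` would execute them one chain at a time.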
Once it works, it’s a nice workflow: run the model once and then do all the log-likelihood calculations, posterior predictions, and post-stratifications afterward, and relatively quickly.
I still think it would be useful if cmdstan supported this directly since it’s a bit fussy in the details and lots of downstream tools won’t work otherwise. But I don’t know how hard that is or how many people need that feature.
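For anyone attempting the same merge, here is one possible sketch in Python. It is not the code used above; it is a streaming variant that appends the GQ columns to the sampler columns one draw at a time, so neither file is held in memory. It assumes both files are Stan csv files for the same chain with the same number of draws in the same order (for example, no warmup draws saved in one but not the other), and that `#` lines are comments. The function names are made up for illustration.

```python
import csv
from itertools import zip_longest


def data_rows(path):
    """Yield the header row and draw rows of a Stan csv, skipping '#' comments."""
    with open(path, newline="") as f:
        lines = (ln for ln in f if ln.strip() and not ln.startswith("#"))
        for row in csv.reader(lines):
            yield row


def merge_chain(sample_csv, gq_csv, merged_csv):
    """Write sampler columns plus GQ columns to merged_csv; return the draw count."""
    n_draws = 0
    keep = []
    with open(merged_csv, "w", newline="") as out:
        writer = csv.writer(out)
        pairs = zip_longest(data_rows(sample_csv), data_rows(gq_csv))
        for i, (s_row, g_row) in enumerate(pairs):
            if s_row is None or g_row is None:
                raise ValueError("files have different numbers of draws")
            if i == 0:
                # Header row: drop any GQ column name the sample file already has,
                # in case the GQ file repeats some columns.
                seen = set(s_row)
                keep = [j for j, name in enumerate(g_row) if name not in seen]
            else:
                n_draws += 1
            writer.writerow(s_row + [g_row[j] for j in keep])
    return n_draws
```

Calling `merge_chain("output_1.csv", "output_1_gq.csv", "merged_1.csv")` once per chain (file names hypothetical) yields one merged file per chain. Note that the comment lines, including the sampler configuration and adaptation info, are dropped, which some downstream tools may rely on.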

mitzimorris · March 3, 2022, 6:36pm

[quote="adamConnerSax, post:3, topic:25754"]
I still think it would be useful if cmdstan supported this directly since it’s a bit fussy in the details and lots of downstream tools won’t work otherwise. But I don’t know how hard that is or how many people need that feature.
[/quote]

For the record, it would be possible to produce merged output consisting of the sampler variable columns from the input draws plus the outputs of the write_array method. It would take a little work to add in the first set of columns. I think that shinystan et al. expect to see lp__ etc. Is this correct?

adamConnerSax · March 9, 2022, 2:04pm

I think so! But I only know a limited corner of the Stan tools space. Someone with more expertise should chime in! There’s shinystan and the various loo-related tools and lots more from what I can see…

Thanks for following up!
