Is there a way to remove variables from rstan/cmdstanr objects?

I am fitting models with many thousands of variables (a hierarchical HMM with per-trial forward variables), many of which I am not interested in at the analysis stage. However, these “superfluous” variables make working with the model object cumbersome and slow. For example, saving a fitted model can produce files of several GB, and loading the object or extracting variables from it takes a long time.

Is there a way to “drop” or remove variables from a fitted model-object? I am using both rstan and cmdstanr (though I have come to prefer the latter), so I would be interested in solutions for both!
I know that variables can be kept out of the output by declaring them in other blocks, but that’s not an option here because I need these variables in the “generated quantities” block during sampling.

Operating System: Linux (Debian 6.3.0-18)
CmdStan Version: 2.26.1
Compiler/Toolkit: GCC 6.3.0 20170516

For CmdStanR we don’t have this quite yet, but I’d like to enable it. @rok_cesnovar should we just add a variables argument to as_cmdstan_fit() to allow creating the object with a subset of the variables from the CSVs? (Edit: I guess this would still keep all variables in the CSV files, so subsequent calls to draws() would need to use the variables argument to avoid reading in the remaining variables. We’d need to either eliminate the variables from the CSV files or add some functionality that prevents the undesired variables from being read in by later calls to draws().)
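
Until something like that exists, one possible workaround is to read only the variables of interest and store those draws instead of the whole fitted object. This is only a rough sketch: `fit` is assumed to be an existing CmdStanMCMC object, and the variable names "beta" and "sigma" and the file name are placeholders, not anything from the original model.

```r
library(cmdstanr)

# Assumption: `fit` is an existing CmdStanMCMC object whose CSV files
# are still on disk. "beta" and "sigma" are placeholder variable names.
keep <- c("beta", "sigma")

# Read only the variables of interest into memory.
draws_small <- fit$draws(variables = keep)

# Store the reduced draws rather than the full fitted object.
saveRDS(draws_small, "draws_small.rds")

# Alternatively, read the subset straight from the CSV files,
# skipping the remaining variables.
csv_small <- read_cmdstan_csv(fit$output_files(), variables = keep)
```

Note that the CSV files themselves still contain every variable, so this reduces what is held in memory and in the saved .rds file, not the size of the raw CmdStan output.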

For RStan: rstan::sampling() has the pars and include arguments that together let you control which variables are saved.
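
As a rough sketch of how that could look (the file name `model.stan`, the data list `stan_data`, and the variable name "alpha_fwd" are placeholders, not taken from the original model):

```r
library(rstan)

# Assumptions: "model.stan" and `stan_data` stand in for your own model
# file and data list; "alpha_fwd" stands in for the bulky per-trial variable.
mod <- stan_model("model.stan")

# With include = FALSE, `pars` acts as an exclusion list: alpha_fwd is
# still computed during sampling but is not stored in the fitted object.
fit <- sampling(mod, data = stan_data, pars = "alpha_fwd", include = FALSE)
```

With the default `include = TRUE`, `pars` instead lists the only variables to keep.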


If you already have a fitted object and want to reduce its size, you could look into some of the functionalities provided by the shredder package. I’m not sure if it already supports CmdStanR.


Great suggestions, thank you! The shredder package looks great, thanks for the tip!

Yeah. Let’s make an issue so we don’t forget.

Here’s the issue: “create fitted model object with subset of variables” (Issue #499, stan-dev/cmdstanr on GitHub)