Reducing `stanfit` object size in `rstan` with `shredder::stan_axe`

stanfit objects hold a lot of information, and some of the elements that take up most of the memory are not necessarily needed in post-processing. Using shredder::stan_axe, you can remove those elements and reduce the size of the stanfit object.

Vignette: https://metrumresearchgroup.github.io/shredder/articles/axe.html
Package GitHub repository: https://github.com/metrumresearchgroup/shredder

Feedback/Comments/Suggestions always welcome.

Example

library(shredder)
library(butcher)
rats <- rats_example(nCores = 1)
Model Script
S4 class stanmodel 'rats' coded as follows:
data {
  int<lower=0> N;
  int<lower=0> T;
  real x[T];
  real y[N,T];
  real xbar;
}
parameters {
  real alpha[N];
  real beta[N];

  real mu_alpha;
  real mu_beta;          // beta.c in original bugs model

  real<lower=0> sigmasq_y;
  real<lower=0> sigmasq_alpha;
  real<lower=0> sigmasq_beta;
}
transformed parameters {
  real<lower=0> sigma_y;       // sigma in original bugs model
  real<lower=0> sigma_alpha;
  real<lower=0> sigma_beta;

  sigma_y = sqrt(sigmasq_y);
  sigma_alpha = sqrt(sigmasq_alpha);
  sigma_beta = sqrt(sigmasq_beta);
}
model {
  mu_alpha ~ normal(0, 100);
  mu_beta ~ normal(0, 100);
  sigmasq_y ~ inv_gamma(0.001, 0.001);
  sigmasq_alpha ~ inv_gamma(0.001, 0.001);
  sigmasq_beta ~ inv_gamma(0.001, 0.001);
  alpha ~ normal(mu_alpha, sigma_alpha); // vectorized
  beta ~ normal(mu_beta, sigma_beta);  // vectorized
  for (n in 1:N)
    for (t in 1:T) 
      y[n,t] ~ normal(alpha[n] + beta[n] * (x[t] - xbar), sigma_y);

}
generated quantities {
  real alpha0;
  alpha0 = mu_alpha - xbar * mu_beta;
} 

We use the butcher package to evaluate the size of each element.


rats%>%
  attributes()%>%
  butcher::weigh(units = 'MB')
#> # A tibble: 404 x 2
#>    object                 size
#>    <chr>                 <dbl>
#>  1 stanmodel            3.71  
#>  2 .MISC                1.66  
#>  3 sim.samples.alpha[1] 0.0160
#>  4 sim.samples.alpha[2] 0.0160
#>  5 sim.samples.alpha[3] 0.0160
#>  6 sim.samples.alpha[4] 0.0160
#>  7 sim.samples.alpha[5] 0.0160
#>  8 sim.samples.alpha[6] 0.0160
#>  9 sim.samples.alpha[7] 0.0160
#> 10 sim.samples.alpha[8] 0.0160
#> # … with 394 more rows
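
If butcher is not available, a rough per-slot breakdown can be sketched with base R alone. This is only an illustration: ToyFit is a hypothetical stand-in class, not a real stanfit, and unlike butcher::weigh the sketch does not descend into nested list elements. Also note that object.size() does not count the contents of environments, so it can under-report slots that are environments.

```r
library(methods)

# Hypothetical toy S4 class standing in for a fitted-model object
setClass("ToyFit", representation(code = "character", draws = "list"))

toy <- new("ToyFit",
           code  = "parameters { real mu; }",
           draws = list(mu = rnorm(1000), sigma = rnorm(1000)))

# Size of each top-level slot in bytes
slot_sizes <- vapply(slotNames(toy),
                     function(s) as.numeric(object.size(slot(toy, s))),
                     numeric(1))

# Largest slots first, converted to megabytes (bytes / 1e6)
print(sort(slot_sizes / 1e6, decreasing = TRUE))
```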

stan_axe

shredder can remove three elements from the stanfit object:

  • The cached fit summary stored in fit@.MISC$summary
  • The cached C++ object stored in fit@.MISC$stan_fit_instance
  • The stanmodel stored in fit@stanmodel
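# Helper: summarise the element sizes (in MB) reported by butcher::weigh().
# Note: defining this in the global environment masks base::summary().
summary <- function(x) {
  y <- x %>%
    attributes() %>%
    butcher::weigh()

  s <- y$size

  data.frame(min = min(s),
             max = max(s),
             mean = mean(s),
             sd = sd(s),
             sum = sum(s))
}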
# Initial stanfit object
butcher_rats <- rats%>%
  summary()

# Remove fit_instance
butcher_fit_instance <- rats%>%
  stan_axe(what = 'fit_instance')%>%
  summary()

# Remove fit_instance, stanmodel
butcher_stanmodel <- rats%>%
  stan_axe(what = 'fit_instance')%>%
  stan_axe(what = 'stanmodel')%>%
  summary()

# Remove fit_instance, stanmodel, summary
butcher_summary <- rats%>%
  stan_axe(what = 'fit_instance')%>%
  stan_axe(what = 'stanmodel')%>%
  stan_axe(what = 'summary')%>%
  summary()

# Remove fit_instance, stanmodel, summary
# Keep only parameters of alpha
butcher_params <- rats%>%
  stan_axe(what = 'fit_instance')%>%
  stan_axe(what = 'stanmodel')%>%
  stan_axe(what = 'summary')%>%
  stan_select(alpha)%>%
  summary()
tbl <- purrr::map_df(
  list('full' = butcher_rats,
       'fit_instance' = butcher_fit_instance,
       'stanmodel' = butcher_stanmodel,
       'summary' = butcher_summary,
       'param' = butcher_params),
  identity,.id='axe')
axe           min (MB)  max (MB)  mean (MB)  sd (MB)    sum (MB)
full          4.8e-05   3.711536  0.0244919  0.2014100  9.894712
fit_instance  4.8e-05   3.711536  0.0205039  0.1842420  8.283592
stanmodel     4.8e-05   0.045000  0.0113451  0.0074580  4.572056
summary       4.8e-05   0.016048  0.0112342  0.0072865  4.527392
param         5.6e-05   0.016048  0.0108903  0.0073964  2.014712
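
To make the savings easier to read, the totals can be turned into reductions relative to the full object. The sketch below only re-uses the numbers from the sum (MB) column above; the names total_mb and reduction are introduced here for illustration.

```r
# Totals copied from the "sum (MB)" column of the table above
total_mb <- c(full         = 9.894712,
              fit_instance = 8.283592,
              stanmodel    = 4.572056,
              summary      = 4.527392,
              param        = 2.014712)

# Absolute and relative savings compared with the untouched stanfit object
reduction <- data.frame(
  axe       = names(total_mb),
  sum_mb    = unname(total_mb),
  saved_mb  = unname(total_mb["full"] - total_mb),
  pct_saved = unname(round(100 * (1 - total_mb / total_mb["full"]), 1))
)

print(reduction)
```

On these numbers, removing all three elements saves roughly 54% of the total, with the stanmodel accounting for most of that, and additionally keeping only alpha with stan_select brings the saving to roughly 80%.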

Nice!

If it were up to me and I didn’t need to worry about following R programming idioms found in built-ins like glm(), the output of Stan would consist of an immutable matrix of parameter and diagnostic values and the metric/step size resulting from adaptation. But I’m told R users expect their outputs to contain everything in the input and all intermediate products and that they expect such objects to all be mutable.


For better or for worse, S3 methods like plot/summary/predict/print wouldn’t work without the large variety of info in the fit object.

butcher/shredder are a bit anti-method in that sense.