Please also provide the following information in addition to your question:
- Operating System: Windows 2016 Server
- rstanarm Version: 2.18.2
I have estimated a stan_lm object with 1.1 M observations and 41 predictors.
The posterior sample size is 16,000; I had to increase iter to get n_eff > 1000 for the log-posterior.
It was run on 4 chains.
In the RStudio environment, the stan_lm object is shown with a size of 1.5 GB.
I attempted to use shinystan.
Hang on... preparing graphical posterior predictive checks for rstanarm model.
See help('shinystan', 'rstanarm') for how to disable this feature.
Error: cannot allocate vector of size 130.9 Gb
Is this expected behavior to require this vector size to launch shinystan?
If you need 16,000 nominal draws to get an effective sample size of 1,000 for stan_lm, something is wrong. But yes, if you have 16,000 posterior predictions for each of 1.1 million observations, that is going to consume a lot of RAM.
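The reported allocation is roughly what a draws-by-observations matrix of doubles requires. A back-of-the-envelope check (using the round numbers from the question; the exact observation count is presumably a bit below 1.1 M, which would explain the slightly smaller figure in the error):

```r
# Posterior predictive matrix: one double (8 bytes) per draw per observation
draws <- 16000
obs   <- 1.1e6
draws * obs * 8 / 2^30   # ~131 GiB, consistent with the "130.9 Gb" in the error
```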
I appreciate your comment from last week.
As a novice at Bayesian modeling, I would appreciate guidance on what to explore when "something is wrong".
I checked for multicollinearity (car::vif). All generalized VIF were < 2.7; most were between 1 and 2.
The factor covariates had a reasonable number of counts at each level.
The Rhat values were all 1.0.
With the exception of the log-posterior (MCSE = 0.2), all MCSE values are 0.0.
Plotting variable traces showed good mixing.
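For reference, the checks above can be reproduced along these lines (a sketch; `fit` stands for the fitted stan_lm object, which is an assumption about the variable name):

```r
library(rstanarm)

summary(fit)                 # Rhat, n_eff, and MCSE for every parameter
plot(fit, plotfun = "trace") # trace plots to assess mixing across the 4 chains
```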
What else would you recommend?
If you want to use the rest of shinystan's functionality and are OK without the PPCs (you can do those separately with pp_check() or the bayesplot package), then you can use:
launch_shinystan(fit, ppd = FALSE)
and it should require much less memory. Although it will still be a decent amount given the number of posterior draws.
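If you do want graphical PPCs, one option is to run them on a random subsample of observations and a subset of draws, which keeps the predictive matrix small. A sketch, assuming `fit` is the stan_lm object (the subsample size and number of draws are arbitrary choices for illustration):

```r
library(rstanarm)

# Subsample 1,000 rows of the original data and use 500 posterior draws
dat  <- model.frame(fit)
idx  <- sample(nrow(dat), 1000)
yrep <- posterior_predict(fit, newdata = dat[idx, ], draws = 500)

# Compare observed vs. replicated outcome densities on the subsample
bayesplot::ppc_dens_overlay(y = get_y(fit)[idx], yrep = yrep)
```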