Thinning and diagnostics

karimn · September 30, 2019, 4:04pm

Hi,

I’m trying to deal with some memory constraints caused by the large number of iterations I need. I’m running 4 chains with 4000 iterations each (2000 warmup/2000 sampling). I noticed that when I use “thin=2” I start seeing the warnings below, which don’t appear without thinning.

3: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess 
4: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess

I was expecting these diagnostic checks to be run before thinning. Is this wrong?

Thanks,
Karim

avehtari · September 30, 2019, 4:42pm

The thin option is used to reduce the number of draws saved (there can be cases where, e.g., a laptop can’t load all draws from very long chains into memory, and then thinning helps), and diagnostics are run only on the saved draws.

Are your draws taking too much space, or why are you thinning?
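
As a rough illustration of why the warnings can show up only with thinning, the arithmetic below counts the draws that remain for the settings in the question. It is a plain Python sketch, not rstan code; the helper name `saved_draws_per_chain` is made up for this example, and it assumes that thinning simply keeps every `thin`-th post-warmup draw.

```python
def saved_draws_per_chain(iterations, warmup, thin):
    """Approximate number of post-warmup draws kept per chain.

    Assumption: every `thin`-th post-warmup draw is kept, starting with the first.
    """
    post_warmup = iterations - warmup
    # Ceiling division: a partial final stride still contributes one draw.
    return -(-post_warmup // thin)


chains = 4
unthinned = saved_draws_per_chain(4000, 2000, thin=1)
thinned = saved_draws_per_chain(4000, 2000, thin=2)

print(unthinned * chains)  # total draws seen by the diagnostics without thinning
print(thinned * chains)    # total draws seen by the diagnostics with thin=2
```

Under that assumption, thin=2 halves the number of saved draws, and the diagnostics only ever see the saved half.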

karimn · September 30, 2019, 5:02pm

Yes, I’m running out of RAM when I extract the data to a data.frame. Would it be valid for me to fit the model without thinning and then thin the samples after extracting?

avehtari · September 30, 2019, 5:11pm

You can run diagnostics on the non-thinned chains and, if everything looks fine, thin and then check the ESS or MCSE relevant to the quantities of interest. That ESS warning uses quite a high ESS threshold to keep the convergence diagnostics more reliable.
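
The workflow described here (diagnose the full chains, then thin, then re-check) only needs a thinning step that preserves the iterations × chains layout. A minimal sketch in plain Python, assuming the draws for one scalar quantity are held as a list of rows with one value per chain; `thin_draws` is a hypothetical helper, not an rstan function:

```python
def thin_draws(draws, thin):
    """Keep every `thin`-th iteration of an iterations x chains table.

    `draws` is a list of rows, one row per iteration, one value per chain.
    """
    if thin < 1:
        raise ValueError("thin must be a positive integer")
    # Slicing along the iteration axis keeps all chains aligned.
    return draws[::thin]


# Toy data: 6 iterations, 2 chains (the values are placeholders, not real draws).
draws = [[0.1, 0.4], [0.2, 0.5], [0.3, 0.6], [0.7, 0.8], [0.9, 1.0], [1.1, 1.2]]
thinned = thin_draws(draws, thin=2)

print(len(draws), len(thinned))
```

The thinned table keeps the iterations × chains shape, so the same per-quantity ESS or MCSE check can be applied to it afterwards.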

karimn · September 30, 2019, 5:21pm

Everything looks fine before I thin. Is there a function in rstan that I can use to check ESS after I thin for the quantities I’m interested in?

avehtari · September 30, 2019, 5:55pm

See https://rdrr.io/cran/rstan/man/Rhat.html and https://rdrr.io/cran/rstan/man/monitor.html
There are more useful functions in monitor.R
See also https://github.com/avehtari/rhat_ess

karimn · September 30, 2019, 6:27pm

Sorry, quick question on ess_bulk and ess_tail: the generated quantity I’m interested in is a matrix. Can I pass them an (iterations × rows × columns) × chains array, or should I pass each cell of the matrix separately, i.e., as an iterations × chains array?

avehtari · September 30, 2019, 6:56pm

The rstan documentation page “Rhat: Convergence and efficiency diagnostics for Markov Chains” says that ess_bulk and ess_tail accept

A two-dimensional array whose rows are equal to the number of iterations of the Markov Chain(s) and whose columns are equal to the number of Markov Chains (preferably more than one).

and the page “monitor: Compute summaries of MCMC draws and monitor convergence” says that monitor accepts

A 3-D array (iterations * chains * parameters) of MCMC simulations from any MCMC algorithm.
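
Read together, the two quoted descriptions suggest treating each cell of a matrix-valued quantity as its own scalar: one iterations × chains table per cell for ess_bulk and ess_tail. A hedged sketch of that slicing in plain Python, assuming the draws are stored as nested lists indexed `[iteration][chain][row][column]` (a layout chosen for this example; it is not how rstan returns draws), with `cell_draws` as a hypothetical helper:

```python
def cell_draws(draws, row, col):
    """Return the iterations x chains table for one matrix cell."""
    return [[chain_value[row][col] for chain_value in iteration]
            for iteration in draws]


# Toy data: 3 iterations, 2 chains, a 2 x 2 matrix per draw.
# Each entry encodes its own position so the slicing is easy to check.
draws = [
    [[[100 * i + 10 * c + 2 * r + k for k in range(2)] for r in range(2)]
     for c in range(2)]
    for i in range(3)
]

top_right = cell_draws(draws, row=0, col=1)
print(top_right)
```

For monitor, going by the quoted description, the cells would instead be laid out along the third (parameters) dimension of a single array.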

karimn · September 30, 2019, 7:21pm

Awesome, thanks. I think I figured it out. It looks like my bulk and tail ESS are > 100, so thinning is working for these parameters.
