Effective sample size discrepancy between loo and rstan packages

Apologies if this topic is in the wrong place.

I’ve been using a simple toy model to experiment with the rstan and loo packages (the most current version of each, as far as I’m aware), and I noticed what appears to be a discrepancy between the relative effective sample sizes I get from the summary method in rstan and the relative_eff function in loo. The toy model is as follows.

data {
  int<lower=0> N;
  vector[N] y;
}

parameters {
  real mu;
  real<lower=0> sigma;
}

model {
  y ~ normal(mu, sigma);
  mu ~ normal(-1, 1);
  sigma ~ cauchy(0, 1);
}

generated quantities {
  real lik[N];  // pointwise likelihood on the natural scale, for loo::relative_eff
  for (i in 1:N) {
    lik[i] = exp(normal_lpdf(y[i] | mu, sigma));
  }
}

The data are a small sample drawn with rnorm. Note that I am generating the likelihoods directly by exponentiating the log densities in the generated quantities block.

Creating the model and extracting the relative effective sample sizes from the stanfit method gives me one set of results:

stan_runs = stan(file = 'code/smallstantest.stan', data = list(y = y, N = N), verbose = TRUE, iter = 500, save_warmup=FALSE)
summary(stan_runs, pars = c('lik'))$summary[,'n_eff']/1000 # 250 post-warmup iterations * 4 chains = 1000 draws

 0.4850813   0.7350164   0.5353286   0.7060084   0.7548567   0.7030208   0.8220460   0.5953139   0.4300723   0.7909692
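To make sure I’m comparing apples to apples, my understanding of the quantity being reported is: relative efficiency = effective sample size divided by the total number of post-warmup draws. As a hypothetical illustration (this is my own Python sketch, not rstan or loo internals), for a stationary AR(1) chain with lag-1 autocorrelation rho, the integrated autocorrelation time is (1 + rho)/(1 − rho), so the relative efficiency depends only on rho:

```python
# Hypothetical illustration, not rstan/loo internals: for a stationary
# AR(1) chain with lag-1 autocorrelation rho, the integrated
# autocorrelation time is (1 + rho) / (1 - rho), so
#   ESS ~= N * (1 - rho) / (1 + rho)
# and the relative efficiency ESS / N is (1 - rho) / (1 + rho).
def relative_efficiency_ar1(rho):
    """Relative effective sample size of an ideal AR(1) chain."""
    return (1.0 - rho) / (1.0 + rho)

# Independent draws are fully efficient; positive autocorrelation
# shrinks the effective sample size.
print(relative_efficiency_ar1(0.0))  # 1.0
print(relative_efficiency_ar1(0.5))  # ~0.333
```

So both sets of numbers above are plausible relative efficiencies; the question is why the two estimators disagree on the same draws.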

But using the relative_eff function provided by loo gives me slightly different (though broadly similar) results:

relative_eff(extract(stan_runs, pars = c('lik'), permuted=FALSE))

0.4685559 0.7308909 0.5181340 0.7051706 0.7429508 0.7117508 0.7870215 0.5970975 0.4180794 0.7728746

What is the loo package doing differently? Looking through the source code didn’t reveal any obvious differences in methods. Is it some kind of warmup thing I’m missing? Perhaps something to do with permutation of chains? And to the extent that there is a difference, which is “correct”?

Any help is greatly appreciated.


Update: there appear to be two differences:

  1. The autocovariance in the rstan function is calculated using an FFT, while the loo package simply uses the acf function from R’s stats package.

  2. The rstan function (https://github.com/stan-dev/rstan/blob/develop/rstan/rstan/R/monitor.R) uses Geyer’s initial sequences to sum up the autocorrelations, while the loo function (https://github.com/stan-dev/loo/blob/master/R/effective_sample_sizes.R) does not.
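To convince myself that the second difference matters, I sketched both truncation rules in Python. This is a simplified single-chain illustration under my own assumptions (the function names are mine, and the real implementations also pool variance across chains and compute the autocovariance via FFT), not the actual rstan or loo code:

```python
import numpy as np

def autocorrelations(x):
    """Sample autocorrelation at every lag, by direct summation."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    var = np.dot(xc, xc) / n
    return np.array([np.dot(xc[: n - k], xc[k:]) / (n * var) for k in range(n)])

def ess_plain_cutoff(x, max_lag):
    """Sum autocorrelations up to a fixed lag: roughly what a plain
    acf-based estimator does without an adaptive stopping rule."""
    rho = autocorrelations(x)
    tau = 1.0 + 2.0 * rho[1 : max_lag + 1].sum()
    return len(x) / tau

def ess_geyer(x):
    """Geyer's initial positive sequence: accumulate the pair sums
    rho[2m] + rho[2m+1] only while they remain positive."""
    rho = autocorrelations(x)
    tau = -1.0
    m = 0
    while 2 * m + 1 < len(rho):
        pair = rho[2 * m] + rho[2 * m + 1]
        if pair <= 0.0:
            break
        tau += 2.0 * pair
        m += 1
    return len(x) / tau

# An AR(1) chain with positive autocorrelation: the two truncation
# rules generally give (slightly) different effective sample sizes.
rng = np.random.default_rng(1)
chain = np.empty(2000)
chain[0] = rng.standard_normal()
for t in range(1, len(chain)):
    chain[t] = 0.7 * chain[t - 1] + rng.standard_normal()
print(ess_plain_cutoff(chain, max_lag=50), ess_geyer(chain))
```

On draws like these the two estimates land in the same ballpark but rarely coincide, which matches the small discrepancies I’m seeing between the two packages.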

Is there anyone experienced with both of these packages who can share some insight on the reasons for these different methods, and the impact of those differences?

Did you have a look at this thread?

Thanks for the report @ShaunMcD !

This is my fault. We have an open issue for this (https://github.com/stan-dev/loo/issues/85), but I forgot to update loo after this was fixed in monitor.R. I’ll fix it tomorrow by copying the updated code from monitor.R, but it will take some time before the next CRAN release of the loo package (we should fix some other open issues, too).

There’s a (longish) discussion of the differences in the thread “N_eff BDA3 vs. Stan”.


Many thanks for the clarification to both of you!
I see that the relevant pull request has now been merged. :)