Improved Cook et. al. Method

@seantalts and I spoke a couple days ago concerning improvements to the model checking approach outlined in Cook, Gelman, Rubin(2006). On Sean’s suggestion and in order to keep the process transparent I am going to post our consensus thus far here.

Originally, spikes at 0 and 1 were noticed when using the CGR method. Our current hypothesis states that the spikes at the fringes are not solely restricted to 0 and 1 but instead represent a generally U shaped distribution in the histogram of Q’s. When the ordered set of quantiles is re-imagined as a q-q plot, this leads to a S shape (hypothetically a shallow sigmoid based on appearance but not proved).

I wrote a basic random walk metropolis to make sure this was not a caused by Stan itself as well as to magnify the effect in a closed environment and verified the results. Based on the efficiency of the Markov Chain and the number of draws the sigmoid takes on a varying shape as seen below.
Varied Sigma.pdf (753.7 KB)

I’m still working on more analysis, but it seems that both of the statistics I tried seem to decay exponentially based on number of draws of the Markov Chain.
K-S Sigma = 0.1.pdf (4.9 KB)
K-S Sigma = 0.3.pdf (4.9 KB)
Wasserstein Sigma = 0.1.pdf (4.9 KB)
Wasserstein Sigma = 0.3.pdf (4.9 KB)

(More to come, next post in this thread will be a discussion of the summary statistic)
Tagging @anon75146577 @betanalpha

Originally a Chi-Squared was used and p-values and z-values were then computed. Andrew expressed desire to move away from the p-values and overall improve the CGR method. The primary approach we have discussed is in some way measuring how far away the q distribution is away from the expected/desired distribution. Originally the approach was to use a histogram with 2*sqrt(n_eff) bins and visually examine the histogram.

A potential improvement is to transform the histogram via a q-q plot. The primary advantage of that is the fact that the ideal distribution to compare it to is not in fact totally uniform.

Potentially if a single number was desired a K-S or Wasserstein could be used.

(Pinned to compile ideas)

We want a single number AND a methodology for examining and digging deeper via graphs. Maybe a stretch goal would be a way to indicate the type of mismatch occurring.

I actually never was able to get that sigmoid shape with a qqplot - here’s what I tried:

discrete.dan = function(N_samp, n_eff, pl) {
   round(pnorm(rnorm(N_samp, sd=1 + runif(N_samp, -1/sqrt(n_eff), 1/sqrt(n_eff)))) * pl) / pl
qqplot(discrete.dan(1e4, 26, 1000), runif(1e4))

Am I doing something wrong?

Re: goals for this method - I think we want a single number AND a methodology for examining and digger deeper via graphs. And maybe a stretch goal would be a way to indicate the type of mismatch occurring.

Re: summary statistic ideas: @betanalpha also mentioned that we could take something that’s supposed to be uniform ish and transform it to a normal distribution via the inverse normal CDF and then use a normality test (which are supposedly much more sensitive than KS and Wasserstein). However, if we’re expecting something not quite uniform we could still use a technique to deal with that; ideally some function of the number of effective samples and maybe the discretization level (sometimes planck_length in prior posts).

The transformation could definitely work, but we would have to work to get around the problem of transforming discrete quantiles. When I was using the method a couple weeks ago with a Chi-Squared we found that the transformation was extremely sensitive to the quantile calculation method and binning. ie. If there was a single .999 value but it was binned to .995 that would have a large effect on the end metric distribution.

@seantalts Is the code you are using drawing the samples from an MCMC? It seems to me that the discrete.dan function is drawing independent samples.

This is the very first MCMC I wrote, i’ve generalized it since but this is easiest to follow.

proposal <-
      ##returns the proposal given the current state 
    prop = rnorm(1,currentPosition,sigma);
    ##prop = currentPosition + runif(1,-1*sigma,sigma)

likelihood <-
    li = dnorm(currentPosition);

quantile <-
    ##Computes the quantiles
  j = 0;
  for(i in 1:length(underlying)){
    if(sample > underlying[i]){
    j = j+1;
  quant = j/length(underlying)

sampleMH <- 
  function(sigma, L){
    ##function to generate a MCMC chain of length L 
  theta = vector(length = L);
  theta_star = 0;
  ## set the first theta to a value in the target distribution. Eliminates the need for warmup/burn
  theta[1] = rnorm(1, mean = 0, sd = 1);
  for(i in 2:L){
    ##loop to fill in the remaining values
    theta_star = proposal(theta[i-1],sigma)
    ##realization from proposal distribution
    if(runif(1, min = 0, max = 1) < (likelihood(theta_star)/likelihood(theta[i-1]))){
      ##decides whether to accept
      ##or reject

drawQuantileMH <-
  function(sigma, L){
    distribution = sampleMH(sigma, L);
    independent_sample = rnorm(1,0,1);
    quant = quantile(independent_sample, distribution);

Then plot an ordered set of drawQuantileMH() vs the index.

The proposal and likelihood can be changed, the code as is doesn’t preserve detailed balance for non-symmetric proposals so keep that in mind.

This is what Cook, Gelman, and Rubin did.

Translation for us non-physicists?