Cook et al spike at 0

seantalts · July 24, 2017, 9:39pm

Hey all,

I’ve been doing the Cook, Gelman, and Rubin software validation method to jointly validate my data generating and fitting models and I noticed that in many cases I was seeing a spike at 0 on the histogram of sum(posterior_thetas <= theta0)/length(posterior_thetas) which should look like a uniform(0, 1) distribution. I discovered some bugs in my models, but eventually realized that to make that spike go away I had to do two separate things:

replace <= with > in the above sum
In places where I was generating a positive value (typically for scale parameters), it was not sufficient to do e.g. fabs(normal_rng(0, 5)), instead I had to do a while loop and iterate until drawing a value greater than or equal to zero.

#1 above was implemented in R, so I suspect some kind of weird R-specific rounding or coercion but would love to know what is happening. #2 above was implemented in Stan and when I attempted to test the distributions of random numbers drawn with both techniques they seemed to line up within 2 decimal places for quantiles and to have extremely similar histograms at several levels of granularity.

Any ideas?
cc @betanalpha and @jonah who have been helping me with this so far.

anon75146577 · July 24, 2017, 9:51pm

There isn’t enough info here to say something concrete (what’s the model etc)

But I can tell you that fabs(normal_rng(0,1)) has a very different distribution to

sample = -1;while (sample <0) { sample = normal_rng(0,1)}

The first is a "folded normal" while the second is a truncated normal (ie a normal conditioned on the event that every sample is positive).

So which one do you actually want to use?

anon75146577 · July 24, 2017, 9:53pm

Actually I’m wrong. In the special case where it’s a N(0,1), the two coincide. Nevertheless, which one do you want?

betanalpha · July 24, 2017, 10:19pm

Right, for a zero mean they are exactly the same for Stan as the different normalizing constants don’t effect sampling. But to be pedantic we want a truncated normal here.

If you compare fabs(normal_rng(0, 1)) to a while loop that checks while (x <= 0) then Cook et all comes out fine. Which means the only difference is at x = 0, which should be vanishingly small for double precision floating point. It’s almost as if there’s a loss of precision somewhere that causes a nontrivial excess near zero.

There’s a similar problem with the comparison in R – the cases where < and <= are different should be extremely rare in double precision floating point, yet we’re seeing 1% effects in the Cook et al histograms.

anon75146577 · July 24, 2017, 10:20pm

is the code anywhere I can see?

anon75146577 · July 24, 2017, 10:38pm

Does this happen when you only use the parameter on the internal scale
(which I guess is a log transform?). That is, don’t use the constrained
"natural scale" parameter just the inconstrained one.

anon75146577 · July 24, 2017, 11:04pm

Also. Have you checked the values at which < and <= are different. It might not be R.

seantalts · July 25, 2017, 1:50pm

Hey Dan! Thanks for taking a look. I put up all the code here for now.
The above weirdness happens for a few different models; you can see the entry point in R for a simple 1D linear regression here, which ends up using this to generate data and this to fit it. The offending <= (now >) is here.

I’m kind of new - how would one go about using the parameter on the unconstrained scale? Isn’t all the math in a Stan program implicitly done on the constrained scale?

Let me try to see the set difference between <= and = and get back to you. Thanks again.

seantalts · July 25, 2017, 2:15pm

Actually it was <= vs >; I suspect that since <= was causing a spike at 0th percentile (for theta0 among new thetas sampled from the posterior) that changing it to < would not make a big difference - it was already reporting that none of them were <=, so also none of them would be <, right?

anon75146577 · July 25, 2017, 3:51pm

I foolishly ran your code and it may never finish…

So I’m just going to ask more questions:

Does this only happen for constrained parameters? (If so, it indicates that the problem is on Stan’s end, not on R’s)
Can you hack together a problem with an analytic form for the posterior (probably N(mu, sigma) with easy priors) and do the test for exact draws from the posterior in R? Does that show the problem?

seantalts · July 25, 2017, 3:57pm

Sorry, now I’ve commented out all the irrelevant stuff. The linear regression thing should finish in a few minutes by itself.

I’m not sure how to do either of those two other things… maybe Michael can point me in the right direction today.

anon75146577 · July 25, 2017, 4:02pm

The models for the normal are here. With conjugate priors and the associated posterior

It’s probably enough to do both unknown mean, known variances (unconstrained parameter) and unknown variance, known mean (constrained parameter).

The end of the doc shows how to do it jointly.

seantalts · July 27, 2017, 2:57pm

Okay, so on your and Michael’s suggestion I made a new linear regression model with a known variance (gen model, fit model) and generated the attached PDF,
constant_sigma_fits.pdf (36.5 KB),
showing 3 pages of increasingly granular histograms for the parameters using <= (“lte”) and > (gt). Then another 3 pages after with a new random seed, to show that some artifacts around 0 stick around. I’m really not sure what to read into this - it doesn’t actually look like switching to > helped the problem, nor that the while vs fabs thing was the only issue.

Any thoughts based on this? I can go and do the analytic posterior vs. stan fit posterior next if that might shed additional light.

Bob_Carpenter · July 27, 2017, 2:59pm

These things are hard to do by eye. What’d the statistical tests say?

Are you checking that you have convergence and reasonable n_eff in all these runs?

seantalts · July 27, 2017, 5:23pm

As we talked about in the meeting, the statistical tests don’t really examine the fact that the spikes are often at 0, but they’d probably pass in these cases.

I made two more pdfs of charts that I think might show that the fabs issue was likely a red herring - there’s a linear regression with while loop (lin_regr, model) and with fabs (lin_regr_fabs, model)
lin_regr_fabs.pdf (23.7 KB)
lin_regr.pdf (24.0 KB)
(<= always on the left, > always on the left in these)

You can see they aren’t the same, especially on the most granular histograms, but they might be close enough? I don’t have a good intuition for this.

It also seems like the issue exists roughly equally in both > and <= quantiles.

The spikes aren’t that weird by themselves, but the fact that one shows up at 0 in like 7 out of 8 histograms of appropriate granularity across seeds and various ways of getting sigma seems weird to me…

betanalpha · July 27, 2017, 6:12pm

Those models should be the same because you’re using the while loop with <=. The discrepancy arose when you compared the fabs implementation against the while loop with just <.

seantalts · July 27, 2017, 6:26pm

fabs should be even more the same as < 0 right? Either way, I re-ran and here’s the new one:

lin_regr_lt.pdf (23.9 KB)

anon75146577 · July 27, 2017, 6:28pm

Use INLA::inla.ks.plot( data, punif) where data is whatever you’d draw the histogram of.

This will give a plot where the p-value from the KS test is displayed as well as the empirical process* that the test is based on. This should not have long sojourns outside the circle. It gives more information about where the problem is, but is considerably easier to read than a histogram.

*(It’s a plot of |ecdf - punif|/sqrt(n), which converges to a Brownian Bridge as n-> infintiy if the uniform distribution is correct (here ecdf is the empirical CDF).

betanalpha · July 27, 2017, 6:39pm

No, fabs would admit zero so is equivalent to the while loops with <= 0. I thought that you had done some experiments verifying this, but those results may have been confounded with other stuff.

seantalts · July 27, 2017, 8:02pm

As you can see from these pdfs, the effect is not always there on every graph; I think that was confusing me as I was only doing a small subset of these graphs when testing this manually before. Your advice to do this more exhaustively was 👌🏻. Sorry for the confusion.

(Also I don’t think this matters, but fabs and while (sigma < 0) sigma = normal_rng(0, 5); admit 0s, while (sigma <= 0) will retry on 0.)

Topic		Replies	Views
Improved Cook et. al. Method General	7	788	August 17, 2017
[Re-Post] Rejecting Initial Value (Student_t distribution) Modeling	8	484	June 20, 2018
Negative Binomial Overflow during warmup Modeling rstan	4	539	August 31, 2020
PSA: using stantargets for SBC(-esque) checks, & big speedups for big gaussian likelihoods Publicity techniques , specification , performance	1	973	May 30, 2021
Negative_binomial_2: should i be worried about metropolis proposal rejection in warmup phase? Modeling	2	1240	August 28, 2018

Cook et al spike at 0

Related topics