SBC histogram interpretation

Here are the least uniformly distributed histograms from a simulation-based calibration (image attached).

The alpha parameters seem to exhibit a slight inverted-U shape, like Figure 5 in the SBC paper. These histograms aggregate ranks over 500 simulated datasets, and the pattern persists with a binning of 4 or 8. How should I interpret this?

@seantalts The SBC paper says that this pattern indicates that the data-averaged posterior is overdispersed relative to the prior, but that doesn’t make sense to me. In the attached code, alpha is generated from lognormal_rng(log(1) - 0.15^2/2.0, 0.15), but the prior for parameter recovery is set to lognormal(log(2) - 0.5^2/2.0, 0.5). Any thoughts?
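
Roughly, the mismatch looks like this (a simplified sketch with a placeholder likelihood, not the attached model):

```stan
// Sketch only: the simulator draws the "true" alpha from one lognormal,
// while the fitted model places a different, wider lognormal prior on alpha.
data {
  int<lower=0> N;
  vector<lower=0>[N] y;
}
parameters {
  real<lower=0> alpha;
}
model {
  // prior used for parameter recovery
  alpha ~ lognormal(log(2) - 0.5^2 / 2.0, 0.5);
  // placeholder likelihood, just to make the sketch self-contained
  y ~ lognormal(log(alpha), 1);
}
generated quantities {
  // for reference: the (different) distribution the simulation code
  // uses to draw the "true" alpha
  real alpha_sim = lognormal_rng(log(1) - 0.15^2 / 2.0, 0.15);
}
```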


So in that model you’re both generating the random draws from a prior and then fitting to them, right? I think it makes sense to me: your posterior is wider than your prior for all of the alphas (look at the sigma parameter of the lognormal; it’s larger in the model than in the data generation), hence calling it overdispersed relative to the prior. I might be missing something; can you give me some more context?

Re: Binning - ideally you’d aim for 1 bin per rank to avoid any binning artifacts whatsoever.

Yes

Oh! When you put it that way, yes, it makes sense.

Also, tangentially, I find Equation 1 from the SBC manuscript a bit confusing because it relies on definitions given inline in the preceding two paragraphs. I think it’s worth using numbered displayed equations and grouping all the math together for clarity. My other complaint is that you use the same notation for the prior for the ground-truth parameters, \tilde\theta \sim \pi(\theta), and for the data-averaged posterior \pi(\theta) defined in Equation 1. How about using \pi_1(\theta) for the former and \pi_2(\theta) for the latter, so that you can say that you expect \pi_1(\theta) = \pi_2(\theta)?
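
Concretely, assuming I’m reading Equation 1 as the data-averaging identity, the suggestion would look something like

\tilde\theta \sim \pi_1(\theta), \qquad \pi_2(\theta) = \int \mathrm{d}\tilde{y} \, \mathrm{d}\tilde{\theta} \; \pi(\theta \mid \tilde{y}) \, \pi(\tilde{y} \mid \tilde{\theta}) \, \pi_1(\tilde{\theta}),

and then the self-consistency condition reads \pi_2(\theta) = \pi_1(\theta).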


I like that - when we do a second revision I’ll make a note to incorporate that. Thanks!


Instead of the lognormal, I put an exponential(1.0) prior on alpha. The histogram is slightly improved (attached as sbc-exp).
This parameter is analogous to \alpha in edstan. If I can’t think of a better way to parameterize the model, then what? Put the histogram in the manuscript and say that alpha will be slightly biased toward zero?

Just to be clear, you are now trying to put the same prior on alpha in both the data-generating process and the model block, right?

No, I don’t see how that could work. Consider edstan, which has an exponential(0.1) prior on the scale of a normal distribution: \sigma \sim \text{Exp}(0.1) and \theta_j \sim \mathcal N(\dots, \sigma^2). This prior definitely doesn’t make sense for data generation, but it seems to do fine for parameter recovery. In an SBC context, aren’t we just validating a subset of the parameter space if we do this?
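
As a rough sketch of what I mean (a hypothetical simplification, not the actual edstan model):

```stan
// Sketch only: a weak exponential prior on the scale of a hierarchical normal.
// This prior works fine for parameter recovery, but it is arguably not a
// distribution you would want to simulate "true" scales from.
data {
  int<lower=1> J;
  vector[J] y;
}
parameters {
  vector[J] theta;
  real<lower=0> sigma;
}
model {
  sigma ~ exponential(0.1);   // prior on the scale
  theta ~ normal(0, sigma);   // group-level effects
  y ~ normal(theta, 1);       // placeholder likelihood for illustration
}
```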

I sense the assumption in the SBC paper that data generation and parameter recovery use the same priors, but isn’t this too limiting?

SBC calibrates the degree to which your algorithm can fit your model assuming your data generating process matches your model specification. It doesn’t sound like that’s what you’re looking for - what are you aiming to show? It sounds like you might be looking for tools like posterior predictive checks and loo.

When you say ‘algorithm’, surely you don’t just mean HMC, ADVI, or INLA? My impression was that SBC is also an important part of model development, that is, “a critical part of a robust Bayesian workflow.” I took this phrase to mean that we should use SBC when developing models. Certainly SBC should work if the data-generating process matches the model specification; if it doesn’t, there is a bug in the model specification. However, the use of SBC seems to extend beyond this narrow situation.

Here’s an example. My interpretation is that SBC verified that a subset of the parameter space is well calibrated.

Given that there are situations where the data generating process is somewhat different from the parameter recovery model, SBC seems like a useful procedure to investigate whether the posterior will be accurate or not. Why would SBC not help here?

Posterior predictive checks and loo are great once you actually have real data, but SBC seems like a useful procedure to validate your model before you have data.

Here are my two cents; @seantalts and @jpritikin, please feel free to kick my butt if I misspeak.

Nope. Only if the algorithm is working as expected, which is the point of running SBC: to find out whether it is. Here, “algorithm” can be understood as, for instance, (i) running your favourite MCMC to approximate the target and (ii) computing expectations and the associated MCSE. If there is substantial autocorrelation, for example, the rank histogram will be horn-shaped (Fig. 4), indicating that one needs to tweak the thinning step of the “algorithm”. This also helps to study the interaction of a given target with a given approximation algorithm, as for example when one has a funnel that prevents proper exploration of the space. This will also likely show up in the SBC histogram as a horn pattern (Fig. 6), suggesting under-dispersion of the approximated target relative to the “true” one.

I’m not sure that it won’t help, but my impression is that Theorem 1 in the paper only works out if the data-generating process draws the ground-truth parameters from the model’s prior.
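
Paraphrasing Theorem 1 from memory (please correct me if I misstate it): for a scalar parameter, if

\tilde\theta \sim \pi(\theta), \qquad \tilde{y} \sim \pi(y \mid \tilde\theta), \qquad \theta_1, \dots, \theta_L \text{ are exact draws from } \pi(\theta \mid \tilde{y}),

then the rank of \tilde\theta within \{\theta_1, \dots, \theta_L\} is uniformly distributed over \{0, 1, \dots, L\}. The very first step is exactly where the assumption that the data-generating prior matches the model’s prior enters.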


I do! I should have said inference algorithm. It’s important to see how well your inference algorithm can work with your model; for example, even with exactly the same DGP and model specification, you can see severe bias with HMC on the eight schools model.

If you wanted to verify that some parameter subspace is fit appropriately with your inference algorithm, you would change both the DGP and the model in lock-step to verify that. I’m not sure I understand your use case, though. It’s an interesting idea to see how well an incorrect model can recover a mis-specified DGP, but (1) if you had some better guess about the DGP, in most cases you would just use that as the model; and (2) if you had a lot of uncertainty about the DGP, you would basically attempt to widen your priors in the model.

One case where it would be interesting is, e.g., discrete parameters: you might have a DGP that generates quickly and uses discrete parameters (or some other computationally impossible part), and then you might want to calibrate how well a model without those nasty parts can recover the rest of the parameters. Is that what you want to do here? It didn’t seem like it, as the DGP you had written down looked, at the surface level, just as computationally feasible as the model block, but maybe I missed something.

When I said that “SBC should work,” I meant that the SBC procedure to test the algorithm would be an appropriate test. So I agree with your post, but it doesn’t seem very relevant to the point that I was trying to discuss.

Oh! Yes, some additional clarification about the scope of the SBC paper might be helpful to readers.

Sure, and isn’t an SBC-like test a marvelous way to check whether the parameter subspace is fit appropriately?

I’m not sure you, or others here, have the time to understand the models that I’ve been playing with recently. That’s why I put in the link to the trivial lognormal / normal example (see above). Without getting into too much detail, I just wanted to provide some vague justification for mismatch between the DGP and parameter recovery prior. I like your example with discrete parameters. That’s another case where a mismatch seems unavoidable. The question I’m interested in getting answered is whether SBC is a reasonable procedure for validating parameter subspaces of these kinds of models.


I think the key thing is just to make sure that you definitely can’t make your model match the DGP; you would always prefer to do that if it had reasonable computational properties. If not, we didn’t really intend the paper to cover this, as the use case seemed pretty rare, but I do think SBC can give a decent sense of how well your fits are recovering parameters. @maxbiostat is right that the proof doesn’t work if the DGP doesn’t match the model, so use it with caution.


Right – and it has to be the entire data generating process, including all priors.

Simulating data from one data-generating process and fitting with another can be a useful calibration procedure, but unlike SBC there are no guarantees on what behaviors to expect. For more, see Probabilistic Modeling and Statistical Inference.


Yes, of course. The trouble starts when this seems intractable.

@betanalpha Ah ha! I’m glad I was on the right track. I’ll cite your link.

@seantalts Regarding Figure 7 in the SBC paper, I still get confused reading the caption, “biased in the opposite direction relative to the prior distribution.” Can you give some example numbers so there is no risk of misunderstanding? Figure 7(b) has numbers along the x-axis. Is it correct to say that this histogram would result if the prior was centered at 5 and the data-averaged posterior was centered at 2? Why is this the most intuitive way to present the results? Wouldn’t it be easier to interpret if the comparison was reversed so the histogram showed the bias in the same direction instead of the opposite direction?


My thinking here is that you’re always computing the histogram of the rank of the prior draw within the posterior sample, so the resulting ranks always indicate the placement of the prior relative to the posterior. The histogram in 7(b) is showing that the prior is concentrating to the right of the posterior. I agree that the caption does not present that information in a way that is easy to learn.
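
Written out (for a scalar parameter, dropping the test function f used in the paper), the rank being histogrammed is

r\bigl(\{\theta_1, \dots, \theta_L\}, \tilde\theta\bigr) = \sum_{l=1}^{L} \mathbb{1}\!\left[\theta_l < \tilde\theta\right],

so large ranks mean the prior draw \tilde\theta lies above most of the posterior draws, i.e. the prior sits to the right of the posterior.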
