I am testing a hypothesis about the distribution of distances between transcription factor binding sites and genes in the human genome. I have a model that fits well for several binding motifs I picked at random, and now I want to test whether it fits all the motifs I am interested in (several hundred). During development, I checked the fits by hand, visually inspecting posterior predictive check (PPC) plots for the density and a few summary statistics (median, max, min, IQR). Now I need to automate these checks.

The model for each motif has 5 parameters, while there are 5e3 - 1e5 datapoints per motif, so fitting a hierarchical model seems like overkill; I therefore fit each motif separately, which has a reasonable running time.

So my idea is that for each motif I do a PPC of ~50 quantiles and then look at where the observed quantiles of the data fall within the corresponding PPC quantile distributions. Then I check whether there are

a) quantile levels where the distribution of these values across motifs is non-uniform

b) motifs where the distribution of these values across quantile levels is non-uniform

I might even calculate some p-values on that (whoa)…
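To make the idea concrete, here is a minimal sketch of what I have in mind for a single motif, assuming a matrix of posterior-predictive datasets is already available (the function name, array shapes, and the toy normal model are all my own placeholders, not a real pipeline):

```python
# Hypothetical sketch: PIT-style check of observed quantiles against
# posterior-predictive quantiles for one motif. All names/shapes are
# illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def quantile_pit(y_obs, y_rep, probs):
    """For each quantile level p, return the fraction of posterior-predictive
    draws whose p-quantile falls below the observed p-quantile.
    Under a well-calibrated model each value is ~ Uniform(0, 1)."""
    q_obs = np.quantile(y_obs, probs)            # shape (K,)
    q_rep = np.quantile(y_rep, probs, axis=1)    # shape (K, S) for y_rep (S, N)
    return (q_rep < q_obs[:, None]).mean(axis=1)  # shape (K,)

# Toy example where the model matches the data-generating process exactly.
probs = np.linspace(0.02, 0.98, 49)
y_obs = rng.normal(size=5000)
y_rep = rng.normal(size=(1000, 5000))  # S = 1000 posterior-predictive datasets

pit = quantile_pit(y_obs, y_rep, probs)

# Per-motif check (my point b): are the values across quantile levels uniform?
# NB: neighbouring quantiles are strongly correlated, so I'd treat this
# p-value as a rough screening statistic, not a calibrated test.
ks_stat, p_value = stats.kstest(pit, "uniform")
```

Check (a) would then stack the `pit` vectors from all motifs into a motifs-by-levels matrix and test each column for uniformity instead of each row.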

Does that sound reasonable, or is it a footgun? Has a similar approach been formalized somewhere? I've read a paper (can't find the reference) where they do something similar, but to test whether the model works when fitting simulated data (I think they transform the uniformity test into a normality test). Is there a reason this approach might not be valid for testing model fit to actual data? Thanks for any hints!