Posterior predictive tail thicker than observed

I am modeling a positive outcome using a lognormal regression. The outcomes are rounded to integers, so I am using the usual interval-censoring approach. Here are my outcomes (about 800 observations):

[image: histogram of the observed outcomes]
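For reference, my censoring setup looks roughly like this (a minimal brms-style sketch with placeholder variable and covariate names, assuming each recorded integer y stands for the interval [y - 0.5, y + 0.5]):

```r
library(brms)

# rounding to integers => each recorded y is interval-censored
# (placeholder bounds; adjust if the rounding rule differs)
dat$y_lo <- dat$y - 0.5
dat$y_hi <- dat$y + 0.5
dat$cens_type <- "interval"  # censoring indicator for brms

fit <- brm(
  y_lo | cens(cens_type, y_hi) ~ x1 + x2 + (1 | group),  # placeholder terms
  data = dat, family = lognormal()
)
```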

I am including about a dozen covariates and several random effects, but I always get a similar posterior predictive:

[image: posterior predictive distribution]

For me, y \geq 10 is an important threshold, so here is the distribution of the proportion of y \geq 10 across the replicated datasets, with a vertical line at the observed proportion:

[image: replicated tail proportions with the observed proportion marked]
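That check is just the tail proportion used as a posterior predictive test statistic; computed roughly like this, using bayesplot with the fit sketched above:

```r
library(bayesplot)

yrep <- posterior_predict(fit)           # draws x observations matrix
prop_ge_10 <- function(y) mean(y >= 10)  # tail proportion at the threshold
ppc_stat(dat$y, yrep, stat = prop_ge_10)
```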

Of course, I could model the binary outcome y \geq 10 directly, but that seems extreme: the threshold, while important, is somewhat arbitrary, and I am still interested in the continuous outcome. I was looking forward to making full use of the information through interval censoring.

Note that I’ve also tried skew-normal (the posterior for \alpha was 0 \pm 0.05), gamma, and negative-binomial family regressions, and the results were very similar. I also tried leaving the censoring out of the model; no change. I tried including random effects for a very fine multilevel structure (400 groups for 800 observations), guessing that it might overfit but at least reproduce the tail, but that didn’t work either.

My first thought was that I’m missing the right predictors, but I haven’t seen any improvement in the posterior predictive as I continue to improve the model (as measured by loo and bayes_R2), so I’m not sure that’s it. Is there a more flexible family I should be using? Different priors? Has anyone encountered this before, or otherwise have some ideas about what might help (assuming this is an issue)? I realize this might be difficult to answer without seeing the data, which I can’t share.

You could get heavier tails with a Student-t model: https://mc-stan.org/docs/2_19/functions-reference/student-t-distribution.html

I guess you could estimate the degrees-of-freedom parameter \nu; \nu \to \infty corresponds to a normal distribution.

You might have to put pretty constraining priors on \nu to get it to work well. Gamma functions are involved in the density, and the numerics are nasty there.

You could also try just picking a low \nu (like 4 or something random?) and seeing if the fit changes in any interesting way.
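In brms-style code, the idea might look something like this (an untested sketch reusing the placeholder names from above; the log-transformed bounds are just to parallel the lognormal setup, and gamma(2, 0.1) is one common choice of constraining prior):

```r
# Student-t on the log scale as a heavier-tailed analogue of the lognormal;
# log-transforming the bounds preserves the censoring intervals
dat$ly_lo <- log(dat$y_lo)
dat$ly_hi <- log(dat$y_hi)

# estimate nu under a fairly tight prior
fit_t <- brm(
  ly_lo | cens(cens_type, ly_hi) ~ x1 + x2 + (1 | group),
  data = dat, family = student(),
  prior = prior(gamma(2, 0.1), class = nu)
)

# or fix nu at a low value and see whether the fit changes
fit_t4 <- brm(
  bf(ly_lo | cens(cens_type, ly_hi) ~ x1 + x2 + (1 | group), nu = 4),
  data = dat, family = student()
)
```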

Thanks @bbbales2, but I think you may have misread the question: I was seeking thinner tails. Following advice from @James_Savage, I tried the Gumbel distribution, whose fit did have slightly thinner tails initially but thicker ones farther out, producing unrealistically large values in replications (> 1000, when the max in my data is 100).

Oh whoops, my apologies. I’m not so sure what to do.

How far off from a normal do your predictions look on the log scale?
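E.g., something like this (a sketch, assuming a brms fit and bayesplot):

```r
library(bayesplot)
yrep <- posterior_predict(fit)                   # replicated datasets
ppc_dens_overlay(log(dat$y), log(yrep[1:50, ]))  # observed vs. replicated, log scale
```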