I’ve been using frequentist GLMs to model count variables in insurance contexts. I wanted to ask here whether there is an established technique in the frequentist perspective equivalent to predictive checks. As it stands, the adequacy of the distributional assumption can’t be seen in the model outputs. One could inspect the deviance residuals and realise that the distribution does not fit. But can we simulate outcomes over all model predictions using the ML estimates? We would only get the “ML” distribution, but at least we could check whether we are capturing the main features of the data.
What I meant by simulating was: we take all the fitted means. Then we compute the variance from the variance function and the estimated dispersion parameter. Finally we simulate nj outcomes with the mean and variance obtained in the previous step, for every observation. At the end we can plot the histogram of fitted vs observed for the whole dataset.
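To make the idea concrete, here is a rough sketch of that check in Python (the fitted means are made-up numbers standing in for the output of a real GLM fit; for a Poisson model the variance function is V(mu) = mu, so the fitted means are all we need):

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are the fitted means mu_hat from a Poisson GLM, one per observation.
mu_hat = np.array([0.5, 1.2, 2.0, 3.5, 0.8, 1.1])

# Simulate n_rep replicate datasets from the fitted means.
n_rep = 1000
y_rep = rng.poisson(lam=mu_hat, size=(n_rep, len(mu_hat)))

# Each row of y_rep is one simulated dataset; compare its histogram, mean,
# variance, or zero proportion against the observed data.
print(y_rep.shape)          # (1000, 6)
print(y_rep.mean(axis=0))   # should be close to mu_hat
```

For other families you would replace `rng.poisson` with a draw whose mean and variance match the fitted mean, the variance function, and the estimated dispersion.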
Moreover, it would be an interesting topic to discuss the confidence interval on this ML distribution, although I don’t know how such intervals are obtained.
If you are an R user, then maybe arm::sim() does what you want? There is a discussion of it in Gelman/Hill, and I think it’s basically the approach you described.
I think in any case you rely on the distribution of the MLE regression coefs, which is asymptotically normal. In the case of Poisson, that’s it; in the case of Negative Binomial regressions I’m not sure if there is even a good frequentist way to estimate the overdispersion parameter.
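That coefficient-level simulation can be sketched roughly as follows (in Python for illustration; `beta_hat`, `cov_hat` and the design matrix are invented values, not the output of a real fit):

```python
import numpy as np

rng = np.random.default_rng(42)

beta_hat = np.array([0.2, 0.5])          # hypothetical MLE of the coefficients
cov_hat = np.array([[0.04, 0.01],
                    [0.01, 0.09]])       # hypothetical estimated covariance of the MLE
X = np.column_stack([np.ones(5), np.arange(5)])  # toy design matrix

# Draw coefficient vectors from the asymptotic normal of the MLE,
# one draw per simulated dataset.
n_sim = 200
betas = rng.multivariate_normal(beta_hat, cov_hat, size=n_sim)

# Log link: mu = exp(X beta); then add Poisson noise on top of each draw.
mus = np.exp(betas @ X.T)                # shape (n_sim, n_obs)
y_rep = rng.poisson(mus)

print(y_rep.shape)  # (200, 5)
```

This propagates the estimation uncertainty in the coefficients into the simulated responses, which is (as I understand it) what arm::sim does for you in R.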
When using a Quasi-Likelihood in a GLM it’s not so clear (at least to me) how to simulate new data.
That being said, if you consider the parameters of your model as random variables with distributions and generate new data from them, you’re (imo) closer to Bayes than to frequentist stats. In frequentist stats the source of variation is the data, while the parameters are fixed ground truth - so you test parameters, not new data. That’s why you see frequentists use data resampling methods such as the bootstrap.
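For contrast, here is a minimal sketch of the bootstrap idea just mentioned, resampling the data rather than the parameters (the toy counts and the statistic, a plain mean, are made up for illustration; in practice you would refit the whole GLM on each resample):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy count data standing in for a real dataset.
y = np.array([0, 1, 2, 1, 3, 0, 2, 4, 1, 2])

n_boot = 2000
boot_means = np.empty(n_boot)
for b in range(n_boot):
    idx = rng.integers(0, len(y), size=len(y))  # resample rows with replacement
    boot_means[b] = y[idx].mean()               # re-estimate on each resample

# Percentile confidence interval from the bootstrap distribution of the estimate.
ci = np.percentile(boot_means, [2.5, 97.5])
print(ci)
```

Note the contrast: the variation here comes entirely from resampling the observed data, not from placing a distribution on the parameters.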
I hope this was helpful. :)
Thank you Max. Can you give me the reference of that discussion?
What I mean is just a quick check before doing any kind of inference on the true parameters. So I am not talking about sampling distributions, but about the distribution of the target variable Y as represented by the ML estimates. From the frequentist perspective, I’ve run into many inadequate distributional assumptions because of the lack of any kind of “predictive checks” (and the difficulty of reading a deviance residuals plot). For instance, Poisson and Negative Binomial assumptions may yield similar regression point estimates. However, once we obtain the variance through the variance function and generate a sample, we may see that the Poisson model does not replicate the data at all (and yet we test the significance of parameters under such an inappropriate assumption).
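A toy demonstration of that failure mode (all numbers are invented): if the data are overdispersed, Poisson simulations from the same fitted mean cannot reproduce the observed variance, even though the mean is matched perfectly.

```python
import numpy as np

rng = np.random.default_rng(3)

# "Observed" counts generated with extra-Poisson variance:
# NB with mean mu and V(Y) = mu * (1 + mu*k).
mu, k = 2.0, 1.0                     # implies V(Y) = 2 * (1 + 2) = 6
r = 1.0 / k
y_obs = rng.negative_binomial(r, r / (r + mu), size=50_000)

# Poisson "predictive check": simulate from the fitted mean only.
y_pois = rng.poisson(y_obs.mean(), size=50_000)

print(y_obs.var())    # around 6
print(y_pois.var())   # around 2: far too small
```

The simulated Poisson histogram is visibly too narrow next to the observed one, which is exactly the kind of mismatch the quick check is meant to expose.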
PS: For the NB case, the constant overdispersion parameter is usually obtained by MLE too. The software I use estimates the parameter “k”, given that V(Y) = mu * (1 + mu*k).
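For what it’s worth, simulating from that parameterisation is straightforward: with r = 1/k, a negative binomial with size r and success probability r / (r + mu) has exactly mean mu and variance mu * (1 + mu*k). A sketch (mu and k are hypothetical fitted values):

```python
import numpy as np

rng = np.random.default_rng(7)

mu, k = 3.0, 0.5          # hypothetical fitted mean and overdispersion
r = 1.0 / k               # numpy's "n" (size) parameter
p = r / (r + mu)          # numpy's success probability

y = rng.negative_binomial(r, p, size=200_000)

print(y.mean())   # close to mu = 3.0
print(y.var())    # close to mu * (1 + mu*k) = 7.5
```

So once your software reports k, you can plug the fitted means straight into a simulation like the one discussed above.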
Hey Juan! I’m sorry, I don’t have the Gelman/Hill book with me right now. I think there’s a chapter in it that describes this simulation approach to inference (and, implicitly, fake data generation, iirc).
I just found that there is a stats::simulate method for GLMs in R. This will simulate response values for your regression (not sure how exactly they do it…). I think it also works with MASS::glm.nb (my point about the NB was that you usually need to fix mu before estimating the overdispersion… but that discussion is really only tangential to the topic…).
Btw, I fully agree with your point about inadequate model checks for GLMs. In trade economics there is this trend of simply applying Poisson Pseudo Maximum Likelihood (PPML, really just the Quasi-Likelihood of the Poisson) to everything while doing zero model checks, because “consistency”! Annoying.
Generally, you will find that the flexibility of hierarchical models will almost always beat “flat” frequentist models in predictive checks. And then you’re probably better off doing full Bayes anyway (if data size permits).