With the following data (K.CSV), taken from Koch, G.G.; Atkinson, S.S.; Stokes, M.E. 1986. Encyclopedia of Statistical Sciences. Volume 7. John Wiley. New York. Edited by Samuel Kotz and Norman Johnson:
Melanoma,Area,AgeGroup,Population
61,0,<35,2880262
76,0,35-44,564535
98,0,45-54,592983
104,0,54-64,450740
63,0,65-74,270908
80,0,>74,161850
64,1,<35,1074246
75,1,35-44,220407
68,1,45-54,198119
63,1,54-64,134084
45,1,65-74,70708
27,1,>74,34233
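For anyone who wants to reproduce this, the table can be read into R roughly as follows. This is only a minimal sketch: the names `csv_text` and `RatePer100k` are hypothetical, the explicit factor ordering (which makes `<35` the reference level) is an assumption rather than something stated in the source, and the crude rate column is just an illustrative summary.

```r
# Inline copy of K.CSV so the example does not depend on a file path.
csv_text <- "Melanoma,Area,AgeGroup,Population
61,0,<35,2880262
76,0,35-44,564535
98,0,45-54,592983
104,0,54-64,450740
63,0,65-74,270908
80,0,>74,161850
64,1,<35,1074246
75,1,35-44,220407
68,1,45-54,198119
63,1,54-64,134084
45,1,65-74,70708
27,1,>74,34233"

data <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Assumption: keep the age groups in file order instead of the
# locale-dependent alphabetical order, so "<35" is the reference level.
data$AgeGroup <- factor(data$AgeGroup, levels = unique(data$AgeGroup))

# Crude melanoma rate per 100,000 population (illustrative only).
data$RatePer100k <- 1e5 * data$Melanoma / data$Population

print(data)
```

The data frame is called `data` so that it matches the `data=data` argument used in the model call.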
I can fit the following Poisson model:
fit_1a <- rstanarm::stan_glm(Melanoma ~ Area + AgeGroup, offset=log(Population), family=poisson(link = "log"), data=data)
Although the resulting coefficients are the same as in Koch (1986), the posterior predictive checks don't look good. I was wondering whether it is possible to improve the fit, meaning the PSIS and PPC results, by transforming the data or creating new variables from the same data. Or are these variables not enough, so that more variables might be needed?
bayesplot::pp_check(fit_1a) + ggplot2::xlim(0, 300)
loo1 <- loo::loo(fit_1a, save_psis = TRUE)
plot(loo1)
Also, which do the experts prefer: a good-looking PSIS test or a good-looking yrep? I think both need to be good.
When using a negative binomial model instead of a Poisson one, the PSIS test improves and the yrep worsens.
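A minimal sketch of such a negative binomial fit, assuming the same formula and offset as `fit_1a` and rstanarm's `neg_binomial_2()` family. The names `fit_1b`, `loo1b` and `ages` are hypothetical, the seed is an arbitrary choice, and the factor ordering of `AgeGroup` is an assumption; the data frame is rebuilt inline only so that the block runs on its own.

```r
# Data as listed in the post, rebuilt compactly so the block is self-contained.
ages <- c("<35", "35-44", "45-54", "54-64", "65-74", ">74")
data <- data.frame(
  Melanoma   = c(61, 76, 98, 104, 63, 80, 64, 75, 68, 63, 45, 27),
  Area       = rep(0:1, each = 6),
  AgeGroup   = factor(rep(ages, times = 2), levels = ages),
  Population = c(2880262, 564535, 592983, 450740, 270908, 161850,
                 1074246, 220407, 198119, 134084, 70708, 34233)
)

# Same linear predictor and offset as the Poisson model, but with a
# negative binomial likelihood that allows for overdispersion.
fit_1b <- rstanarm::stan_glm(
  Melanoma ~ Area + AgeGroup,
  offset = log(Population),
  family = rstanarm::neg_binomial_2(link = "log"),
  data = data,
  seed = 1234,   # arbitrary seed for reproducibility
  refresh = 0    # suppress sampler progress output
)

# Same diagnostics as for the Poisson fit.
bayesplot::pp_check(fit_1b)
loo1b <- loo::loo(fit_1b, save_psis = TRUE)
plot(loo1b)
```

Running the same `pp_check()` and `loo()` calls on both fits makes the comparison between the PSIS diagnostics and the yrep plots like for like.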