I have a working negative binomial regression model of count data with a single integer predictor. My goal is prediction, and I need to predict responses for predictor values considerably larger than those in my sample data. As it stands, the predicted responses become extremely large once the new covariate values are about an order of magnitude above the sample range, presumably because the log link turns linear extrapolation into exponential growth of the predicted mean.

What could I do in terms of priors to "stabilize" the model and prevent this?
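To illustrate the blow-up I mean: with the default log link, the predicted mean scales as exp(intercept + slope * b), so even a modest slope explodes for b far outside the data (the intercept and slope below are made-up numbers for illustration, not values from my fit):

```r
# Illustrative only: a small positive slope on the log scale
# (values are invented, not taken from the fitted model)
intercept <- 0.5
slope <- 0.005
b <- c(10, 100, 1000, 10000)

# negbinomial() in brms uses a log link by default,
# so the predicted mean is exp() of the linear predictor
mu <- exp(intercept + slope * b)
print(mu)  # the last value is on the order of 1e21
```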
Here is a working example with simulated data using brms:
# simulation of sample data following http://www.econometricsbysimulation.com/2014/02/easily-generate-correlated-variables.html
library(MASS)
library(brms)

Sigma <- matrix(.3, nrow = 2, ncol = 2) + diag(2) * .7
rawvars <- mvrnorm(n = 1000, mu = c(0, 0), Sigma = Sigma)
nbvars <- data.frame(a = qnbinom(pnorm(rawvars[, 1]), size = 0.3, prob = 0.1),
                     b = qnbinom(pnorm(rawvars[, 2]), size = 0.3, prob = 0.01))

nbmod <- brm(bf(a ~ b),
             family = negbinomial(),
             data = nbvars, chains = 2, cores = 2, warmup = 1000, iter = 3000)

# posterior_predict already returns draws on the response scale
posterior_predict(nbmod, newdata = data.frame(b = c(10, 100, 1000, 10000)), nsamples = 10)
The output looks like this:
[,1] [,2] [,3] [,4]
[1,] 0 4 1828 2.690863e+39
[2,] 5 7 274 1.008591e+29
[3,] 0 0 6908 2.524178e+25
[4,] 1 9 1531 2.425535e+27
[5,] 2 0 148 5.014781e+25
[6,] 0 2 102 9.682165e+24
[7,] 3 9 65 3.470250e+19
[8,] 2 1 34 9.535929e+26
[9,] 9 6 1547 2.965608e+25
[10,] 4 1 20 2.436934e+28
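For completeness, I know brms lets me inspect the parameters and attach priors like this, but I am unsure which choices would actually tame the extrapolation (the normal(0, 1) below is only a placeholder, not a recommendation):

```r
library(brms)

# Simulated data as in the example above, so this snippet stands alone
set.seed(1)
nbvars <- data.frame(a = rnbinom(1000, size = 0.3, mu = 3),
                     b = rnbinom(1000, size = 0.3, mu = 30))

# List the parameters of the model and their default priors
get_prior(bf(a ~ b), data = nbvars, family = negbinomial())

# A placeholder regularizing prior on the slope; I don't know
# what scale would be appropriate for my problem
myprior <- set_prior("normal(0, 1)", class = "b")
myprior
```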