Greetings,
We are fitting a model to estimate the size of an epidemic outbreak from the number of mutations observed in samples of foot-and-mouth disease virus from infected farms. The main idea behind the model is that the number of mutations is described by a negative binomial (NB) with parameters “z” and “shape”. If we observe all infected farms in an epidemic, the distribution of observed mutations should follow that NB. However, if we missed one farm in a chain of transmission, the number of mutations after the missing farm will be distributed as NB(2z, 2shape). If we missed two, we have NB(3z, 3shape), and so on. Thus, the data are a mixture of these NBs, and we estimate the proportion that each one contributes to the mixture.

Now, the issue is how to decide how many of these NBs to consider. We fitted models with one NB, two, three, and so on up to eight. We fitted the models with Stan, compared them using LooIC, and found that the different models were not too far from each other. These are the delta LooIC values (differences from the best model, model4):
model1 10.75
model2 10.84
model3 3.30
model4 0.00
model5 0.42
model6 1.97
model7 3.56
model8 4.56
So, we decided to use model averaging, and our question is whether to use stacking or pseudo-BMA, as they produce very different weights.
With stacking we get:
model1 0.108
model2 0.000
model3 0.035
model4 0.496
model5 0.361
model6 0.000
model7 0.000
model8 0.000
and with pseudo-BMA:
model1 0.026
model2 0.009
model3 0.113
model4 0.328
model5 0.290
model6 0.130
model7 0.062
model8 0.042
It is not totally clear to me why we get these differences, or whether one option should be preferred over the other.
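For what it is worth, here is a minimal sketch of how plain pseudo-BMA weights follow from the delta LooIC values alone, assuming LooIC = -2 * elpd and weights proportional to exp(elpd). The function name `pseudo_bma_weights` is made up for illustration. This is not necessarily what produced the weights above: as far as I know, `loo_model_weights` in the loo package applies a Bayesian bootstrap by default for pseudo-BMA, so the numbers would not be expected to match exactly.

```python
import math

# Delta LooIC values copied from the table above (model1..model8).
delta_looic = [10.75, 10.84, 3.30, 0.00, 0.42, 1.97, 3.56, 4.56]


def pseudo_bma_weights(delta_looic):
    """Plain pseudo-BMA weights (no Bayesian bootstrap): w_k proportional to exp(elpd_k)."""
    # LooIC = -2 * elpd, so elpd differences relative to the best model are -delta / 2.
    rel_elpd = [-d / 2.0 for d in delta_looic]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(rel_elpd)
    unnorm = [math.exp(e - m) for e in rel_elpd]
    total = sum(unnorm)
    return [u / total for u in unnorm]


weights = pseudo_bma_weights(delta_looic)
for k, w in enumerate(weights, start=1):
    print(f"model{k} {w:.3f}")
```

Computed this way, the weights depend only on each model's total LooIC, not on how the models' pointwise predictions relate to each other.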
I understand that this may not have a simple answer! In any case, I’d really appreciate any comments about this. Thanks a lot in advance.
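In case it helps to see the likelihood concretely, here is a rough sketch of the mixture described above. It assumes that “z” is the NB mean and “shape” is the dispersion (the same mean/dispersion form as Stan's `neg_binomial_2`), which is the parameterization under which a sum of k independent NB(z, shape) counts is NB(k*z, k*shape). The function names, the counts, and the parameter values are all made up for illustration; this is not our actual Stan code.

```python
import math


def nb_logpmf(y, mean, shape):
    """Log pmf of a negative binomial in mean/dispersion form (assumed parameterization)."""
    return (math.lgamma(y + shape) - math.lgamma(shape) - math.lgamma(y + 1)
            + shape * math.log(shape / (shape + mean))
            + y * math.log(mean / (shape + mean)))


def mixture_loglik(counts, z, shape, props):
    """Log-likelihood of counts under a mixture of NB(k*z, k*shape), k = 1..len(props)."""
    total = 0.0
    for y in counts:
        # Component k corresponds to k - 1 unobserved farms before this sample.
        terms = [math.log(p) + nb_logpmf(y, k * z, k * shape)
                 for k, p in enumerate(props, start=1) if p > 0]
        # Log-sum-exp over components for numerical stability.
        m = max(terms)
        total += m + math.log(sum(math.exp(t - m) for t in terms))
    return total


# Made-up mutation counts and parameter values, purely for illustration.
example_counts = [0, 2, 5, 1, 9]
print(mixture_loglik(example_counts, z=2.0, shape=1.5, props=[0.6, 0.3, 0.1]))
```

Each of the eight models corresponds to a different length of `props`, from one component up to eight.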