Ad 1) Fair enough. See for instance here on Wikipedia about the gamma-poisson mixture and NB distribution being the same thing.
Ad 2) I learned about hierarchical modelling on the job, so I don’t have the best or latest recommendations for literature or study books. Maybe someone else does. I’m sure there are plenty of resources if you look for “multi-level” or “hierarchical modelling”. The Stan documentation also provides a nice high-level overview, I think.
As for the case you present here: your model actually doesn’t treat your simulated data as having a hierarchical structure as you are assigning each of the M
groups its own coefficient beta
. This is also the reason why your estimate of betabar
seems “off”: you are trying to estimate a M-length mean vector and M by M covariance matrix from essentially one observation (i.e., one set of M model coefficients beta
). In contrast, a simple hierarchical model might consider installation time (M groups) as a group-level predictor, meaning that aside from your coefficients for refrigerator model and and production time (which, as I understand, are not in the model at the moment) you will have a group-level term \epsilon_j, which is governed by the between group-variation parameter \sigma_M.
y_{ij} \sim NegBinom(\xi_{ij},\alpha)
\xi_{ij} = exp(\beta X_i+ log(offset_i) + \epsilon_j)
\epsilon_j \sim N(0,\sigma_M)
Note that these equations involve two suffixes: i for individual observations and j for groups. Also note that there is no multivariate distribution involved anywhere yet. Only if one were to add a second group-level predictor, one could consider that the effects of those two group-level predictors are correlated (e.g., if the effect of one is higher, the effect of another group-level predictor might be lower). But it doesn’t sound that this is at play here.
If you are using R and are having trouble to code up your model in Stan, consider using the rstanarm or brms package, which can both do what you need.