games_data.csv (3.8 KB)
negbin.R (576 Bytes)
bnb1.stan (1.8 KB)
I’m trying to fit a bivariate negative binomial regression (“BNB1”, described here: https://aip.scitation.org/doi/pdf/10.1063/1.4903663) to a dataset of soccer results. I keep getting warnings about divergent transitions, and the estimates also look wrong: according to them, all teams are equal in skill.
Previously, when I have encountered divergent transitions, it has been in the context of hierarchical regressions, and by following various tutorials it was fairly straightforward to apply a noncentered reparametrization. Now, however, I’m at a loss: these divergent transitions are not related to hierarchical structure, and the distribution I’m working with is not covered by any tutorial I know of.
I attached a runnable example (note that the code runs for only a few iterations in the example; I actually used 20k iterations). If someone has an idea how to fix it, that would be great. (I’m aware that vectorizing operations would be best practice, but I’ll worry about efficiency once I’m able to estimate the parameters at all.) However, the model being somewhat complicated, I understand that debugging it is a lot to ask, so more general pointers would also be much appreciated:

Are there any suggested readings on how to build intuition about what it is in a model that makes the posterior geometry problematic for sampling? Or, for that matter, introductory readings on thinking about posterior geometry in general?

Once I have some idea of what’s wrong, how do I go about reparametrizing a model? Is there any more systematic method than staring at the log-likelihood function and thinking carefully?
Also, now that I’m posting, I have a question that is not about divergences but about the bivariate negative binomial itself:
According to the paper, the expected values of the marginal distributions are just mu[1] and mu[2]. When I fit the model without any regression parameters, i.e. just placing a prior on mu directly, I avoid all divergence warnings and things seem to work fine. However, the posterior mean estimates I get for mu[1] and mu[2] are 1.4 and 1.6, whereas the corresponding sample means are about 1.2 and 1.4. The posterior variability is small enough that the sample means fall well outside any reasonable probability interval. To my understanding, the posteriors should be centered around the sample means? Does this mean the BNB is fundamentally unsuited for describing these data, or that my implementation is incorrect? The latter would seem much more likely, were it not for the fact that I also separately implemented another variant of the bivariate negative binomial from a different paper and got essentially the same results.

EDIT: having contemplated this further, could it be that the sample mean of the first component is not an unbiased estimate of the expected value of the corresponding marginal when it comes to the BNB?
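For what it’s worth, here is the quick sanity check I ran (in Python rather than R, just for convenience). It uses a shared-gamma-mixture construction of the bivariate negative binomial, where X_i | lam ~ Poisson(mu_i * lam) with lam ~ Gamma(1/phi, scale = phi), so E[lam] = 1 and E[X_i] = mu_i. I’m not certain this matches the BNB1 construction in the paper, so treat it as an illustration of the marginal-mean question, not of BNB1 itself:

```python
# Simulate a gamma-mixture bivariate negative binomial and check that the
# component sample means track mu. NOTE: the mu and phi values below are
# hypothetical, and this construction is an assumption -- BNB1 in the
# paper may be parametrized differently.
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
mu = np.array([1.2, 1.4])  # hypothetical marginal means
phi = 0.5                  # hypothetical overdispersion

# Shared frailty with mean 1 induces positive correlation between components.
lam = rng.gamma(shape=1.0 / phi, scale=phi, size=n)
x = rng.poisson(np.outer(lam, mu))  # shape (n, 2)

print(x.mean(axis=0))  # close to mu = [1.2, 1.4]
```

Under this construction, at least, the sample means do line up with mu, which is part of why I suspect either my implementation or the applicability of the BNB to these data.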
Thanks!