Parametrization for a Poisson model with gamma distributed rates for correlated time periods

When we fit a negative binomial (in the mean rate and dispersion parameter parameterization as opposed to the one with a precision parameter) data to counts, we can either do that using the negative binomial likelihood or equivalently with a Poisson likelihood plus a gamma distributed random subject effect. The latter formulation is how we can deal with multiple (correlated) observations for different time periods per subject that differ in some covariates (so that you cannot just add up the number of events and follow-up time).

I leave out the covariates in what follows below, because it’s does not seem that important in terms of how to parametrize the model (feel free to disagree), because I suspect it matters the most how I implement the random subject effect. I am wondering: Is there some clever way of parametrizing the random effects distribution akin to the centralized vs. non-centralized parametrization for normally distributed random effects?

I already noticed that for

random_patient_effect ~ gamma(1/kappa, 1);
aval ~ poisson_log( logmu + log(random_patient_effect) + log(kappa) + log_follow_up);

I get some issues “1: There were 2 chains where the estimated Bayesian Fraction of Missing Information was low. See 2: Examine the pairs() plot to diagnose sampling problems”. However, the this is the best, I got, at all (and even seems to do better than the straightforward negative binomial likelihood, see below). At least this issue is presumably more benign than some (and possibly can be handled with more warm-up samples?).

Things are much worse, if I go for

random_patient_effect ~ gamma(1/kappa, 1/kappa);

in the above, which results in lots of divergent transitions, issues with the tree-depth and things like " Log rate parameter[xxx] is 1.#QNAN". Similarly, if I just have a single record per patient, then the below

aval ~ neg_binomial_2_log( logmu + log_follow_up, 1/kappa);

has problems (and can only be used for a single observation per subject) such as divergent transitions and “neg_binomial_2_log_lpmf: Precision parameter is 1.#INF, but must be finite”.

However, even if I hand-code the pdf for a gamma distributed random variable after log-transformation (see code below), I wonder whether I could parameterize this all more cleverly to improve sampling.

I realize that a normally distributed random effect could be an alternative (after all the log-normal distribution is somewhat similar to the gamma distribution), but it would be an easier sell to go with the model more closely related to the negative binomial model that people in the field are familiar with.

Here is how I am implementing this model at the moment. Any suggestions for improvements, efficiency gains and especially better parametrization would be most welcome. NB: For anyone trying this at home: providing Stan reasonable initial values are crucial for making any of this work (once you do the explicit random patient effect).

functions {
// PDF for a vector of log-transformed i.i.d. gamma distributed random variables
real loggammadist_lpdf(vector y, real alpha, real beta){
return sum(betay-exp(y-log(alpha)) - betalog(alpha) - lgamma(beta));

data {
int<lower=1> nrecords;
int<lower=0> aval[nrecords];
real<lower=0> follow_up[nrecords];

transformed data {
vector[nrecords] log_follow_up;
for (r in 1:nrecords) {
log_follow_up[r] = log(follow_up[r]);

parameters {
real logmu;
real<lower=0> kappa;
vector[nrecords] random_patient_effect;

model {
logmu ~ student_t(4, 0, 2.3);
kappa ~ student_t(4, 0, 2.3);
random_patient_effect ~ loggammadist(1/kappa, 1/kappa);

aval ~ poisson_log( logmu + random_patient_effect + log_follow_up);

Hmm… quickly skimming over it a few points which could help:

  • I would sample 1/kappa directly, so inv_kappa… maybe even better to sample log_kappa such that the inversion is just a sign flip
  • I also would give logmu a normal; making mu log-normal (much easier to sample than those fatty t)
  • The gamma(1/kappa, 1) can be turned into Chi-Square variate by scaling with 1/2
  • The logdammadist_lpdf should be parametrized by log_alpha and log_beta

These points should improve numerical stability issues. The next step is to look at pairs plots and this new mcmc_parcoord plot in bayesplot may help. Maybe this gives some insight on what is going on and causing trouble.

Looking about, I would have had modeled it as:

aval ~ poisson(lambda);
lambda ~ gamma(r, p / (1-p));
p = inv_logit(mu +random_patient_effect + log_follow_up);

r controls the shape of the counts … and lambda defines the process as whole.