When to use reparameterization in gamma regression?

Hi, I am working on gamma regression in rstan and have encountered some challenges.
The model I initially use is based on the following parameterization:

y ~ gamma(mu^2/phi, mu/phi)

where mu is defined as:
mu = exp(a0 + a_ind + b + c)

However, some posts suggest a reparameterization such as the following (see here and here):
y ~ gamma(inverse_phi, inverse_phi ./ mu)
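One thing worth noting (not stated in the original question): phi does not mean the same thing in the two parameterizations. Both give mean mu, but in the first the variance is phi, while in the second it is phi * mu^2 (phi is the squared coefficient of variation). A quick numerical sketch with made-up values (not from the post), using NumPy:

```python
import numpy as np

rng = np.random.default_rng(7)
mu, phi = 2.0, 0.8  # illustrative values, not from the original model

# Parameterization 1: shape = mu^2/phi, rate = mu/phi
# => mean = mu, variance = phi
y1 = rng.gamma(mu**2 / phi, phi / mu, size=200_000)  # numpy takes shape, scale

# Parameterization 2: shape = 1/phi, rate = 1/(phi*mu)
# => mean = mu, variance = phi * mu^2
y2 = rng.gamma(1.0 / phi, phi * mu, size=200_000)

print(y1.mean(), y1.var(ddof=1))  # ~2.0, ~0.8
print(y2.mean(), y2.var(ddof=1))  # ~2.0, ~3.2
```

So even with the "same" prior on phi, the two models are only equivalent if the dispersion parameters are mapped onto each other correctly.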

I applied both parameterizations to two different datasets. For the first dataset, the original parameterization works “better” (no divergent transitions or low-BFMI warnings), while for the second dataset, the alternative reparameterization performs better.

My question is:

  1. Is it acceptable to choose the parameterization that provides “better” results for a given dataset, or should I aim to use a consistent parameterization across all datasets?
  2. Are there any considerations I should be aware of when deciding between these parameterizations?

Thank you!

Yes. You just want to make sure the models have the same posterior. We do this all the time with things like centered vs. non-centered parameterization—the former works well for large data sets and the latter for tiny ones.

You just have to be careful to apply change-of-variables corrections to make sure you’re fitting the same model. For instance, you want to give inverse_phi a prior that matches the prior for phi after inverting (and hence requires the same kind of Jacobian correction as lognormal or inverse gamma).


Thank you very much for the answer!

You just want to make sure the models have the same posterior.

How can I check this? For the second dataset, I have posteriors from 1. the original model without reparameterization (which produced some warnings) and 2. the reparameterized model (no warnings). If I understood your words correctly, can I compare the posteriors from the model with warnings and the model without warnings?

You just have to be careful to apply change-of-variables corrections to make sure you’re fitting the same model. For instance, you want to give inverse_phi a prior that matches the prior for phi after inverting

Sorry to ask so many questions, but if I use the prior normal(0, 5) for phi in the original model, how should I set the prior for the reparameterized model?

Thank you!

You can always compare a set of draws. Or just look at the posterior means and standard deviations. These should be the same to within MCMC standard error for the two models.
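A minimal sketch of that comparison in Python, using synthetic normal draws as stand-ins for the posterior draws from the two fits (in practice you would extract the draws from the two Stan fit objects):

```python
import numpy as np

rng = np.random.default_rng(1)
# stand-ins for posterior draws of one parameter from the two fits
draws_a = rng.normal(2.0, 1.0, size=4000)
draws_b = rng.normal(2.0, 1.0, size=4000)

# crude MCSE treating draws as independent; with real MCMC output,
# divide by the effective sample size rather than the raw draw count
mcse_a = draws_a.std(ddof=1) / np.sqrt(len(draws_a))
mcse_b = draws_b.std(ddof=1) / np.sqrt(len(draws_b))

diff = abs(draws_a.mean() - draws_b.mean())
combined_se = np.sqrt(mcse_a**2 + mcse_b**2)
print(diff, 4 * combined_se)  # the difference should be small relative to MC error
```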

[edit: realized you could do this more cleanly with a direct transform or a user-defined inverse normal density.]

Direct transform

You can just do the matching transform directly in Stan, which is probably easiest!

parameters {
  real&lt;lower=0&gt; phi;
}
transformed parameters {
  real&lt;lower=0&gt; inv_phi = inv(phi);
}
model {
  phi ~ normal(0, 5);  // half-normal, given the lower bound
}

and then you just use inv_phi as you would otherwise.

Reparameterization approach

If the new model uses inv_phi = 1 / phi, then you just need to account for the change of variables and use inverse_normal(inv_phi | 0, 5). We don’t have an inverse normal, but we can apply the usual rules. If X \sim \textrm{normal}(\mu, \sigma) and Y = f(X) = 1 / X, we have

P_Y(y) = P_X(f^{-1}(y)) \cdot \left| \frac{\text{d}}{\text{d}y} f^{-1}(y)\right|,

and note that f^{-1}(y) = 1 / y (a function that is its own inverse is called an “involution”), so that

\frac{\text{d}}{\text{d}y} f^{-1}(y) = \frac{\text{d}}{\text{d}y} 1/y = \frac{\text{d}}{\text{d}y} y^{-1} = -y^{-2}.

We started with p_X(x) = \textrm{normal}(x \mid 0, 5), so we get

p_Y(y) = \textrm{normal}(1 / y \mid 0, 5) \cdot \left| -y^{-2} \right|.
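As a numerical sanity check on this density (not in the original thread), we can integrate p_Y over an interval and compare it to the same probability computed on the X scale, where P(a &lt; Y &lt; b) = P(1/b &lt; X &lt; 1/a) for a, b &gt; 0. A sketch in Python:

```python
import numpy as np
from math import erf, sqrt

# derived density of Y = 1/X with X ~ normal(0, 5):
#   p_Y(y) = normal(1/y | 0, 5) * y^{-2}
def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

ys = np.linspace(1e-3, 1.0, 200_001)
p_y = normal_pdf(1.0 / ys, 0.0, 5.0) / ys**2

# trapezoidal integral of p_Y over (0.001, 1)
numeric = np.sum((p_y[:-1] + p_y[1:]) * np.diff(ys)) / 2.0

# same mass on the X scale: P(0.001 < Y < 1) = P(1 < X < 1000)
analytic = normal_cdf(1000.0, 0.0, 5.0) - normal_cdf(1.0, 0.0, 5.0)

print(numeric, analytic)  # should agree to several decimal places
```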

In Stan, using the fact that

\log \left| -y^{-2} \right| = \log y^{-2} = -2 \cdot \log(y),

we can code the inverse normal as follows

1 / inv_phi ~ normal(0, 5);
target += -2 * log(inv_phi);

We could just pack that up into a user-defined density; the _lpdf suffix is what lets Stan use it with ~,

functions {
  real inv_normal_lpdf(real y, real mu, real sigma) {
    return normal_lpdf(1 / y | mu, sigma) - 2 * log(y);
  }
}

and then use

inv_phi ~ inv_normal(0, 5);

Thank you, Bob. Your answer was so helpful! I will check the posterior and reach out again if there seems to be a problem. Thank you also for the information about the prior.

@Bob_Carpenter

By comparing the two models (original and reparameterized), I realized that the posterior for c (the regression coefficient for survey month) is slightly different between the two models.
I used the prior-transform method you showed in your previous comment.

In this case, which should I choose? Is it right to choose the one with larger n_eff? Or is there any other criterion to choose a better model?

Original model result:


Reparameterized model result:

Thank you for your help!

If both models are fitting properly, the only difference should be sampling variance, as reflected in the Monte Carlo standard error (MCSE). So you should see the estimates are roughly within +/- 2 standard errors of each other (4 standard errors total). A centered parameterization will often have mixing problems, so if the two results are close, I’d trust the one with the larger effective sample size (presumably the reparameterized model).

If you’re worried about model specification, you can simulate data, fit, and then validate with simulation-based calibration (SBC) checks. You can also just do a few iterations of SBC informally: simulate data from the model, fit it, and make sure the estimates are close to the simulated parameters within uncertainty (posterior standard deviation, not standard error, for this check).
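A minimal sketch of the informal version in Python, with assumed “true” values and a method-of-moments estimate standing in for an actual Stan fit:

```python
import numpy as np

rng = np.random.default_rng(42)
mu_true, phi_true = 3.0, 0.5  # assumed "true" values for the check

# simulate from the original parameterization: shape = mu^2/phi, rate = mu/phi
shape, rate = mu_true**2 / phi_true, mu_true / phi_true
y = rng.gamma(shape, 1.0 / rate, size=2000)  # numpy takes shape, scale

# a moment estimate stands in for the posterior mean of a real Stan fit;
# with a real fit, compare against the posterior sd, not the standard error
mu_hat = y.mean()
se = y.std(ddof=1) / np.sqrt(len(y))
print(mu_hat, se)  # recovered mean should be close to mu_true
```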

Yes. The whole point of the reparameterization is to generate a larger effective sample size more efficiently and robustly.


Thank you very much, Bob.
As you inferred, the reparameterized model had much larger n_eff for all parameters. Additionally, the estimates are roughly within +/- 2 standard errors of each other, so I will go with the reparameterized one. Thank you also for the detailed information on the calibration check; I will try that.