Proper prior for explanatory coefficient in log-scale regression

Hi, I am struggling with gamma regression in rstan and would like to set weakly informative priors for the coefficients of explanatory variables. I am performing a regression using the following formula, assuming that the response variable Dist follows a gamma distribution:

Dist = exp(a0 + a_individual + b * Length + c * Month)

where Dist is the movement distance of an animal, a0 is the intercept, a_individual is the random intercept for each individual, b is the regression coefficient for the normalized length of the animal (Length), and c is the coefficient for the survey month Month.

Dist takes values between 0.1 and 50 meters.

In this case, can I assume that a normal(0, 1) prior is weakly informative? Since I am modeling on the log-scale, increasing the sigma for the prior leads to unrealistic values when predicting Dist using the obtained posterior.
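For reference, here is a simplified sketch of the Stan model I have in mind; the variable coding and the prior on the gamma shape are just placeholders, and Month enters as a numeric covariate as in the formula above:

data {
  int<lower=1> N;                        // number of observations
  int<lower=1> I;                        // number of individuals
  array[N] int<lower=1, upper=I> ind;    // individual of observation n
  vector[N] length;                      // normalized length
  vector[N] month;                       // survey month as a numeric covariate
  vector<lower=0>[N] dist;               // movement distance (0.1 to 50 m)
}
parameters {
  real a0;
  vector[I] a;            // random intercept per individual
  real<lower=0> sigma_a;
  real b;
  real c;
  real<lower=0> shape;    // gamma shape parameter
}
model {
  a0 ~ normal(0, 1);
  a ~ normal(0, sigma_a);
  sigma_a ~ normal(0, 1);
  b ~ normal(0, 1);
  c ~ normal(0, 1);
  shape ~ exponential(1);               // placeholder prior
  {
    vector[N] mu = exp(a0 + a[ind] + b * length + c * month);
    dist ~ gamma(shape, shape ./ mu);   // parameterized so E[dist] = mu
  }
}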

Thank you for your help!

Whether a prior is "proper" in the sense you mean is really a matter of whether you're using the information that you have. Sometimes, when you have a lot of data, inferences for parameters will not be particularly sensitive to the prior.

I’m not sure why you want to do a gamma regression. And you’re only specifying an expectation, not a noise model. Given the linear model of the log distance, lognormal would be a natural choice:

for (n in 1:N) {
  dist[n] ~ lognormal(a0 + a[ind[n]] + b * length[n] + c[month[n]], sigma);
}

I did what you presumably meant rather than what you literally wrote, and treated c as a varying effect by month rather than as a fixed coefficient multiplying the month number. (If Month is time since a starting event, then a fixed coefficient can make sense.) The way it's coded above, ind[n] picks out the index of the individual, length[n] is the length of item n, and month[n] is the month of item n. In Stan, if you declare the covariates as vectors and the indexes as integer arrays, you can vectorize this to

dist ~ lognormal(a0 + a[ind] + b * length + c[month], sigma);

The lognormal is convenient because it takes the log of the median value as the location parameter, so you don’t have to apply exp yourself.
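For example, if you want the predicted median distance back on the natural scale, you can compute it in generated quantities; a sketch using the names above:

generated quantities {
  // lognormal(eta, sigma) has median exp(eta), so the linear predictor is
  // modeled directly on the log scale and only exponentiated for reporting
  vector[N] median_dist = exp(a0 + a[ind] + b * length + c[month]);
}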

Do you expect errors to be additive or proportional to the value? The former would indicate a normal and the latter a lognormal error model. If your values range across two orders of magnitude (0.1 to 50), then it probably makes sense to use a lognormal, because a single additive error scale can't suit both ends: one appropriate for measurements near 50 would swamp measurements near 0.1, and one appropriate near 0.1 would be negligible near 50.
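To make "proportional" concrete: if dist ~ lognormal(mu, sigma), then

mean = exp(mu + sigma^2 / 2) and sd = exp(mu + sigma^2 / 2) * sqrt(exp(sigma^2) - 1),

so the standard deviation is a fixed multiple of the mean (constant coefficient of variation) and the noise scales with the size of the prediction rather than being a fixed number of meters.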

A prior for what? There is a vector a, a scalar b, and a varying effect c. Given that this is a lognormal regression, they all act multiplicatively on the natural scale, so the median distance will be

mu[n] = exp(a0 + a[ind[n]] + b * length[n] + c[month[n]]) = exp(a0) * exp(a[ind[n]]) * exp(b * length[n]) * exp(c[month[n]])
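Putting the pieces together, here is a sketch of a full program with weakly informative priors on the log scale; the particular prior locations and scales are placeholders that you'd want to adjust to your 0.1 to 50 m range:

data {
  int<lower=1> N;                        // observations
  int<lower=1> I;                        // individuals
  int<lower=1> M;                        // months
  array[N] int<lower=1, upper=I> ind;    // individual of observation n
  array[N] int<lower=1, upper=M> month;  // month of observation n
  vector[N] length;                      // normalized length
  vector<lower=0>[N] dist;               // distance in meters
}
parameters {
  real a0;
  vector[I] a;             // individual effects
  vector[M] c;             // month effects
  real b;
  real<lower=0> sigma_a;   // sd of individual effects
  real<lower=0> sigma_c;   // sd of month effects
  real<lower=0> sigma;     // lognormal scale (sd on the log scale)
}
model {
  a0 ~ normal(1, 1);       // exp(1) is roughly 3 m; placeholder center
  a ~ normal(0, sigma_a);
  c ~ normal(0, sigma_c);
  b ~ normal(0, 1);
  sigma_a ~ normal(0, 1);  // half-normal via the <lower=0> constraint
  sigma_c ~ normal(0, 1);
  sigma ~ normal(0, 1);
  dist ~ lognormal(a0 + a[ind] + b * length + c[month], sigma);
}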

Thank you very much for the detailed explanation. I would like to follow up with some questions about choosing between lognormal and gamma regression, as well as selecting appropriate priors.

Yes, this is what I meant. Thank you for the clarification.
I initially used gamma regression simply because dist is a positive continuous variable; that was my only reasoning, and I am now considering switching to lognormal regression if it's more appropriate. I hadn't previously thought about the error structure, so I appreciate your explanation. As you say, proportional error seems appropriate here, and as far as I know both the gamma and the lognormal have proportional error (a constant coefficient of variation).

As you mentioned, with a lognormal likelihood I don't need to apply exp() to the linear predictor myself. I expect this could also improve the stability of the MCMC chains by avoiding excessively large intermediate values. Is this the reason you recommend lognormal regression over gamma regression, or are there other factors that make the lognormal a better choice for this model?

I would also like to know what appropriate (weakly informative) priors would be for a[ind], b, and c[month].
Unfortunately, I don't have much data, and the priors have a large impact on the posterior distributions of these parameters. When I use wider priors (e.g., increasing the standard deviation of the normal priors), the posteriors for these coefficients become very wide, leading to unrealistic predictions of dist (e.g., 200 m) in the generated quantities block.

From the results so far, a normal(0, 1) prior seems to provide a relatively good fit to my data. Any advice on both the choice of regression model and the prior specification would be greatly appreciated.

Thank you!