Proper prior for explanatory coefficient in log-scale regression

Hi, I am struggling with gamma regression in rstan and would like to set weakly informative priors for the coefficients of explanatory variables. I am performing a regression using the following formula, assuming that the response variable Dist follows a gamma distribution:

Dist = exp(a0 + a_individual + b * Length + c * Month)

where Dist is the movement distance of an animal, a0 is the intercept, a_individual is the random intercept for each individual, b is the regression coefficient for the normalized length of the animal (Length), and c is the coefficient for the survey month Month.

Dist takes values between 0.1 and 50 meters.

In this case, can I assume that a normal(0, 1) prior is weakly informative? Since I am modeling on the log-scale, increasing the sigma for the prior leads to unrealistic values when predicting Dist using the obtained posterior.
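For reference, here is a simplified sketch of the Stan model I have in mind; the variable coding and the prior on the gamma shape are just placeholders, and Month enters as a numeric covariate as in the formula above:

data {
  int<lower=1> N;                        // number of observations
  int<lower=1> I;                        // number of individuals
  array[N] int<lower=1, upper=I> ind;    // individual of observation n
  vector[N] length;                      // normalized length
  vector[N] month;                       // survey month as a numeric covariate
  vector<lower=0>[N] dist;               // movement distance (0.1 to 50 m)
}
parameters {
  real a0;
  vector[I] a;            // random intercept per individual
  real<lower=0> sigma_a;
  real b;
  real c;
  real<lower=0> shape;    // gamma shape parameter
}
model {
  a0 ~ normal(0, 1);
  a ~ normal(0, sigma_a);
  sigma_a ~ normal(0, 1);
  b ~ normal(0, 1);
  c ~ normal(0, 1);
  shape ~ exponential(1);               // placeholder prior
  {
    vector[N] mu = exp(a0 + a[ind] + b * length + c * month);
    dist ~ gamma(shape, shape ./ mu);   // parameterized so E[dist] = mu
  }
}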

Thank you for your help!

Whether a prior is "proper" in the sense you mean is really a matter of whether you're using the information that you have. Sometimes, when you have a lot of data, inferences for parameters will not be particularly sensitive to the prior.

I’m not sure why you want to do a gamma regression. And you’re only specifying an expectation, not a noise model. Given the linear model of the log distance, lognormal would be a natural choice:

for (n in 1:N) {
  dist[n] ~ lognormal(a0 + a[ind[n]] + b * length[n] + c[month[n]], sigma);
}

I did what you presumably meant rather than what you literally wrote, and treated c as a varying effect by month rather than as a fixed coefficient multiplying the month number. (If Month is time since a starting event, then a fixed coefficient can make sense.) The way it's coded above, ind[n] picks out the index of the individual, length[n] is the length of item n, and month[n] is the month of item n. In Stan, if you declare the covariates as vectors and the indexes as integer arrays, you can vectorize this to

dist ~ lognormal(a0 + a[ind] + b * length + c[month], sigma);

The lognormal is convenient because it takes the log of the median value as the location parameter, so you don’t have to apply exp yourself.
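For example, if you want the predicted median distance back on the natural scale, you can compute it in generated quantities; a sketch using the names above:

generated quantities {
  // lognormal(eta, sigma) has median exp(eta), so the linear predictor is
  // modeled directly on the log scale and only exponentiated for reporting
  vector[N] median_dist = exp(a0 + a[ind] + b * length + c[month]);
}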

Do you expect errors to be additive or proportional to the value? The former would indicate a normal and the latter a lognormal error model. If your values range across two orders of magnitude (0.1 to 50), then it probably makes sense to use a lognormal, because a single additive error scale can't suit both ends: one appropriate for measurements near 50 would swamp measurements near 0.1, and one appropriate near 0.1 would be negligible near 50.
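To make "proportional" concrete: if dist ~ lognormal(mu, sigma), then

mean = exp(mu + sigma^2 / 2) and sd = exp(mu + sigma^2 / 2) * sqrt(exp(sigma^2) - 1),

so the standard deviation is a fixed multiple of the mean (constant coefficient of variation) and the noise scales with the size of the prediction rather than being a fixed number of meters.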

A prior for what? There is a vector a, a scalar b, and a varying effect c. Given that this is a lognormal regression, they all act multiplicatively on the natural scale, so the median distance will be

mu[n] = exp(a0 + a[ind[n]] + b * length[n] + c[month[n]]) = exp(a0) * exp(a[ind[n]]) * exp(b * length[n]) * exp(c[month[n]])
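Putting the pieces together, here is a sketch of a full program with weakly informative priors on the log scale; the particular prior locations and scales are placeholders that you'd want to adjust to your 0.1 to 50 m range:

data {
  int<lower=1> N;                        // observations
  int<lower=1> I;                        // individuals
  int<lower=1> M;                        // months
  array[N] int<lower=1, upper=I> ind;    // individual of observation n
  array[N] int<lower=1, upper=M> month;  // month of observation n
  vector[N] length;                      // normalized length
  vector<lower=0>[N] dist;               // distance in meters
}
parameters {
  real a0;
  vector[I] a;             // individual effects
  vector[M] c;             // month effects
  real b;
  real<lower=0> sigma_a;   // sd of individual effects
  real<lower=0> sigma_c;   // sd of month effects
  real<lower=0> sigma;     // lognormal scale (sd on the log scale)
}
model {
  a0 ~ normal(1, 1);       // exp(1) is roughly 3 m; placeholder center
  a ~ normal(0, sigma_a);
  c ~ normal(0, sigma_c);
  b ~ normal(0, 1);
  sigma_a ~ normal(0, 1);  // half-normal via the <lower=0> constraint
  sigma_c ~ normal(0, 1);
  sigma ~ normal(0, 1);
  dist ~ lognormal(a0 + a[ind] + b * length + c[month], sigma);
}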

Thank you very much for the detailed explanation. I would like to follow up with some questions about choosing between lognormal and gamma regression, as well as selecting appropriate priors.

Yes, this is what I meant. Thank you for the clarification.
I initially used gamma regression simply because dist is a positive continuous variable; that was my only reasoning, and I am now considering switching to lognormal regression if it's more appropriate. I hadn't previously thought about the error structure, so I appreciate your explanation. As you say, proportional error seems appropriate here, and as far as I know both the gamma and the lognormal have proportional error (a constant coefficient of variation).

As you mentioned, with a lognormal likelihood I don't need to apply exp() to the linear predictor myself. I expect this could also improve the stability of the MCMC chains by avoiding excessively large intermediate values. Is this the reason you recommend lognormal regression over gamma regression, or are there other factors that make the lognormal a better choice for this model?

I would also like to know what appropriate (weakly informative) priors would be for a[ind], b, and c[month].
Unfortunately, I don't have much data, and the priors have a large impact on the posterior distributions of these parameters. When I use wider priors (e.g., increasing the standard deviation of the normal priors), the posteriors for these coefficients become very wide, leading to unrealistic predictions of dist (e.g., 200 m) in the generated quantities block.

From the results so far, a normal(0, 1) prior seems to provide a relatively good fit to my data. Any advice on both the choice of regression model and the prior specification would be greatly appreciated.

Thank you!