Problems converging using custom gamma2 distribution

I think modelling the variance in this case is exactly that: a modelling choice. Whether it is a good choice for your data is hard to guess, but posterior predictive checks can tell you whether either parameterization has problems for your data (e.g. looking at the "stat_grouped" check with statistics like sd, min, and max over various subgroups of your data could give some clues).
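The idea behind a grouped stat check can be sketched as follows. This is a Python illustration, not bayesplot's R API; all data here are simulated, and with a real model `yrep` would hold draws from the fitted posterior predictive distribution.

```python
import numpy as np

# Simulated stand-in for observed data and posterior predictive draws.
rng = np.random.default_rng(0)
groups = np.repeat(np.array(["a", "b"]), 50)               # subgroup labels
y = rng.gamma(shape=2.0, scale=1.5, size=100)              # "observed" outcome
yrep = rng.gamma(shape=2.0, scale=1.5, size=(200, 100))    # 200 predictive draws

def grouped_stat_check(y, yrep, groups, stat):
    """Per subgroup, compare the observed statistic with its distribution
    across posterior predictive draws; report the tail probability of
    seeing a replicated value at least as large as the observed one."""
    result = {}
    for g in np.unique(groups):
        mask = groups == g
        observed = stat(y[mask])
        replicated = np.array([stat(draw[mask]) for draw in yrep])
        result[g] = {"observed": float(observed),
                     "p": float(np.mean(replicated >= observed))}
    return result

for name, stat in [("sd", np.std), ("min", np.min), ("max", np.max)]:
    print(name, grouped_stat_check(y, yrep, groups, stat))
```

Tail probabilities near 0 or 1 for some subgroup would suggest the model cannot reproduce that aspect of the data there.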

The difference between fitting shape and fitting variance would manifest if you have sets of observations which differ in their predictors for the mean, but share the same predictors for shape/variance. This is easiest to see if we model only the mean. Changing the mean \mu while holding the shape \alpha constant means the variance is (if my math is correct) \sigma^2 = \frac{\mu^2}{\alpha}. Conversely, if we hold \sigma^2 constant, the implied \alpha will change with \mu. I can imagine cases where one is preferred and cases where the other is preferred.
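A small numeric sketch of that relationship (the function names are mine, just rearrangements of \sigma^2 = \mu^2 / \alpha):

```python
import numpy as np

def implied_variance(mu, alpha):
    """Variance of a Gamma with mean mu and fixed shape alpha."""
    return mu**2 / alpha

def implied_shape(mu, sigma2):
    """Shape of a Gamma with mean mu and fixed variance sigma2."""
    return mu**2 / sigma2

mus = np.array([1.0, 2.0, 4.0])

# Holding shape constant: the variance grows quadratically with the mean.
print(implied_variance(mus, alpha=4.0))   # variances 0.25, 1.0, 4.0

# Holding variance constant: the shape grows quadratically with the mean.
print(implied_shape(mus, sigma2=1.0))     # shapes 1.0, 4.0, 16.0
```

So for groups that differ only in their mean, the two parameterizations imply genuinely different distributions, which is exactly where the fits can diverge.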

Generally yes, this can happen - poorly chosen priors will affect your model and your inferences. However, if your data contain enough information about all the model parameters, your inferences will be the same under a broad range of priors. If you have the time, it is IMHO good practice to check whether your inferences change with somewhat wider and somewhat narrower priors. If your inferences do not change, you are quite likely OK.
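The widen/narrow check can be illustrated with a toy conjugate model (normal likelihood with known noise sd, normal prior on the mean) rather than a full refit; all the numbers below are made up for illustration:

```python
import math

def posterior_mean_sd(data, prior_mean, prior_sd, noise_sd):
    """Closed-form posterior for the mean of a normal with known noise sd
    under a normal prior (standard conjugate update)."""
    n = len(data)
    precision = 1 / prior_sd**2 + n / noise_sd**2
    mean = (prior_mean / prior_sd**2 + sum(data) / noise_sd**2) / precision
    return mean, math.sqrt(1 / precision)

data = [4.8, 5.1, 5.3, 4.9, 5.2, 5.0, 5.1, 4.7, 5.4, 5.0]

# Refit under a narrower, a baseline, and a wider prior and compare.
for prior_sd in (0.5, 1.0, 2.0):
    m, s = posterior_mean_sd(data, prior_mean=0.0, prior_sd=prior_sd, noise_sd=0.5)
    print(f"prior sd {prior_sd}: posterior mean {m:.3f} +/- {s:.3f}")
```

Here the overly narrow prior (sd 0.5 around 0) visibly drags the posterior mean away from the data, while the wider priors agree with each other; that disagreement is the warning sign the check is designed to catch.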

Does that make sense?