Problems converging using custom gamma2 distribution

Some observations:

  1. It was easy to set the priors and fit log(y) with a Gaussian distribution, but setting the priors for a lognormal distribution has been difficult: the pp_check looks great for the Gaussian but shows far too long a tail for the lognormal. I tried prior predictive checks but got nowhere (I read in another post, according to Paul, that prior predictive checks are not possible yet). I haven’t been able to reduce the tail of the lognormal, so either the failures of the Gaussian are hidden in the pp_check or I’m not understanding how to get the two to match up. More likely, it seems that when there is not enough data the priors matter more for the lognormal, which is not something I expected. (See the first sketch below this list for how I have been attempting the comparison.)
  2. When using the Gamma, I have low Bulk and Tail ESS for only the Intercept (about 200-300) with an Rhat of 1.01. From shinystan, I get warnings about parameters with < 10% n_eff and high MCSE. Increasing the number of iterations doesn’t seem to do much to raise the ESS (second sketch below). For the lognormal, while the Rhat is about the same, the ESS values were better. The gamma2 parameterization finished faster but had terribly low ESS (all much less than 100), so maybe that parameterization is not the way to go after all.
  3. I did not model (1 | r | g_rep) for both y and shape, even though that is how it is done in the brms docs, because that model would be too complex and overparameterized for my data. However, I don’t believe leaving (1 | r | g_rep) off the shape formula would cause the low ESS for the Intercept. When I did try modeling it (third sketch below), it was indeed overparameterized: the model converged quickly with high ESS for all parameters, but the Est.Error was enormous for all of them (about 2-3 as opposed to about 0.2).
  4. Monotonic effects: I read the paper and the vignette on this. Monotonic terms force monotonic behavior, and wrongly assuming monotonicity has the same effect as any other wrong modeling assumption. Failing to use monotonic effects can also affect ESS, although that is probably not what I’m seeing. I believe my g_size and g_noise should be monotonic, but that is not what I see when modeling them as unordered factors (fourth sketch below). This looks similar to Figure 12 in the paper, but I didn’t understand that explanation very well.
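
A rough sketch of how I have been attempting the prior predictive comparison for point 1, in case it helps: sample from the priors only and look at pp_check for both families. The formula, data (`d`, `log_y`), predictors, and prior values below are simplified placeholders, not my exact model.

```r
library(brms)

# Gaussian on log(y): prior-only draws (log_y = log(y), placeholder column)
pp_gaussian <- brm(
  bf(log_y ~ g_size + g_noise + (1 | g_rep)),
  data = d, family = gaussian(),
  prior = c(prior(normal(0, 2), class = "Intercept"),
            prior(normal(0, 1), class = "b"),
            prior(exponential(1), class = "sigma")),
  sample_prior = "only"
)

# Lognormal on y: same placeholder priors, prior-only draws
pp_lognormal <- brm(
  bf(y ~ g_size + g_noise + (1 | g_rep)),
  data = d, family = lognormal(),
  prior = c(prior(normal(0, 2), class = "Intercept"),
            prior(normal(0, 1), class = "b"),
            prior(exponential(1), class = "sigma")),
  sample_prior = "only"
)

# This is where the lognormal tail blows up for me while the Gaussian looks fine
pp_check(pp_gaussian)
pp_check(pp_lognormal)
```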
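For point 2, this is roughly how I am checking the diagnostics and rerunning with more iterations (`fit_gamma` stands in for my Gamma fit; the exact settings are placeholders):

```r
# Bulk_ESS / Tail_ESS and Rhat per parameter
summary(fit_gamma)

# The n_eff and mcse warnings I mentioned come from here
shinystan::launch_shinystan(fit_gamma)

# Longer run with tighter adaptation; in my case this barely moved the
# Intercept ESS
fit_gamma_long <- update(
  fit_gamma,
  iter = 6000, warmup = 2000,
  control = list(adapt_delta = 0.99)
)
```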
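For point 3, the "full" specification from the docs versus what I actually ran, sketched with placeholder predictors (the (1 | r | g_rep) syntax is brms, the rest is illustrative):

```r
# Docs-style model: correlated group-level terms on both the mean and the
# shape -- this is the version that came out overparameterized for my data
fit_full <- brm(
  bf(y ~ g_size + g_noise + (1 | r | g_rep),
     shape ~ g_size + g_noise + (1 | r | g_rep)),
  data = d, family = Gamma(link = "log")
)

# What I actually ran: group-level term on the mean only, constant shape
fit_reduced <- brm(
  bf(y ~ g_size + g_noise + (1 | g_rep),
     shape ~ 1),
  data = d, family = Gamma(link = "log")
)
```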
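And for point 4, the monotonic version I have been comparing against the unordered-factor version (again a placeholder formula; only g_size, g_noise, and g_rep are my real variable names):

```r
# Treat g_size and g_noise as ordered so mo() applies to them
d$g_size  <- factor(d$g_size,  ordered = TRUE)
d$g_noise <- factor(d$g_noise, ordered = TRUE)

# Monotonic effects instead of unordered-factor (dummy) coding
fit_mono <- brm(
  bf(y ~ mo(g_size) + mo(g_noise) + (1 | g_rep)),
  data = d, family = lognormal()
)

# Check whether the estimated effects actually come out monotonic
conditional_effects(fit_mono)
```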