How do we ensure random effects have correct distribution?


I am working on a model of lung cancer with patient-level random effects (which are used to determine an acceleration factor). At the moment I have a scale parameter real<lower=0> re_sigma; and then random effects parameters vector[N] re_z;. I then use in my model block

model {
  re_sigma ~ rnorm(0, 0.2);
  re_z ~ rnorm(0, 1);
  vector[N] re_t = exp(re_sigma * re_z);

I haven’t yet had a chance to run this model with Hamiltonian Monte Carlo, but I ran with optimisation (to find posterior mode) and the random effects re_z do not follow a standard normal distribution (final iteration, across ~27k random effects). Instead they are very clustered near 0, with standard deviation < 0.001. This naturally then also affects the inference for re_sigma.

When I run HMC will these actually follow a standard normal distribution (within each sample, across ~27k random effects), or do I need to do something else to stop them tending to 0? If so, what would that be? e.g. sum(re_z .^ 2) ~ chi_square(N); (and would I need to add a Jacobian adjustment for this too?)

Many thanks

I’m not super familiar with how optimization mode works, but with one iteration and such a high dimensional parameter space, it’s not surprising that the estimates stayed close to the initialisation point, which is a uniform distribution centered on zero.

Thanks @mike-lawrence.

Apologies for not being clear - it ran for 634 iterations, but I am referring to the distribution on the final iteration.

In this case I did initialise them with random standard normal variates, and they then all got pushed back to zero. I think it’s just because that is the mode of the distribution for each individual random effect?

Do you have any intuition about whether when I run with HMC they will instead follow the standard normal distribution as expected?

Many thanks

Using HMC will permit propagation of uncertainty, but that would only be noticeable in quantities/visualizations computed across iterations. See here for discussion of approaches to checking hierarchical distribution structure.

Thanks @mike-lawrence. For now I have decided to scrap including these random effects as parameters and instead will marginalise over random effects using Gauss-Hermite quadrature. Fingers crossed it works!

You can’t reliably optimise the population and subject parameters at once, hmc estimates may look very different.

1 Like