What is a suitable prior for latent factor loadings?


A year or two ago, I got help here setting up a model with latent factor loadings.

Now I’m running into trouble trying to validate the model using simulation-based calibration (SBC). I used a normal(0,10) prior for the factor loadings. However, this prior isn’t a good expression of my prior belief. A more realistic prior would be similar to lognormal(0,1), but allowing for an arbitrary sign. As it is now, with normal(0,10), the density is concentrated near zero. This trips up SBC because the factor loadings drawn from the prior are near zero, which makes them difficult to recover. I tend to get lots of divergences because the simulated data have such a weak factor structure.

I tried using a lognormal(abs(loading) | 0,1) prior, but this just makes sampling impossible because each loading’s sign is fixed by its random starting value. The density vanishes near zero, so a loading cannot switch between negative and positive.
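To make the barrier at zero concrete, here is a minimal Python sketch (not my Stan model; the function name `signed_lognormal_logpdf` is made up for illustration) of the log density implied by putting a lognormal prior on the absolute value and giving the sign a fair coin flip. It assumes mu = 0 and sigma = 1 as in lognormal(0,1).

```python
import math

def signed_lognormal_logpdf(x, mu=0.0, sigma=1.0):
    """Log density of x when |x| ~ lognormal(mu, sigma) and the sign is +/- with probability 1/2."""
    if x == 0.0:
        # The lognormal density is zero at the origin, so the log density is -inf.
        return -math.inf
    a = abs(x)
    z = (math.log(a) - mu) / sigma
    return (
        math.log(0.5)                      # equal mass on each sign
        - math.log(a * sigma)              # 1 / (|x| * sigma) term of the lognormal
        - 0.5 * math.log(2.0 * math.pi)
        - 0.5 * z * z
    )

# The log density is symmetric in the sign and falls off steeply as |x| -> 0.
for x in (1.0, 0.1, 1e-3, 1e-6):
    print(x, signed_lognormal_logpdf(x), signed_lognormal_logpdf(-x))
```

The two modes are mirror images separated by a region of vanishing density, which is why a chain that starts on one side tends to stay there.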

What is the recommended way to handle this situation?

Thank you.


I had an idea.

I used the unreleased SBC branch of rstan to test whether I can generate data using lognormal*(2*bernoulli-1) and then recover the parameter using normal(0,3). It seems to work! The SBC rank distribution looks just as uniform as when I use normal(0,3) for both generating and fitting.
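For anyone who wants to see what the generating side looks like without opening the attachments, here is a rough Python sketch of the lognormal*(2*bernoulli-1) draw. This is an illustration only, not the attached Stan or R code; the helper name `draw_signed_lognormal`, the seed, and the 0.05 cutoff are arbitrary choices of mine.

```python
import random

def draw_signed_lognormal(rng, mu=0.0, sigma=1.0):
    """Draw magnitude ~ lognormal(mu, sigma) and multiply by a random sign (2*bernoulli(0.5) - 1)."""
    magnitude = rng.lognormvariate(mu, sigma)
    bernoulli = 1 if rng.random() < 0.5 else 0
    return (2 * bernoulli - 1) * magnitude

rng = random.Random(1)  # fixed seed so the sketch is reproducible
n = 10000
signed_draws = [draw_signed_lognormal(rng) for _ in range(n)]
normal_draws = [rng.gauss(0.0, 3.0) for _ in range(n)]

# Compare how often each generator produces a loading very close to zero.
cutoff = 0.05
frac_signed = sum(abs(d) < cutoff for d in signed_draws) / n
frac_normal = sum(abs(d) < cutoff for d in normal_draws) / n
frac_positive = sum(d > 0 for d in signed_draws) / n

print("fraction near zero, signed lognormal:", frac_signed)
print("fraction near zero, normal(0,3):     ", frac_normal)
print("fraction positive, signed lognormal: ", frac_positive)
```

The point of the comparison is that the signed lognormal rarely generates loadings close to zero, while both signs still occur about equally often.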

Here is the Stan model (528 Bytes) and R code (1017 Bytes). Is this kosher?