My sense is that it would be difficult since they are not actually trying to estimate the Phi matrix in Equation 3 of the paper. Rather, they are simulating random values of Phi and then normalizing it. My recollection is that I can’t use random variables this way in Stan.
It does seem like it’s something you could code. I read far enough to see that Phi is constructed from discrete elements (psi^1/2, 0, or -psi^1/2) which could be constructed via a mixture if you were willing to make psi a parameter. No idea how well that would work in practice.
They say in the paper that Psi is a parameter, but I don’t think Phi is. I
didn’t think you could do something like that where you randomly simulate a
variable in Stan and use that in a model. I figured you would need to
marginalize it out or something (which I wouldn’t know how to do).
It would be a transformed parameter and you would need to marginalize but it would look more or less like this:
parameters {
  real<lower=0, upper=1> psi; // bounded since psi^2 acts as a mixture weight below
}
transformed parameters {
  matrix[N, P] phi;
  for (i in 1:N) {
    for (p in 1:P) {
      phi[i, p] = psi^2 * (-sqrt(psi)) + (1 - psi^2) * sqrt(psi);
    }
  }
  phi = gram_schmidt(phi); // that would need implementing; it's useful
                           // for other models too and I've seen an implementation
                           // that looked auto-diff friendly.
}
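For what it's worth, here's a plain-Python sketch (not the implementation I mentioned, and not Stan) of what a `gram_schmidt` routine would do, using modified Gram-Schmidt. The function name matches the placeholder above; everything else is my own illustration:

```python
import numpy as np

def gram_schmidt(phi):
    """Orthonormalize the columns of phi via modified Gram-Schmidt.
    No pivoting or branching, so an autodiff translation should be
    straightforward as long as no column collapses to (near) zero."""
    phi = np.array(phi, dtype=float, copy=True)
    n, p = phi.shape
    for j in range(p):
        # Subtract the projection onto each previously orthonormalized column.
        for k in range(j):
            phi[:, j] -= (phi[:, k] @ phi[:, j]) * phi[:, k]
        # Normalize the remainder to unit length.
        phi[:, j] /= np.linalg.norm(phi[:, j])
    return phi

# Quick check: the columns come out orthonormal.
rng = np.random.default_rng(1)
Q = gram_schmidt(rng.normal(size=(6, 3)))
print(np.allclose(Q.T @ Q, np.eye(3)))  # True
```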
My first thought was that the formula on page 60 of the presentation (also stated in the paragraph between Equations 3 and 4 of the paper) looks different, but this is the type of thing I don’t always have a good intuitive grasp of, and I would need to play around with it to be sure.
And from what I know about Stan, it makes sense that the geometry could be atrocious given that formula.
I wrote up an R script to show the difference between the two (at least I
think I modified yours to match it correctly; the presentation and the
paper have different formulas). Phi2 below (using your formula) seems much
better behaved. Phi is like a mixture between a discrete variable and
another mixture of two continuous distributions. I can imagine Stan would
not like it.
I put two versions below; they both produce about the same histograms,
normal-looking and centered around 0.25. The phi and phi2 functions look a
bit different.
newphi <- rep(0, N)
for (i in 1:N) {
  psi <- 0.1 + runif(N) * 0.9
  newphi <- newphi + sample_phi(psi, sample_phi_i)
}
newphi <- newphi / N
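For reference, here is a Python mirror of that averaging loop. The mixture weights in `sample_phi` are my own assumption read off the expected-value expression in the Stan sketch (each entry is -sqrt(psi) with probability psi^2, +sqrt(psi) otherwise), so treat them as placeholders rather than the paper's stated weights:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_phi(psi):
    # Two-point draw per entry: -sqrt(psi) with probability psi^2,
    # +sqrt(psi) otherwise. These weights are assumed, not taken from
    # the paper.
    return np.where(rng.random(psi.shape) < psi**2, -np.sqrt(psi), np.sqrt(psi))

N = 1000
newphi = np.zeros(N)
for i in range(N):
    psi = 0.1 + rng.random(N) * 0.9  # same psi draw as the R script
    newphi = newphi + sample_phi(psi)
newphi = newphi / N

# The overall mean should approach E[sqrt(psi) * (1 - 2 * psi^2)]
# under psi ~ Uniform(0.1, 1).
print(newphi.mean())
```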
More seriously, the question is how changes in psi affect the posterior distribution of the betas, but answering that would take more effort since you’d need the whole Gram-Schmidt thing coded.
Before you do anything else with this I would suggest taking a look at the normal mixture section of the manual where the implementation of marginalization is discussed. Once that makes sense I think this will too.
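To make the marginalization idea concrete outside of Stan: this Python snippet (my own illustration, with made-up names) computes a two-component normal mixture log-likelihood with the discrete component indicator summed out, which is the same `log_mix` / `log_sum_exp` pattern the manual walks through:

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import norm

def mixture_loglik(y, theta, mu1, mu2, sigma):
    """Two-component normal mixture log-likelihood with the discrete
    component indicator marginalized out: for each y_n compute
    log( theta * N(y_n | mu1, sigma) + (1 - theta) * N(y_n | mu2, sigma) )
    via log-sum-exp for numerical stability, then sum over n."""
    lp1 = np.log(theta) + norm.logpdf(y, mu1, sigma)     # component 1
    lp2 = np.log1p(-theta) + norm.logpdf(y, mu2, sigma)  # component 2
    return logsumexp(np.stack([lp1, lp2]), axis=0).sum()
```

The same trick would in principle marginalize the discrete entries of Phi, except there the latent draws sit inside the Gram-Schmidt step, which is what makes it painful.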