# Regularized horseshoe prior for latent factor-loadings matrix? (in latent factor analysis)

Hi, I found various discussions on prior selection for problems like this, but I could not find a clear answer, especially regarding the horseshoe prior for latent factor analysis.

The main reason I am looking into this is to find a substitute for the Bhattacharya & Dunson (2011) prior that I can implement in Stan.

Piironen & Vehtari (2017) was very helpful, but I got stuck extending the regularized horseshoe prior to a prior on the factor-loadings matrix.

For example, a simple latent factor model can be described as

\mathbf{y}_i = \mathbf{\Lambda} \mathbf{\eta}_i + \mathbf{\epsilon}_i, \quad \mathbf{\epsilon}_i \sim N_p(\mathbf{0}, \mathbf{\Sigma})

where \mathbf{y}_i = [y_{i1}, ..., y_{ip}]^T, \mathbf{\eta}_i is a vector of k latent factors, \mathbf{\Lambda} is a p \times k factor-loadings matrix, and \mathbf{\Sigma} = \mathrm{diag}(\sigma_1^2, ..., \sigma_p^2).
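For concreteness, the unconstrained model above (no shrinkage prior, no identifiability constraints yet) might be sketched in Stan like this; the names `N`, `P`, `K` and the placeholder priors are my own choices:

```stan
data {
  int<lower=1> N;              // observations
  int<lower=1> P;              // observed dimensions
  int<lower=1> K;              // latent factors
  matrix[N, P] y;
}
parameters {
  matrix[P, K] Lambda;         // factor loadings
  matrix[N, K] eta;            // latent factors
  vector<lower=0>[P] sigma;    // residual scales
}
model {
  to_vector(Lambda) ~ normal(0, 1);   // placeholder; this is what the horseshoe would replace
  to_vector(eta) ~ normal(0, 1);
  sigma ~ cauchy(0, 2.5);
  for (n in 1:N)
    y[n] ~ normal(eta[n] * Lambda', sigma');
}
```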

My plan was to combine multiple vectors generated by the horseshoe() function (as implemented in brms)

```stan
functions {
  vector horseshoe(vector zb, vector[] local, real[] global,
                   real scale_global, real c2) {
    int K = rows(zb);
    vector[K] lambda = local[1] .* sqrt(local[2]);
    vector[K] lambda2 = square(lambda);
    real tau = global[1] * sqrt(global[2]) * scale_global;
    vector[K] lambda_tilde = sqrt(c2 * lambda2 ./ (c2 + tau^2 * lambda2));
    return zb .* lambda_tilde * tau;
  }
}
```


in a column-wise fashion, like this (the dimension of sigma is not appropriate here):

```stan
for (k in 1:K) {
  factor_loading[:, k] = horseshoe(zb[k], hs_local[k], hs_global[k],
                                   hs_scale_global * sigma,  // sigma is a vector here, see question 2
                                   hs_scale_slab^2 * hs_c2[k]);
}
```
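One way to at least make the dimensions line up (just a sketch on my part, not necessarily statistically sensible: tau is computed without sigma, and the rows of the loadings are rescaled by sigma afterwards) would be:

```stan
for (k in 1:K) {
  // per-column regularized horseshoe, with no sigma term inside tau
  factor_loading[:, k] = horseshoe(zb[k], hs_local[k], hs_global[k],
                                   hs_scale_global, hs_scale_slab^2 * hs_c2[k]);
}
// rescale row p of the loadings by sigma[p]
factor_loading = diag_pre_multiply(sigma, factor_loading);
```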


I am wondering:
1. Is it valid to do so to generate a factor-loadings matrix?
2. In this case, how can I deal with sigma, which is now a vector, when calculating tau?

Hi!

I did not find a solution and abandoned the idea; I am fine with a limited number of factors. Here are the thoughts I had about it:

By implementing such a prior, we remove the constraints on the loading matrix that make it identifiable (basically, zero loadings above the diagonal and a positive diagonal). I read some articles in which these constraints were relaxed a little, but I did not find them really interesting.
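For reference, the usual constrained construction (zeros above the diagonal, positive diagonal, assuming P >= K) can be written in Stan roughly like this; the parameter names are illustrative:

```stan
parameters {
  vector<lower=0>[K] Lambda_diag;                  // positive diagonal
  vector[P * K - (K * (K + 1)) / 2] Lambda_lower;  // strict lower triangle
}
transformed parameters {
  matrix[P, K] Lambda = rep_matrix(0, P, K);       // upper triangle stays zero
  {
    int pos = 1;
    for (k in 1:K) {
      Lambda[k, k] = Lambda_diag[k];
      for (p in (k + 1):P) {
        Lambda[p, k] = Lambda_lower[pos];
        pos += 1;
      }
    }
  }
}
```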

I have the strong feeling that pushing loadings toward zero in an unsupervised way would lead to a non-identifiable model: multiple loading matrices with different patterns of zeros would be equivalent without constraints. In the Bhattacharya and Dunson paper, the higher the index of a loading column, the more strongly it was shrunk.
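A rough Stan sketch of that multiplicative gamma process idea (my own translation of the paper, as a program fragment with P and K from data; the hyperparameters a1 = 2, a2 = 3 are illustrative, and a2 > 1 is what makes shrinkage increase with the column index):

```stan
parameters {
  matrix[P, K] Lambda_raw;
  matrix<lower=0>[P, K] phi;    // local precisions
  vector<lower=0>[K] delta;     // multiplicative components
}
transformed parameters {
  // tau[k] = prod(delta[1:k]): column precisions grow with the column index
  vector[K] tau = exp(cumulative_sum(log(delta)));
  matrix[P, K] Lambda;
  for (k in 1:K)
    Lambda[:, k] = Lambda_raw[:, k] ./ sqrt(phi[:, k] * tau[k]);
}
model {
  to_vector(Lambda_raw) ~ normal(0, 1);
  to_vector(phi) ~ gamma(1.5, 1.5);   // Gamma(nu/2, nu/2) with nu = 3
  delta[1] ~ gamma(2, 1);             // a1
  delta[2:K] ~ gamma(3, 1);           // a2 > 1 for increasing shrinkage
}
```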