Horseshoe regularization on matrix factorization not working

There is a short explanation of centered and non-centered parameterization in the Stan User's Guide chapter on Efficiency Tuning. If you search for "centered non-centered parameterization Stan" you'll find several worked examples, e.g. Chapter 11 Hierarchical models and reparameterization | An Introduction to Bayesian Data Analysis for Cognitive Science

Non-centered parameterization works well when the likelihood contributions for the group-specific parameters are weak; centered parameterization works well when they are strong. Since the horseshoe (HS) prior is not shrinking your estimates close to zero, I infer that your likelihood is strong, and thus the centered parameterization is likely to be better.
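To illustrate the distinction (as a NumPy sketch rather than Stan code, with made-up values for the group-level location and scale), the two parameterizations describe the same prior but present different geometry to the sampler:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, tau = 1.0, 0.5  # hypothetical group-level location and scale

# Centered parameterization: sample the group-specific parameter directly.
theta_centered = rng.normal(mu, tau, size=10_000)

# Non-centered parameterization: sample a standard normal "raw" parameter,
# then shift and scale it deterministically.
theta_raw = rng.normal(0.0, 1.0, size=10_000)
theta_noncentered = mu + tau * theta_raw

# Both yield draws from the same Normal(mu, tau) prior; what differs is
# the shape of the posterior the sampler has to explore.
print(theta_centered.mean(), theta_noncentered.mean())
```

With a strong likelihood the data pin down theta directly, so the centered version avoids the funnel-like dependence between theta_raw and tau that the non-centered version would introduce.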

If you are uncertain whether to use m, n*m, or n to scale the global scale, I recommend simulating from the prior to see how each choice affects it. This would be useful anyway, to check whether the HS prior matches your assumptions about what is close to zero.
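A minimal prior-simulation sketch in NumPy (the matrix dimensions, the 1/sqrt(·) scaling of the global scale, and the 0.1 "close to zero" threshold are all illustrative assumptions, not a recommendation):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 100, 20  # hypothetical matrix dimensions

def horseshoe_draws(tau0, size, n_draws=4000):
    """Simulate horseshoe prior draws:
    beta ~ Normal(0, tau * lambda), lambda ~ HalfCauchy(0, 1), tau ~ HalfCauchy(0, tau0)."""
    tau = np.abs(tau0 * rng.standard_cauchy(n_draws))
    lam = np.abs(rng.standard_cauchy((n_draws, size)))
    return rng.normal(0.0, tau[:, None] * lam)

# Compare candidate scalings of the global scale by looking at how much
# prior mass each puts near zero.
for denom, label in [(m, "m"), (n, "n"), (n * m, "n*m")]:
    draws = horseshoe_draws(tau0=1.0 / np.sqrt(denom), size=50)
    frac_small = np.mean(np.abs(draws) < 0.1)
    print(f"scale 1/sqrt({label}): fraction |beta| < 0.1 ~ {frac_small:.2f}")
```

Dividing by the larger count concentrates more prior mass near zero, so this kind of plot or summary makes the practical difference between the three choices visible before any data are involved.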

Unfortunately, I don’t have strong advice on how to use HS in matrix factorization. The issue is complicated, as the elements are not fully exchangeable and you would like to take the structure into account. I remember seeing alternative sparsity priors for matrix factorization presented in the literature, but can’t recall the exact references at the moment. Based on the results you reported here, if the likelihood really is that informative, then the choice between different sparsifying priors should not matter much. To move forward, I think the next useful steps are prior simulation and testing what happens as you make the global scale smaller and smaller. You could also use priorsense (see Priorsense 1.0 is now on CRAN) to check how sensitive the results are to your prior choices.