Hi,

I am trying to fit the attached hierarchical model, with an asymmetric link for binary classification, to some text categorization data in Stan.

I am interested in using shrinkage priors such as the Laplace, the horseshoe and others, with a flexible structure (as normal scale mixtures), as used in "Hierarchical Bayesian Survival Analysis and Projective Covariate Selection in Cardiovascular Event Risk Prediction" (http://ceur-ws.org/Vol-1218/bmaw2014_paper_8.pdf) and, for the horseshoe case, as done in the paper “Sparsity information and regularization in the horseshoe and other shrinkage priors” (https://arxiv.org/abs/1707.01694).
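
For reference, here is a sketch of the normal scale-mixture form of these priors. The notation ($\beta_j$ for coefficients, $\lambda_j$ for local scales, $\tau$ for the global scale, $b$ for the Laplace rate, $c$ for the slab scale) is generic and assumed here; it is not taken from model.txt.

```latex
% Laplace prior as a scale mixture of normals (exponential mixing on the variance)
\beta_j \mid \psi_j \sim \mathcal{N}(0, \psi_j), \qquad
\psi_j \sim \mathrm{Exponential}\!\left(\tfrac{b^2}{2}\right)
\;\Longrightarrow\;
p(\beta_j) = \tfrac{b}{2} \exp\!\left(-b\,|\beta_j|\right)

% Horseshoe prior (half-Cauchy mixing on the local scale)
\beta_j \mid \lambda_j, \tau \sim \mathcal{N}\!\left(0, \tau^2 \lambda_j^2\right), \qquad
\lambda_j \sim \mathrm{C}^{+}(0, 1)

% Regularized horseshoe (arXiv:1707.01694): local scales are softly truncated by a slab of scale c
\beta_j \mid \lambda_j, \tau, c \sim \mathcal{N}\!\left(0, \tau^2 \tilde{\lambda}_j^2\right), \qquad
\tilde{\lambda}_j^2 = \frac{c^2 \lambda_j^2}{c^2 + \tau^2 \lambda_j^2}
```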

I have n = 5485 observations and k = 17388 predictors.

X is the tf-idf matrix of the documents (as used in text categorization), and it is very sparse. Its rows represent documents and its columns represent words. The values in the matrix are normalized.

When fitting the attached model, I get a lot of divergent transitions. Shrinkage does work, however, leaving between 9 and 15 non-zero coefficients (the horseshoe always outperforms the Laplace prior).
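
As a rough illustration of how the expected number of non-zero coefficients could inform the horseshoe global scale, here is a minimal Python sketch of the rule of thumb from the "Sparsity information" paper, tau0 = p0 / (D - p0) * sigma / sqrt(n). The function name `global_scale` and the default `sigma=1.0` are assumptions for illustration only; the appropriate pseudo-sigma for an asymmetric link is not derived here.

```python
import math


def global_scale(p0, D, n, sigma=1.0):
    """Rule-of-thumb global scale tau0 = p0 / (D - p0) * sigma / sqrt(n).

    p0    : prior guess for the number of non-zero coefficients
    D     : total number of predictors
    n     : number of observations
    sigma : (pseudo) noise scale; 1.0 is only a placeholder assumption
    """
    if not 0 < p0 < D:
        raise ValueError("p0 must satisfy 0 < p0 < D")
    if n <= 0:
        raise ValueError("n must be positive")
    return p0 / (D - p0) * sigma / math.sqrt(n)


if __name__ == "__main__":
    # Dimensions from the post; p0 spans the 9 to 15 non-zero coefficients observed.
    for p0 in (9, 12, 15):
        print(p0, global_scale(p0, D=17388, n=5485))
```

With k this much larger than the expected number of relevant words, the resulting tau0 is very small, which is one reason the posterior geometry can be hard for the sampler.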

I tried to reduce the number of divergent transitions by setting adapt_delta=0.9999 and by reparameterizing the model, but neither was enough.
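
To make "reparameterizing" concrete, here is a minimal sketch of the non-centered form of the horseshoe, written in plain Python rather than Stan. It is not the code in model.txt; the names `half_cauchy`, `horseshoe_noncentered` and `tau0` are hypothetical. The idea is that the sampler works with standard-normal variables z_j and the scales separately, and the coefficients are a deterministic transform, beta_j = z_j * tau * lambda_j.

```python
import math
import random


def half_cauchy(rng, scale=1.0):
    """Draw from a half-Cauchy(0, scale) using its inverse CDF."""
    u = rng.random()  # u in [0, 1)
    return scale * math.tan(math.pi * u / 2.0)


def horseshoe_noncentered(k, tau0, seed=0):
    """Prior draw of k horseshoe coefficients in non-centered form.

    tau0 is an assumed scale for the global parameter tau.
    """
    rng = random.Random(seed)
    tau = half_cauchy(rng, tau0)                 # global scale
    z = [rng.gauss(0.0, 1.0) for _ in range(k)]  # standard-normal "raw" coefficients
    lam = [half_cauchy(rng) for _ in range(k)]   # local scales
    # Deterministic transform: beta_j = z_j * tau * lambda_j
    beta = [z_j * tau * lam_j for z_j, lam_j in zip(z, lam)]
    return beta, z, lam, tau


if __name__ == "__main__":
    beta, z, lam, tau = horseshoe_noncentered(k=5, tau0=0.01, seed=1)
    print(beta)
```

In Stan, the analogous structure would declare z, lambda and tau as parameters and compute beta in the transformed parameters block.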

I would appreciate any suggestions to improve the efficiency of the model.

model.txt (1.5 KB)