Divergent transitions after warmup for logistic regression model with horseshoe priors

Hello everyone,
I am trying to fit a logistic regression model with horseshoe regularization priors.
The model runs without errors; however, I am getting some warning messages.


This is the Stan model I ran:

data {
  int<lower=1> N;
  int<lower=1> K1;
  int<lower=0,upper=1> y1[N];
  matrix[N,K1] x1;
}

parameters {
  real alpha1;
  vector[K1] beta1_tilde;
  vector<lower=0>[K1] lambda;
  real<lower=0> tau_tilde;
}
transformed parameters {
  vector[K1] beta1 = beta1_tilde .* lambda * tau_tilde;
}
model {
  beta1_tilde ~ normal(0, 100);
  lambda ~ cauchy(0, 1);
  tau_tilde ~ cauchy(0, 1);
  alpha1 ~ normal(0, 100);

  y1 ~ bernoulli_logit_glm(x1, alpha1, beta1);
}

I am getting the following warning messages:

There were 21 divergent transitions after warmup. See
Brief Guide to Stan’s Warnings
to find out why this is a problem and how to eliminate them.
Examine the pairs() plot to diagnose sampling problems

I went through some discussions on this forum and found suggested solutions such as increasing adapt_delta and max_treedepth. I tried that, but it didn't work for my problem:

stan(file = "logistic_model.stan", data = data, iter = 2000, chains = 4,
     control = list(adapt_delta = 0.99, max_treedepth = 15))

It would be great if somebody could help me fix this. Thank you very much.


Update :
The pairs plot for two of the predictors looks like this (plot attached):


Update :
I tried the following parameterization, which was mentioned here:

But it didn’t work.

data {
  int<lower=1> N;
  int<lower=1> K1;
  int<lower=0,upper=1> y1[N];
  matrix[N,K1] x1;

  real<lower=1> nu_global;
  real<lower=1> nu_local;
}

parameters {
  real alpha1;
  vector[K1] beta1_tilde;
  real<lower=0> r1_global;
  real<lower=0> r2_global;
  vector<lower=0>[K1] r1_local;
  vector<lower=0>[K1] r2_local;
}
transformed parameters {
  real<lower=0> tau;
  vector<lower=0>[K1] lambda;
  vector[K1] beta1;
  lambda = r1_local .* sqrt(r2_local);
  tau = r1_global * sqrt(r2_global);
  beta1 = beta1_tilde .* lambda * tau;
}
model {
  beta1_tilde ~ normal(0, 1);
  r1_local ~ normal(0, 1);
  r2_local ~ inv_gamma(0.5 * nu_local, 0.5 * nu_local);

  r1_global ~ normal(0, 1);
  r2_global ~ inv_gamma(0.5 * nu_global, 0.5 * nu_global);
  alpha1 ~ normal(0, 100);

  y1 ~ bernoulli_logit_glm(x1, alpha1, beta1);
}

Hi! Sorry for taking so long to reply!

I’ve only used the horseshoe in connection with brms, which automagically reparameterizes the model for you. In this case, let’s hope @avehtari or @paul.buerkner can help us out.


The code shows the original horseshoe prior, but it would be better to use the regularized horseshoe, especially with a Bernoulli model. See the discussion of why the original horseshoe prior has problems with Bernoulli observations, the description of the more robust regularized horseshoe prior, and Stan code for it in Sparsity information and regularization in the horseshoe and other shrinkage priors. brms uses the regularized horseshoe and the parameterization shown in appendix C.1 of that paper.
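For concreteness, here is a sketch of what the regularized horseshoe could look like for the model in this thread, adapted along the lines of appendix C.1 of that paper. The variable names follow this thread; scale_global (a guess at tau's scale based on the expected number of nonzero coefficients), slab_scale, and slab_df are extra data inputs you would have to choose, so treat this as a starting point rather than the exact brms parameterization:

```stan
data {
  int<lower=1> N;
  int<lower=1> K1;
  int<lower=0,upper=1> y1[N];
  matrix[N,K1] x1;
  real<lower=0> scale_global;  // scale for the half-t prior on tau
  real<lower=1> nu_global;     // df for the half-t prior on tau
  real<lower=1> nu_local;      // df for the half-t priors on lambdas
  real<lower=0> slab_scale;    // slab scale for the regularization
  real<lower=0> slab_df;       // slab degrees of freedom
}
parameters {
  real alpha1;
  vector[K1] z;
  real<lower=0> tau;
  vector<lower=0>[K1] lambda;
  real<lower=0> caux;
}
transformed parameters {
  real<lower=0> c = slab_scale * sqrt(caux);  // slab width
  vector<lower=0>[K1] lambda_tilde;           // regularized local scales
  vector[K1] beta1;
  lambda_tilde = sqrt(c^2 * square(lambda) ./ (c^2 + tau^2 * square(lambda)));
  beta1 = z .* lambda_tilde * tau;            // non-centered parameterization
}
model {
  z ~ normal(0, 1);
  lambda ~ student_t(nu_local, 0, 1);
  tau ~ student_t(nu_global, 0, scale_global);
  caux ~ inv_gamma(0.5 * slab_df, 0.5 * slab_df);
  alpha1 ~ normal(0, 5);

  y1 ~ bernoulli_logit_glm(x1, alpha1, beta1);
}
```

With nu_local = 1 the local priors are half-Cauchy as in the original horseshoe, but the slab (c, controlled by slab_scale and slab_df) caps how large any beta1 can get, which is what tames the funnel-shaped posterior geometry that causes divergences with Bernoulli data.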
