Looking for advice regarding divergent transitions

I have fitted a logistic regression model with LASSO regularization as follows:

data {
  int<lower=1> N;
  int<lower=1> K1;
  int<lower=0,upper=1> y1[N];
  matrix[N,K1] x1;
}

parameters {
  real alpha1;
  vector[K1] beta1_tilde;
  vector<lower=0>[K1] tau_tilde;
  real<lower=0> lambda;
}
transformed parameters {
  vector[K1] beta1 = beta1_tilde .* tau_tilde * (lambda^2) / 2;
}
model {
  beta1_tilde ~ normal(0, 1);
  tau_tilde ~ exponential(1);
  lambda ~ cauchy(0, 1);
  alpha1 ~ normal(0, 100);

  y1 ~ bernoulli_logit_glm(x1, alpha1, beta1);
}

But I am getting the divergent transitions warning, so I increased the adapt_delta parameter:


stan(file = "logistic_LASSO2.stan", data = data_stan,
     control = list(adapt_delta = 0.99999, max_treedepth = 15),
     iter = 2500, chains = 4)

I am still getting some divergent transitions (5 in this run). This is the pairs plot for the parameter lambda and the first regression coefficient.

The divergent transitions appear to be scattered randomly. Can anybody advise how to improve the results so that I can get rid of this divergent transitions warning?

Thank you.

I am very much a novice, so be careful with my advice. I suggest you try much tighter priors on lambda and alpha1. The fat tails of the Cauchy can cause problems, and the very broad normal on alpha1 allows values that do not seem likely in a logistic regression. As an experiment, try normal(0, 3) or something like that on both of them.
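The point about the very broad normal on alpha1 can be checked with a quick prior simulation. The following is a minimal sketch in base R, not part of the original model: it draws intercepts from normal(0, 100) and from normal(0, 3), maps them through the inverse logit, and reports how often the implied baseline probability falls below 1% or above 99%. The seed, the number of draws, the 1%/99% cut-offs, and the helper name extreme are illustrative assumptions.

```r
# Prior simulation for the intercept only (base R, no packages needed).
set.seed(1)        # arbitrary seed, for reproducibility
n_draws <- 10000   # illustrative number of prior draws

# Draw intercepts from the two candidate priors and map them to probabilities.
p_wide  <- plogis(rnorm(n_draws, mean = 0, sd = 100))  # alpha1 ~ normal(0, 100)
p_tight <- plogis(rnorm(n_draws, mean = 0, sd = 3))    # alpha1 ~ normal(0, 3)

# Share of draws whose implied baseline probability is below 1% or above 99%.
# (extreme is a hypothetical helper name.)
extreme <- function(p) mean(p < 0.01 | p > 0.99)
frac_wide  <- extreme(p_wide)
frac_tight <- extreme(p_tight)

print(c(wide = frac_wide, tight = frac_tight))
```

Under normal(0, 100) well over 90% of the prior mass sits on baseline probabilities that are essentially 0 or 1, while normal(0, 3) puts only something on the order of 10 to 15% there. This only illustrates what the prior implies; it does not show by itself that the prior causes the divergences.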

Driving adapt_delta so close to one is probably not a useful approach, even though the warning message suggests raising it. From what I have read on this forum, a well-specified model should not need such a high value.

@FJCC Thank you for your valuable advice. I will try the priors you suggested for the intercept term, and I will keep your point about adapt_delta in mind. Is there a different parameterization I could have used for the beta coefficients? I am trying several regularization priors, and unfortunately almost every one of them gives this divergent transitions warning.

I tried tighter priors, but didn’t have any luck :(

I tried several parameterizations. One of them is as follows:

transformed parameters {
  //vector[K1] tau = tau_tilde *(lambda^2)/2;
  real<lower=0> lambda2 = lambda / sqrt(2);
  vector[K1] tau = tau_tilde * lambda2;
  vector[K1] beta1 = beta1_tilde .* tau;
}
model {
  beta1_tilde ~ normal(0, 1);
  tau_tilde ~ exponential(1);
  lambda ~ normal(0, 5);
  alpha1 ~ normal(0, 5);

  y1 ~ bernoulli_logit_glm(x1, alpha1, beta1);
}

With the default value of adapt_delta, my pairs plots look like this:
The green dots correspond to divergent transitions.

Does this imply that there is something wrong with the priors for the other coefficients as well?

I wish I could dive into this, but I got a bunch of “real” work dumped on me today and I can’t take the time. Two quick suggestions:

Try building some toy data with just a few parameters. Maybe two parameters that contribute to the output and one that is just noise. Can the model recover the correct values?
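The toy-data suggestion above could be set up roughly as follows. This is a minimal sketch in base R: the sample size, the true coefficient values, the seed, and the column names are all made up for illustration. The list uses the same element names (N, K1, y1, x1) as the data block of the Stan program in this thread, so it could be passed to stan() in place of the real data.

```r
# Simulate a small logistic-regression data set with known coefficients.
set.seed(42)   # arbitrary seed, for reproducibility
N <- 500       # illustrative sample size

# Two predictors that drive the outcome and one that is pure noise.
x1 <- cbind(signal1 = rnorm(N), signal2 = rnorm(N), noise = rnorm(N))

true_alpha <- -0.5             # illustrative intercept
true_beta  <- c(1.5, -1.0, 0)  # third coefficient is exactly zero

# Linear predictor and Bernoulli outcomes.
eta <- true_alpha + as.vector(x1 %*% true_beta)
y1  <- rbinom(N, size = 1, prob = plogis(eta))

# Data list with the element names used by the Stan program in this thread.
toy_data <- list(N = N, K1 = ncol(x1), y1 = as.integer(y1), x1 = x1)

# Quick sanity check with a plain maximum-likelihood fit before running Stan.
ml_fit <- glm(y1 ~ x1, family = binomial())
print(round(coef(ml_fit), 2))
```

If the Stan fit on toy_data still shows divergent transitions, or the posterior for the coefficients is far from true_beta, that points at the model specification rather than at the real data.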

Can you post your data somewhere? If it is not too big, I would be willing to play with it should I get the time. How big is the data set?

I hope one of the experts will see this thread and provide more guidance.

@FJCC

Hi, thank you for reaching out to help me. I cannot share my data for privacy reasons, but I have a toy data set. When I apply the same model to that data, I get the same issues.
Here I have uploaded my data and code.
diabetes.csv (23.3 KB) logistic_model_reg_LASSO4.stan (654 Bytes)

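library(dplyr)
library(rstan)  # provides stan()
student.mat <- read.csv("C:/Users/diabetes.csv")
str(student.mat)
# Keep the binary outcome and the two predictors.
stu_data <- student.mat %>% select(Outcome, Glucose, BMI)
# Pass the predictors as a numeric matrix to match matrix[N,K1] x1 in the Stan program.
x1 <- as.matrix(stu_data %>% select(-Outcome))
stu_data_stan_ridge <- list(N = nrow(stu_data), K1 = ncol(x1),
                            y1 = stu_data$Outcome, x1 = x1)

stufit_reg_LASSO4 <- stan(file = "logistic_model_reg_LASSO4.stan",
                          data = stu_data_stan_ridge,
                          iter = 2000, chains = 4)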

Please check this when you have time. Thank you once again.

Could this have something to do with an incorrect transformation of the variables?
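One way to probe the transformation question is a simulation check. The following is a minimal sketch in base R, assuming the intended prior is the Laplace (double-exponential) prior of the Bayesian LASSO written as a scale mixture of normals, which the thread does not state explicitly. Under that assumption, a variance tau2 ~ exponential(lambda^2 / 2) with beta | tau2 ~ normal(0, sqrt(tau2)) gives beta a Laplace distribution with scale 1 / lambda, so E|beta| = 1 / lambda and Var(beta) = 2 / lambda^2. The value lambda = 2, the seed, and the names beta_mix and beta_thread are illustrative.

```r
# Compare two ways of building beta from standard draws (base R only).
set.seed(123)   # arbitrary seed, for reproducibility
n      <- 200000
lambda <- 2     # illustrative fixed value of the LASSO parameter

tau_tilde  <- rexp(n, rate = 1)  # tau_tilde ~ exponential(1)
beta_tilde <- rnorm(n)           # beta_tilde ~ normal(0, 1)

# Assumed target: variance tau2 = 2 * tau_tilde / lambda^2 is exponential with
# rate lambda^2 / 2, and beta = beta_tilde * sqrt(tau2).
beta_mix <- beta_tilde * sqrt(2 * tau_tilde) / lambda

# Transformation used in the first Stan program of this thread.
beta_thread <- beta_tilde * tau_tilde * lambda^2 / 2

# Moments of a Laplace(0, 1 / lambda) distribution, for reference.
target <- c(mean_abs = 1 / lambda, variance = 2 / lambda^2)

print(rbind(
  target = target,
  mix    = c(mean(abs(beta_mix)), var(beta_mix)),
  thread = c(mean(abs(beta_thread)), var(beta_thread))
))
```

If the moments of beta_thread disagree with the target row while those of beta_mix agree, the transformed parameters block defines a different prior from the Laplace one. Whether that is related to the divergent transitions is a separate question that this check does not answer.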