Divergences in hierarchical binomial model not resolved by non-centered parameterization

I fit a hierarchical binomial model, but about 20% of transitions ended in divergences. I tried increasing adapt_delta, which helped but did not entirely resolve the problem. I then read this guide and wrote a corresponding non-centered model, but it has the same divergence issue. I think I must have made a mistake in my Stan model, or I'm misunderstanding something entirely. Below are my first (centered) and second (non-centered) models.
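
For reference, here is the non-centered pattern I was trying to follow, written out as a minimal self-contained sketch for a generic pooled parameter; the names (J, y, n, theta) are placeholders and are not part of my actual model:

// Minimal sketch of a non-centered hierarchical binomial model
// (placeholder names, not my actual model)
data {
  int<lower=0> J;                 // number of groups
  int<lower=0> y[J];              // successes per group
  int<lower=0> n[J];              // trials per group
}
parameters {
  real mu;                        // population mean (logit scale)
  real<lower=0> tau;              // population scale
  real theta_tilde[J];            // standardized group-level offsets
}
transformed parameters {
  real theta[J];
  for (j in 1:J)
    theta[j] = mu + tau * theta_tilde[j];  // implies theta[j] ~ normal(mu, tau)
}
model {
  mu ~ normal(0, 10);
  tau ~ cauchy(0, 5);
  theta_tilde ~ normal(0, 1);     // unit normal on the standardized offsets
  y ~ binomial(n, inv_logit(theta));
}

Here is my first (centered) model: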

data {
  int<lower=0> ndrivers; // Number of drivers
  int<lower=0> nteams; // Number of teams
  int<lower=0> C[ndrivers]; // Number of failures per driver
  int<lower=0> N[ndrivers]; // Number of races per driver
  int<lower=0> is_b_driver[ndrivers]; // Categorical Variable for A/B driver class
  int team[ndrivers]; // Categorical Variable for team
}

parameters {
  real mu;
  real<lower=0> sigma;
  real alpha[nteams]; // Unique intercept per team
  real beta;  // coefficient for driver class
}
transformed parameters {
  real logit_p[ndrivers];
  real p[ndrivers];
  // Linear predictor
  for (i in 1:ndrivers){
    logit_p[i] = alpha[team[i]] + beta * is_b_driver[i];
  }
  p = inv_logit(logit_p);
}

model {
  // Priors
  mu ~ normal(0, 100);
  sigma ~ cauchy(0, 5);
  alpha[nteams] ~ normal(mu, sigma);
  beta ~ normal(0, 100);
  // Likelihood
  C ~ binomial(N, p);
}

generated quantities {
  real inv_logit_beta; // inverse logit of the driver-class coefficient
  inv_logit_beta = inv_logit(beta);
}

And here is my second (non-centered) model:

data {
  int<lower=0> ndrivers; // Number of drivers
  int<lower=0> nteams; // Number of teams
  int<lower=0> C[ndrivers]; // Number of failures per driver
  int<lower=0> N[ndrivers]; // Number of races per driver
  int<lower=0> is_b_driver[ndrivers]; // Categorical Variable for A/B driver class
  int team[ndrivers]; // Categorical Variable for team
}

parameters {
  real mu;
  real<lower=0> sigma;
  real alpha[nteams]; // Unique intercept per team
  real beta_tilde;  // coefficient for driver class
}
transformed parameters {
  real beta;
  real logit_p[ndrivers];
  real p[ndrivers];
  // Linear predictor
  beta = mu + sigma * beta_tilde;
  for (i in 1:ndrivers){
    logit_p[i] = alpha[team[i]] + beta * is_b_driver[i];
  }
  p = inv_logit(logit_p);

}

model {
  // Priors
  mu ~ normal(0, 10);
  sigma ~ cauchy(0, 5);
  alpha[nteams] ~ normal(mu, sigma);
  beta ~ normal(0, 10);
  // Likelihood
  C ~ binomial(N, p);
}

generated quantities {
  real inv_logit_beta; // inverse logit of the driver-class coefficient
  inv_logit_beta = inv_logit(beta);
}


Any thoughts on my mistake would be helpful.

I figured out my issue: I didn't non-center the parameter I was actually pooling. The pooled parameter here is alpha (the per-team intercepts drawn from normal(mu, sigma)), not beta.

So I tried fixing the non-centering, but I am still having divergence issues. Did I make a mistake? Here is my new model:


data {
  int<lower=0> ndrivers; // Number of drivers
  int<lower=0> nteams; // Number of teams
  int<lower=0> C[ndrivers]; // Number of failures per driver
  int<lower=0> N[ndrivers]; // Number of races per driver
  int<lower=0> is_b_driver[ndrivers]; // Categorical Variable for A/B driver class
  int team[ndrivers]; // Categorical Variable for team
}

parameters {
  real mu;
  real<lower=0> sigma;
  real alpha_tilde[nteams]; // Unique intercept per team
  real beta;  // coefficient for driver class
}
transformed parameters {
  real alpha[nteams];
  real logit_p[ndrivers];
  real p[ndrivers];
  // Linear predictor
  for (i in 1:nteams){
    alpha[i] = mu + sigma * alpha_tilde[i];
  }
  for (i in 1:ndrivers){
    logit_p[i] = alpha[team[i]] + beta * is_b_driver[i];
  }
  p = inv_logit(logit_p);

}

model {
  // Priors
  mu ~ normal(0, 100);
  sigma ~ cauchy(0, 5);
  alpha_tilde[nteams] ~ normal(0, 100);
  beta ~ normal(0, 10);
  // Likelihood
  C ~ binomial(N, p);
}

generated quantities {
  real inv_logit_beta; // inverse logit of the driver-class coefficient
  inv_logit_beta = inv_logit(beta);
}