# Hierarchical Binomial Model cannot initialise some chains when N_SUBJ > x

Hi there,
I have been trying to fit a hierarchical binomial model on the lottery/surebet choices by rats given the lottery offer. I started with a simple non-hierachical version and synthetic data, it works just fine. I then progressed to a species-level model. Interestingly, this model works well when `N_SUBJ < 8`, meaning all the 4 chains with default parameters run and give no divergences. However, when I set `N_SUBJ >= 8` in the synthetic dataset, only 1 out of 4 chains would run and the others gave this typical error:

``````Chain 4:   Log probability evaluates to log(0), i.e. negative infinity.
Chain 4:   Stan can't start sampling from this initial value.
Chain 4: Initialization between (-2, 2) failed after 100 attempts.
``````

And the remaining one chain that did initialise would often end with 100% divergences.
When I ramp up `N_SUBJ` to be a large value, say 20, then none of the chains would initialise…
Note that previously I have completed a `bernoulli_logit` version of the same model and it works perfectly fine on all levels, but I want to speed up computation by using `binoimal` if possible due to large data size.
Here’s the model, any help is appreciated!

``````// hierarchical model fitting one species as beta-rho risky agents with binomial
functions {
real choiceprob(real rho, real beta, real lott_value, real lott_prob, real sb_value, real rew_multi) {
real y;
real u1;	// Lottery utility
real u2;	// Surebet utility

u1 = lott_prob * (rew_multi * lott_value) ^ rho;
u2 = (rew_multi * sb_value) ^ rho;
y = 1 / (1 + exp(-beta * (u1 - u2)));
return y;
}
}
data {
int<lower=0> T; // Number of trials types
int<lower=0> K; // number of subjects in each species
int individual[T]; // vector of subjid indexes
vector[T] lott_value; // Lottery value for that choice
vector[T] lott_prob; // Lottery probabilities for that choice
vector[T] sb_value; // Surebet values for that choice
vector[T] total_rew_multi; // total reward multiplier = base_reward * rew_multi
int n_chose_lott[T]; // number chose lottery for this trial type
int n_trials[T]; // total number of trials for this trial type
}
parameters {
real log_rho_s; // species-level rho
real log_beta_s; // species-level beta
real<lower=0> log_sigma_rho; // standard deviation for species rho
real<lower=0> log_sigma_beta; // standard deviation for species beta
vector[K] log_rho_i; // individual-level rho
vector[K] log_beta_i; // individual-level beta
}
transformed parameters{
real rho_s;
real beta_s;
vector[K] rho_i;
vector[K] beta_i;
rho_s = exp(log_rho_s);
beta_s = exp(log_beta_s);
rho_i = exp(log_rho_i);
beta_i = exp(log_beta_i);
}
model {
// set weak priors
log_rho_s ~ std_normal();
log_beta_s ~ std_normal();

// draw individual parameters from the species distribution
log_rho_i ~ normal(log_rho_s, log_sigma_rho);
log_beta_i ~ normal(log_beta_s, log_sigma_beta);

// fit the actual model to each trial
for(t in 1:T){
n_chose_lott[t] ~ binomial(n_trials[t], choiceprob(rho_i[individual[t]], beta_i[individual[t]], lott_value[t], lott_prob[t], sb_value[t], total_rew_multi[t]));
}
}
``````

And some example rows of synthetic data file, note that each individual / subject only has 6 rows of data here (6 different lottery offers).

``````        individual   lott_value lott_prob sb_value total_rew_multi n_chose_lott n_trials
1               1          0       0.5        3               1           34      327
2               1          2       0.5        3               1           65      327
3               1          4       0.5        3               1          135      332
4               1          8       0.5        3               1          192      302
5               1         16       0.5        3               1          344      352
6               1         32       0.5        3               1          359      359
7               2          0       0.5        3               1           70      342
8               2          2       0.5        3               1           83      307
9               2          4       0.5        3               1          129      345
10              2          8       0.5        3               1          192      360
11              2         16       0.5        3               1          262      336
12              2         32       0.5        3               1          293      309
13              3          0       0.5        3               1           83      341
``````

Some things I have tried but did not work:

1. By using the `shinystan` interface I found out that when the chains have high divergences, the `stepsize ` is often a ridiculously small number compared to what `stepsize` is when the model runs fine (~0.35). So I added `stepsize = 0.35` in setting, did not help.
2. Increasing `adapt_delta` even to 0.99 did not help at all.
3. Increasing both `adapt_delta` and set `stepsize` did not help.

Something even more interesting:

1. `N_SUBJ = 8` :only 1 chain runs
2. `N_SUBJ = 9`: 2 chains run
3. `N_SUBJ = 10`: 0 chains run
4. `N_SUBJ = 11`: 1 chain run
5. `N_SUBJ > 11`: 0 chains run

When you have initialization problems the first thing to try is to set `init="0"`.

Looks like `lott_value` can be high and it could be squared (or more) when calculating the choice probability. If `beta` is not very small the probability can easily overflow and round to 1.
The calculation is more stable if you use `binomial_logit` instead of `binomial`.

``````real choiceprob_logit(real rho, real beta, real lott_value, real lott_prob, real sb_value, real rew_multi) {
real y;
real u1;	// Lottery utility
real u2;	// Surebet utility

u1 = lott_prob * (rew_multi * lott_value) ^ rho;
u2 = (rew_multi * sb_value) ^ rho;
return beta * (u1 - u2)));
}
...
for(t in 1:T){
n_chose_lott[t] ~ binomial_logit(n_trials[t],
choiceprob_logit(rho_i[individual[t]], beta_i[individual[t]],
lott_value[t], lott_prob[t], sb_value[t],
total_rew_multi[t]));
}
``````