I am currently working on my postgraduate dissertation in statistics, and I am using Stan extensively for it. The dissertation involves simulating a latent position model of chess games that estimates the ratings of the players involved. The model would therefore be an alternative to the existing Elo rating system.

My raw data consists of all the matches played on Lichess during January 2024 among titled, non-BOT players. The only inputs I use from this data are the name of each unique player and the outcome of each match.

With regard to the Stan model, I have based the priors for both latent variables (the ratings and the white adjustment factor) on their respective distributions in the raw data.

Currently, my Stan model is as follows:

```
data {
  int<lower=0> P; // number of players
  int<lower=0> N_games_matrix[P, P]; // number of games matrix
  int<lower=0> Y_matrix[P, P]; // scores matrix
}

parameters {
  vector<lower=2004.86, upper=3200>[P-1] gamma_free;
  vector<lower=0.015, upper=0.075>[P] W;
}

transformed parameters {
  vector[P] gamma; // latent ratings for each player
  gamma[1] = 0; // constrain the first player's gamma to 0
  for (p in 2:P) {
    gamma[p] = gamma_free[p-1]; // assign the rest of the gammas
  }
}

model {
  // Priors
  gamma_free ~ normal(2579.787, 191.9756); // prior for the free latent ratings
  W ~ normal(0.045, 0.5); // prior for the white player advantage

  // Likelihood
  for (i in 1:P) {
    for (j in 1:P) {
      if (i != j) { // Ensure i is not equal to j
        real eta = gamma[i] - gamma[j] + W[i];
        // Debugging: Print intermediate values and target log probability
        print("i: ", i, " j: ", j, " eta: ", eta);
        print("Y_matrix[i, j]: ", Y_matrix[i, j], " N_games_matrix[i, j]: ", N_games_matrix[i, j]);
        print("target(): ", target());
        Y_matrix[i, j] ~ binomial(2 * N_games_matrix[i, j], inv_logit(eta));
        // Debugging: Print updated target log probability
        print("Updated target(): ", target());
      }
    }
  }
}
```
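
For reference, this is how I would write the model block out mathematically. It is my own transcription of the code above, so treat it as a sketch; the restricted ranges come from the declared parameter bounds.

$$
\begin{aligned}
Y_{ij} \mid \gamma, W &\sim \mathrm{Binomial}\!\left(2 N_{ij},\ \operatorname{logit}^{-1}(\gamma_i - \gamma_j + W_i)\right), \quad i \neq j, \\
\gamma_1 &= 0, \\
\gamma_p &\sim \mathrm{Normal}(2579.787,\ 191.9756) \text{ restricted to } [2004.86,\ 3200], \quad p = 2, \dots, P, \\
W_i &\sim \mathrm{Normal}(0.045,\ 0.5) \text{ restricted to } [0.015,\ 0.075], \quad i = 1, \dots, P.
\end{aligned}
$$

Here $Y_{ij}$ is `Y_matrix[i, j]`, $N_{ij}$ is `N_games_matrix[i, j]`, and the second argument of each Normal is a standard deviation.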

However, I am consistently receiving errors like this:

"Chain 2: Log probability evaluates to log(0), i.e. negative infinity.

Chain 2: Stan can't start sampling from this initial value.

Chain 2:

Chain 2: Initialization between (-2, 2) failed after 1 attempts."

In addition, when I included print statements to see exactly where the initialization fails, I get output like the following for each pair of players:

Y_matrix[i, j]: 0 N_games_matrix[i, j]: 0

target(): -inf

Updated target(): -inf

i: 948 j: 714 eta: -7.06877

Y_matrix[i, j]: 0 N_games_matrix[i, j]: 0

target(): -inf

Since N_games_matrix[i, j] and Y_matrix[i, j] are both zero for these pairs, I suspect the binomial distribution is causing the problem. Can you offer any advice on this matter?
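
To check this suspicion in isolation, a small standalone program along the following lines could be run with the fixed-parameter algorithm, since it has no parameters. It is only a sketch under my own assumptions: `eta_printed` is copied from the print output above, whereas `eta_pinned` and the score of 1 out of 2 are hypothetical values I made up to represent player 1 (whose gamma is fixed at 0) meeting a player at the lower bound of 2004.86.

```stan
transformed data {
  // eta copied from the print output for i = 948, j = 714
  real eta_printed = -7.06877;
  // Hypothetical eta: gamma[1] = 0 against a player at the lower bound
  // 2004.86, plus a white adjustment of 0.045 (made-up illustration)
  real eta_pinned = 0 - 2004.86 + 0.045;

  // Pair with no games: a score of 0 out of 0 trials
  print("no games:      ", binomial_lpmf(0 | 0, inv_logit(eta_printed)));

  // Pair with made-up data (score 1 out of 2 trials) and a success
  // probability of inv_logit(eta_pinned)
  print("pinned player: ", binomial_lpmf(1 | 2, inv_logit(eta_pinned)));
}
```

Mathematically, a score of 0 out of 0 trials has probability 1 for any success probability, so its log is 0 and such a pair should not change the target. By contrast, exp(-2004.815) underflows to 0 in double precision, so a nonzero score there would have log-probability log(0). That fits the print output, where target() is already -inf before the binomial statement for the zero-game pair is reached, although I have not confirmed that this is the actual cause.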

Any help is greatly appreciated and all the best :)

Thanks,

Patrick