I am building a Bradley-Terry model to infer tennis player skills. I assume each player has a skill vector that is trivariate normal with mean 0 and a covariance matrix that depends on the time since the player last played a match. I use a trivariate normal because there are three court surfaces in tennis: hard, clay, and grass. My hypothesis for the covariance is that if a player hasn't played in a while, there is more uncertainty in their skill. This is similar to the idea behind Glicko.
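To illustrate the intuition (this is a standalone numpy sketch, not part of the Stan model; the names `tau`, `beta`, `R`, and `months_idle` are illustrative stand-ins for my `tau`, `beta`, `O`, and `X_skill_time`):

```python
import numpy as np

def skill_cov(tau, beta, R, months_idle):
    """Covariance of a player's 3-surface skill vector.

    tau: baseline scale per surface (length 3)
    beta: rate at which uncertainty grows per month idle
    R: 3x3 correlation matrix across surfaces
    months_idle: months since the player's last match
    """
    scale = np.asarray(tau) + beta**2 * months_idle  # inflated scales
    D = np.diag(scale)
    return D @ R @ D  # same as Stan's quad_form_diag(R, scale)

R = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.4],
              [0.3, 0.4, 1.0]])
cov0 = skill_cov([1.0, 1.0, 1.0], 0.2, R, 0)  # just played: recovers R
cov6 = skill_cov([1.0, 1.0, 1.0], 0.2, R, 6)  # idle 6 months: larger variances
```

The diagonal of the covariance grows with idle time, so a long layoff means a more diffuse skill prior.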
I am using a negative binomial likelihood on the set counts, since in tennis you must win either 2 or 3 sets to win the match, depending on whether it is a best-of-3 or best-of-5 match.
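As a quick sanity check on that parameterization (a standalone sketch, not part of the Stan model): if `p` is the winner's per-set win probability and `r` is the number of sets needed, the probability of losing exactly `y` sets before clinching is the negative binomial pmf, and summing over the allowed `y` gives the match-win probability.

```python
from math import comb

def negbin_pmf(y, p, r):
    """P(lose exactly y sets before taking the r-th set win)."""
    return comb(y + r - 1, y) * p**r * (1 - p)**y

# Best of 5: the winner needs r = 3 sets and can lose y = 0, 1, or 2.
p = 0.5
match_win_prob = sum(negbin_pmf(y, p, 3) for y in range(3))
# With p = 0.5 this comes out to exactly 0.5, as symmetry requires.
```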
The model is:
functions {
  // Log-PMF of the negative binomial distribution: y failures before
  // the r-th success, with per-trial success probability p.
  // Vectorized version.
  real negbin_vec_lpmf(int[] y, vector p, int[] r) {
    int n = num_elements(y);
    vector[n] y_ = to_vector(y);
    vector[n] r_ = to_vector(r);
    return sum(lchoose(y_ + r_ - 1, y_) + r_ .* log(p) + y_ .* log1m(p));
  }
  // Element-by-element version.
  real negbin_lpmf(int y, real p, int r) {
    return lchoose(y + r - 1, y) + r * log(p) + y * log1m(p);
  }
}
data {
  int<lower=0> J;                   // number of players
  int<lower=0> N_train;             // number of matches
  matrix[N_train, J] z;             // indicator matrix: z[i, j] = 1 if player j participates in match i
  matrix[N_train, 3*J] X_skill;     // design matrix for player skills
  matrix[N_train, J] X_skill_time;  // months between each player's last event and the current event
  int total_n[N_train];             // 3 or 5: whether the match was best of 3 or best of 5 sets
  int winner_y[N_train];            // number of sets LOST by the winner of the match
}
transformed data {
  vector[3] Zero = rep_vector(0, 3);
  // Number of set wins required: 2 for a best-of-3 match, 3 for a best-of-5 match
  int<lower=2, upper=3> r[N_train];
  for (i in 1:N_train) {
    if (total_n[i] == 3) {
      r[i] = 2;
    } else {
      r[i] = 3;
    }
  }
}
parameters {
  cholesky_factor_corr[3] Lcorr0;
  row_vector[3] alpha[J];  // player skills over the rating periods
  real<lower=0> beta;      // how fast uncertainty grows as a player misses time
  vector<lower=0>[3] tau;
}
transformed parameters {
  // 3x3 correlation and covariance matrices
  corr_matrix[3] O = multiply_lower_tri_self_transpose(Lcorr0);
  cov_matrix[3] Omega = quad_form_diag(O, tau);
}
model {
  // Priors
  Lcorr0 ~ lkj_corr_cholesky(1);
  tau ~ cauchy(0, 25);
  beta ~ normal(0, 25);
  // Build up covariance matrices
  for (i in 2:(N_train + 1)) {
    for (j in 1:J) {
      if (z[i-1, j] == 1 && X_skill_time[i-1, j] > 0) {
        // Time-varying covariance for the two players participating in match i-1
        alpha[j] ~ multi_normal(Zero, quad_form_diag(O, tau + beta^2 * X_skill_time[i-1, j]));
      } else {
        // Otherwise use the constant covariance
        alpha[j] ~ multi_normal(Zero, Omega);
      }
    }
  }
  // Likelihood: build up the set win probability for each match
  vector[N_train] s = inv_logit(X_skill * to_vector(to_matrix(alpha)));
  target += negbin_vec_lpmf(winner_y | s, r);
}
When I try to fit the model, all transitions diverge, Rhat is bad, ESS is bad; basically something is going wrong, and I'm not quite sure what to do. Note that if I instead assume the covariance is constant and the same for all players, i.e. I specify alpha ~ multi_normal(Zero, Omega);, the model fits fast, the estimates are reasonable, the diagnostics are good, and everything is fine. So I'm not sure what's going wrong with my "time-dependent" covariance here.
I've attached CSVs of z, X_skill, X_skill_time, total_n, and winner_y so you can run the model yourself. For these data, J = 15 and N_train = 114.
Thanks for any help!
total_n.csv (232 Bytes)
winner_y.csv (232 Bytes)
X_skill_time.csv (3.5 KB)
X_skill.csv (11.2 KB)
z.csv (3.8 KB)