I have a hierarchical model on pairwise comparisons (tennis players). I’m running it on a small example now–there are J=50 players who played against each other in N=4411 matches. I add p=3 covariates to the model. Even on this small example it takes a long time to run (several hours), and this is really problematic when I’d like to get a version working for J=1200 players and N=25000 matches. The model is:
data {
int<lower=0> J; // total number of players
int<lower=0> N; // number of matches
int<lower=0> p; // the number of covariates for each player (besides the skill parameters)
matrix[N, 3*J] X_skill; // design matrix for the skills
matrix[N, p] X1; // characteristic player 1
matrix[N, p] X2; // characteristic player 2
int<lower=2, upper=3> winner_sets[N]; // number of sets won by the winner for match n
int<lower=2, upper=5> total_sets[N]; // number of total sets for match n
}
parameters {
vector[3*J] mu; // means of the player skills
vector<lower=0>[3*J] sigma; // SDs of the player skills
vector[p] beta; // coefficients on player characteristics
vector[3*J] alpha; // player skills on hard, clay, and grass
}
model {
// priors
mu ~ double_exponential(0, 1);
beta ~ normal(0, 1);
alpha ~ normal(mu, sigma);
// sample
winner_sets ~ binomial_logit(total_sets, (X1 * beta)  (X2 * beta) + (X_skill * alpha));
}
So I am using a uniform, improper prior over sigma
. (SEE EDIT BELOW)
My code in the sampling statement is vectorized, and it runs fine. It just takes too long to run. The data are:

X_skill
: design_matrix_working_example.csv (1.3 MB) 
X1
:
player1_attributes_working_example.csv (106.6 KB) 
X2
:
player2_attributes_working_example.csv (106.4 KB) 
winner_sets
:
winner_sets_working_example.csv (8.6 KB) 
total_sets
:
total_sets_working_example.csv (8.6 KB)
Thank you for any help you can provide!
EDIT: Okay, I just ran the model with 4 chains and iter=1000
. It finished in a somewhat reasonable 45 minutes, but there were “1982 divergent transitions after warmup”. I looked at the following traceplot:
It seems the improper prior on sigma
is causing the chains to be extremely sticky. I’m going to try to use a HalfCauchy prior instead with location 0 and scale 1. I hope my understanding is right, and I’ll report back with how goes.