Hi Lucas,
A couple of other possibilities would be TrueSkill (discussion), TrueSkill2, the Stochastic Rank Ordered Logit model, or Plackett-Luce (overview, Stan implementation, 2012 paper on Bayesian version).
One way to avoid ragged arrays is to have data in tidy format like this:
game_id,game_kind,player_id,win
1,2,1,1
1,2,2,0
1,2,3,0
1,2,4,0
2,1,1,0
2,1,2,0
2,1,3,1
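If the rows are sorted so that each game's rows sit next to each other, the first row and the number of players of every game can be read straight off the game_id column. Here is a minimal Python sketch of that bookkeeping, assuming the table above is available as CSV text. The function name game_bookkeeping and the dictionary keys are my own choices (picked to line up with the Stan data names used further down), and row numbers are 1-based because Stan indexes from 1.

```python
import csv
import io

# The example rows from above, as CSV text.
TIDY_CSV = """game_id,game_kind,player_id,win
1,2,1,1
1,2,2,0
1,2,3,0
1,2,4,0
2,1,1,0
2,1,2,0
2,1,3,1
"""


def game_bookkeeping(csv_text):
    """Derive per-game start rows (1-based) and player counts.

    Assumption: all rows of a game are contiguous. A game_id that shows
    up again later would be counted as a new game.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    game_start, n_players_in_game, game_kind = [], [], []
    previous_game = None
    for row_number, row in enumerate(rows, start=1):
        game = int(row["game_id"])
        if game != previous_game:
            # First row of a new game: remember where it starts.
            game_start.append(row_number)
            n_players_in_game.append(0)
            game_kind.append(int(row["game_kind"]))
            previous_game = game
        n_players_in_game[-1] += 1
    return {
        "n_rows": len(rows),
        "n_games": len(game_start),
        "game_id": [int(r["game_id"]) for r in rows],
        "player_id": [int(r["player_id"]) for r in rows],
        "player_won": [int(r["win"]) for r in rows],
        "game_kind": game_kind,
        "game_start": game_start,
        "n_players_in_game_j": n_players_in_game,
    }


stan_data = game_bookkeeping(TIDY_CSV)
```

For the seven example rows this gives game_start = [1, 5] and n_players_in_game_j = [4, 3]. You would still need to add n_players and n_game_kinds before handing the dictionary to Stan.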
To keep things vectorized, you can also record which row each game starts on and how many players it has, like:
data {
  // constants
  int<lower=1> n_rows;
  int<lower=1> n_players;
  int<lower=1> n_games;
  int<lower=1> n_game_kinds;
  // tidy data, one row per (game, player); each game's rows must be contiguous
  array[n_rows] int<lower=0, upper=1> player_won; // outcome
  array[n_rows] int<lower=1, upper=n_games> game_id; // which game is this row about
  array[n_rows] int<lower=1, upper=n_players> player_id; // which player is this row about
  // metadata
  array[n_games] int<lower=1, upper=n_game_kinds> game_kind; // e.g. chess, checkers, ...
  array[n_games] int<lower=1, upper=n_rows> game_start; // which row does game j start in?
  array[n_games] int<lower=2> n_players_in_game_j; // how many rows does game j comprise?
  ...
}
parameters {
  vector[n_players] latent_abilities; // whatever you're trying to model
  ...
}
model {
  ...
  latent_abilities ~ std_normal(); // prior of some kind
  for (j in 1:n_games) {
    int n_j = n_players_in_game_j[j];
    int start = game_start[j];
    int end = start + n_j - 1; // slices are inclusive, so subtract 1
    array[n_j] int players_j = player_id[start:end];
    array[n_j] int results_j = player_won[start:end];
    int game_kind_j = game_kind[j]; // game_kind is indexed by game, not by row
    vector[n_j] abilities_j = latent_abilities[players_j];
    ...
    results_j ~ your_likelihood_here(abilities_j, game_kind_j, ...);
  }
}
That’s just a sketch; I haven’t checked it for syntax or off-by-one errors, but you can generalize from there. Instead of won in {0, 1} you might have a rank in {1, 2, …} (e.g. in a horse race), a score (e.g. in a basketball game), or something more custom like the number of seconds behind the first-place finisher (e.g. in a footrace).
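For the rank case, one candidate likelihood is the Plackett-Luce model mentioned at the top: the winner is drawn with probability proportional to exp(ability) among all players in the game, then the runner-up among those remaining, and so on. Below is a rough Python sketch of the per-game log-likelihood, meant only as something to check a Stan implementation against numerically. The function name and argument layout are my own assumptions, not taken from any library, and ties are not handled.

```python
import math


def plackett_luce_log_lik(abilities_in_finish_order):
    """Log-probability of one observed finishing order under Plackett-Luce.

    abilities_in_finish_order[0] is the latent ability of the player who
    finished 1st, [1] that of the runner-up, and so on.
    """
    log_lik = 0.0
    # The last finisher is picked with probability 1, so stop one short.
    for position in range(len(abilities_in_finish_order) - 1):
        remaining = abilities_in_finish_order[position:]
        # log-sum-exp over the players not yet placed, shifted for stability
        top = max(remaining)
        log_norm = top + math.log(sum(math.exp(a - top) for a in remaining))
        log_lik += abilities_in_finish_order[position] - log_norm
    return log_lik
```

With three equally able players every finishing order is equally likely, so plackett_luce_log_lik([0.0, 0.0, 0.0]) equals log(1/6). With two players it reduces to the log of the logistic function of the ability difference, which is the Bradley-Terry win probability.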