Hi,
I have a model in which I’m trying to use team strengths to get predict the outcome of multiple teams playing a particular game. My logistic model for estimating team scores appears to be fairly robust, but doesn’t take into account changes in team strength over time. Any advice/literature that has done this would be appreciated.

Each game consists of a number of rounds which can be won or lost by either team. I’m using a logistic regression model where: round_won ~ bernoulli_logit(team_a_strength-team_b_strength)

I think I could use a binomial and do this per match as well, but I think they should be functionally equivalent? I also specifically define my team strengths so that the mean of all team strengths is always equal to zero to remove correlation from these variables.

I then have a vector of team scores and can produce match predictions from my posterior accounting for various win conditions. This has produced reasonably good results in the past, however, the use of all historical data to calculate the team strengths is not always helpful as it can lead to overly precise calculations of strengths when a lot of data has been gathered which may be unfair to teams who have improved recently. All the match data I have are stamped with unix timestamps, so it would be possible to allow team strengths to vary with time. I have a couple of ways I think I could do this.

First I could set up the model so that older data simply contribute less to the team strength calculation… e.g.: team_strength ~ normal(current_team_strength, f(deltaT))
which is easy to set up but probably not representative of the real process, and so it’s not easy to imagine what our function of time should look like.

Alternatively, I could do this properly and set up some kind of spline model where team_strength varies as a function of time with autocorrelation. This is probably more representative of the actual underlying process and I think that weekly or fortnightly knot points make sense for a model like this. I’m less familiar with these types of models in stan and the worry I have is that there may not be enough data to make a model like this generate reasonable results, especially if the number of teams is large.

I’m sure this is something that has been tackled before but I can’t find any literature that specifically tries to do time variance with logistic regression like this. Any suggestions or pointers would be helpful to improve this model.

Just out of curiosity: what game is it? Also, perhaps you could model some kind of score instead of the number of wins? That could perhaps be more informative. (I’m just speculating here)

How big is your dataset? If the number of (unique) time points is not so large, you could try a GP. That worked reasonably well for some football (soccer) models I did. Basically, for every team you have a GP with common length scale and variance/jitter. The prior on the length scale can be informative depending on the time unit you choose (days, weeks…). The go-to GP kernel for me was always the “standard” squared exponential, which I think is reasonable in such a setting.

Hi Max,
The game in question is competitive Team Fortress 2, a far throw from football, but quite interesting to make predictions for.
I am effectively using strength to make inferences but perhaps I’m not being clear. I’m using bernoulli logit and just predicting on each round (in a football analogy this would be equivalent to something goal) without really caring which match they’re from, with the GP model this would have to change and I’d use some kind of binomial logit model to keep the time points consistent. The reason I’m doing things in this way is the win conditions to predict match results are actually quite complicated (game is separated into two halves, each of which can be won either on a time limit or rounds limit, this is easy to generate as draws from a posterior distribution on round win/loss). I additionally calculate the number of rounds in a game using a poisson distribution from round data and total game duration, although games that went to winlimit will act to censor this slightly.

The gaussian process model looks doable, although I have a few questions about this particular implementation. One thing I think I’ll have to be wary of is correlation between team strengths because the model is based on team_b_strength-team_a_strength. Currently, I just use a metric team_strengths_prime=team_strength-mean(team_strengths) but I’m not sure whether I’d apply the gaussian process to team_strengths directly or to team_strengths_prime (I think the former?)

As to the question of rounds- teams typically play three or four “halves” (which is the data I have a score like 5-3 from) around 4 to 5 nights a week. There’s been about a month of pre season and 1 week of regular season so far, with ~9 weeks to go. I think with these sort of numbers, a kernel which has a relatively slow length scale (order of weeks) is appropriate for team changes. From my experience, teams tend to see a small amount of relative change from start to end of season, with the fastest amplitude changes in the preseason where many teams are starting to form.

The code has the small distinction of using a binomial_logit rather than bernoulli logit function. I also got rid of the “a” parameter because I wasn’t sure what it did, and it seemed to be an extra source of noise in this case. Aside from that it’s pretty similar:

data {
int<lower=1> N1;
real x1[N1];
int<lower=0> M1[N1];
int<lower=0> z1[N1];
int<lower=1> N2;
real x2[N2];
}
transformed data {
real delta = 1e-9;
int<lower=1> N = N1 + N2;
real x[N];
for (n1 in 1:N1) x[n1] = x1[n1];
for (n2 in 1:N2) x[N1 + n2] = x2[n2];
}
parameters {
real<lower=0> rho;
real<lower=0> alpha;
vector[N] eta;
}
transformed parameters {
vector[N] f;
{
matrix[N, N] L_K;
matrix[N, N] K = cov_exp_quad(x, alpha, rho);
// diagonal elements
for (n in 1:N)
K[n, n] = K[n, n] + delta;
L_K = cholesky_decompose(K);
f = L_K * eta;
}
}
model {
rho ~ inv_gamma(5, 5);
alpha ~ std_normal();
eta ~ std_normal();
a ~ std_normal();
z1 ~ binomial_logit(M1,f[1:N1]);
}

If I want to calculate f as the difference in score between just two teams using historical matches, this does reasonably well (although the number of games admittedly isn’t high, so there isn’t a huge amount of change found).

The issue comes when I have different teams because the data structure isn’t something that will play nicely with Stan. Each team will have a different number of N1 games that they have participated in, so we need to make, let’s call it J different f curves each with their own set and number of x values that we need to take the difference of to get our bernoulli_logit probabilities. As far as I know, this requires initializing a different number of parameters depending on J which I’m not sure how to do. In fact, because we’re looking at the differences between f values for our logit function, we’re going to have to keep track of which x values in one f correspond to which x values in the other f which is all rather complicated. Advice on this would be greatly appreciated.