Help with performance drop when adjusting a Bradley-Terry model

Hi,

I am trying to fit a Bradley-Terry model using Stan to model team performance data.

I started off using an example here created by Bob Carpenter, which worked perfectly.

However, I am now trying to adapt this example to allow for additional covariates beyond just the team index, and so far my attempts result in a doubling of run time even with the exact same number of covariates as the original example. Am I doing something obviously wrong, or is there a way to improve the run time?

Here is my Stan model so far:


data {
  int<lower = 0> K;                     // number of covariates
  int<lower = 0> N;                     // number of matches
  matrix[N, K] M;                       // covariate differences (team A minus team B)
  int<lower = 0, upper = 1> y[N];       // 1 if team A won match n, 0 otherwise
}

parameters {
  vector[K] alpha;                      // coefficients
}

model {
  y ~ bernoulli_logit(M * alpha);
}

And for reference here is my R calling code:

team_a_mat <- model.matrix(~ team_a + covar_a, data = adat)[,-1]
team_b_mat <- model.matrix(~ team_b + covar_b, data = adat)[,-1]

stopifnot(
    nrow(team_a_mat) == nrow(team_b_mat),
    ncol(team_a_mat) == ncol(team_b_mat)
)

diff_mat <- team_a_mat - team_b_mat

fit <- sampling(
    mod,
    data = list(
        K = ncol(diff_mat),
        N = nrow(diff_mat),
        M = diff_mat,
        y = adat$team_a_result
    )
)

EDIT: Made a couple of tweaks to remove the second matrix multiplication, which sped things up a bit, but I am still seeing a 2x worsening in performance :( I have updated the numbers in the main thread accordingly.

Hey there, welcome back!

Your model looks like logistic regression. Sometimes inference can be unstable when you have perfect separation induced by the predictors (for example, if one team won every one of its matches, the unpenalized likelihood would push its coefficient toward infinity).

I think adding weakly informative priors for alpha will help greatly. Since the predictor is on the log-odds scale, something like alpha ~ normal(0, 2.5); will probably stabilize estimation while still being fairly wide in terms of plausible values.
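
For example, a minimal sketch of your model block with that prior added (only the prior line is new relative to your current model):

model {
  alpha ~ normal(0, 2.5);          // weakly informative prior on the log-odds scale
  y ~ bernoulli_logit(M * alpha);
}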

Cheers,
Max

Hey @C_G !

Just wanted to ask whether you got your model to work. Did using (weakly informative) priors help?

Cheers,
Max

Hi Max,

Apologies for the delay in replying. Indeed, you are right: at this point the model is just a different specification of logistic regression. Specifying some weakly informative priors also sped up the sampling by a marginal amount. I think in general it's fair to say that the original model specification used by Bob Carpenter is a superior implementation, but it is obviously much more limited in its formulation.

Thank you for your help!

Something else to try is the bernoulli_logit_glm likelihood (more info in the manual), which is a more efficient implementation.

So you would change your likelihood:

    y ~ bernoulli_logit(M * alpha);

to:

    y ~ bernoulli_logit_glm(M, 0, alpha);

(The 0 is because you don’t have an intercept, just coefficients and covariates.) The _glm distributions will generally sample faster and better than the explicit matrix multiplication. Also, if you have a discrete GPU and a large dataset, you can GPU-accelerate the model.
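
Putting it together with the weakly informative prior suggested earlier in the thread, the full model block would look something like this (just a sketch; the prior scale is up to you):

model {
  alpha ~ normal(0, 2.5);                 // optional weakly informative prior
  y ~ bernoulli_logit_glm(M, 0, alpha);   // 0 = no intercept
}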
