Help with performance drop when adjusting a Bradley-Terry model

Hi,

I am trying to fit a Bradley-Terry model using Stan to model team performance data.

I started off using an example here created by Bob Carpenter, which worked perfectly.

However, I am now trying to adapt this example to allow for additional covariates beyond just the team index, and so far my attempts result in a doubling of run time even with the exact same number of covariates as the original example. Am I doing something obviously wrong, or is there a way to improve the run time?

Here is my Stan model so far:


data {
  int<lower = 0> K;                     // number of covariates
  int<lower = 0> N;                     // number of matches
  matrix[N, K] M;                       // covariate differences (team A minus team B)
  int<lower = 0, upper = 1> y[N];       // 1 if team A won match n, 0 otherwise
}

parameters {
  vector[K] alpha;                      // coefficients
}

model {
  y ~ bernoulli_logit(M * alpha);
}

And for reference here is my R calling code:

team_a_mat <- model.matrix(~ team_a + covar_a, data = adat)[,-1]
team_b_mat <- model.matrix(~ team_b + covar_b, data = adat)[,-1]

stopifnot(
    nrow(team_a_mat) == nrow(team_b_mat),
    ncol(team_a_mat) == ncol(team_b_mat)
)

diff_mat <- team_a_mat - team_b_mat

fit <- sampling(
    mod,
    data = list(
        K = ncol(diff_mat),
        N = nrow(diff_mat),
        M = diff_mat,
        y = adat$team_a_result
    )
)

EDIT: Made a couple of tweaks to remove the second matrix multiplication, which sped things up a bit, but I am still seeing a 2x worsening in performance :( I have updated the numbers in the main thread accordingly.

Hey there, welcome back!

Your model looks like logistic regression. Sometimes inference can be unstable when you have perfect separation induced by the predictors (for example, if one team won every one of its matches, the unpenalized likelihood would push its coefficient toward infinity).

I think adding weakly informative priors for alpha will help greatly. Since the predictor is on the log-odds scale, something like alpha ~ normal(0, 2.5); will probably stabilize estimation while still being fairly wide in terms of plausible values.
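
For example, a minimal sketch of your model block with that prior added (only the prior line is new relative to your current model):

model {
  alpha ~ normal(0, 2.5);          // weakly informative prior on the log-odds scale
  y ~ bernoulli_logit(M * alpha);
}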

Cheers,
Max

Hey @C_G !

Just wanted to ask whether you got your model to work. Did using (weakly informative) priors help?

Cheers,
Max

Hi Max,

Apologies for the delay in replying. Indeed, you are right: at this point the model is just a different specification of logistic regression. Specifying some weakly informative priors also sped up the sampling by a marginal amount. I think in general it's fair to say that the original model specification used by Bob Carpenter is a superior implementation, but it is obviously much more limited in its formulation.

Thank you for your help!

Something else to try is the bernoulli_logit_glm likelihood (more info in the manual), which is a more efficient implementation.

So you would change your likelihood:

    y ~ bernoulli_logit(M * alpha);

to:

    y ~ bernoulli_logit_glm(M, 0, alpha);

(The 0 is because you don’t have an intercept, just coefficients and covariates.) The _glm distributions will generally sample faster and better than the explicit matrix multiplication. Also, if you have a discrete GPU and a large dataset, you can GPU-accelerate the model.
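
Putting it together with the weakly informative prior suggested earlier in the thread, the full model block would look something like this (just a sketch; the prior scale is up to you):

model {
  alpha ~ normal(0, 2.5);                 // optional weakly informative prior
  y ~ bernoulli_logit_glm(M, 0, alpha);   // 0 = no intercept
}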
