Dependent beta parameters

Hi all,

This is the first hierarchical model I’ve attempted, so please pardon my inexperience. I am trying to model sports outcomes in order to recover the latent strength of the participating teams. Each game is specified by the home team, the away team, and a binary indicator that equals 1 when the home team wins and 0 when the away team wins. Conditional on win_prob, I assume the outcomes are Bernoulli(win_prob), where win_prob is Beta(strength[i], strength[j]) with i the home team and j the away team. (As a quick-and-dirty solution, I put a Gamma prior on the strengths, centered at gamma_mean with a controllable spread, so as to make it nearly symmetric.)
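To make the setup concrete, here is a sketch of the generative process I have in mind (plain numpy, with made-up values for teams, gamma_mean, and spread):

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up hyperparameters, just for illustration.
teams, gamma_mean, spread = 4, 1.0, 2.0

# Latent strengths: Gamma(shape = spread, rate = spread / gamma_mean),
# i.e. numpy scale = gamma_mean / spread, so E[strength] = gamma_mean.
strength = rng.gamma(shape=spread, scale=gamma_mean / spread, size=teams)

def simulate_game(i, j):
    """Simulate one game with home team i and away team j (0-based)."""
    win_prob = rng.beta(strength[i], strength[j])
    return rng.binomial(1, win_prob)  # 1 = home win, 0 = away win

home_win = simulate_game(0, 1)
```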

I would like the win_prob matrix to be symmetric, with one triangle determined by the other: if team 1 has a 70% chance of beating team 2, then team 2 should have a 1 - 0.7 = 30% chance of beating team 1. However, I have been unable to specify the model so that one triangle of the matrix is Beta(-,-) and the other is 1 - (the value sampled from that Beta(-,-)). After many failed attempts and many Google searches, I settled on the temporary workaround of defining the reverse probabilities simply as independent Beta random variables with the strength parameters swapped, which causes the model to have low n_eff, Rhat > 1, and other diagnostics that fail the eye check. Here is the .stan file:
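The workaround at least has the right marginals: if X ~ Beta(a, b), then 1 - X ~ Beta(b, a), so swapping the strength parameters gives each entry the correct distribution. What it loses is the deterministic link between win_prob[i, j] and win_prob[j, i]. A quick numerical check of that identity (numpy, arbitrary shape values):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 2.5, 1.5  # arbitrary shape parameters

x = rng.beta(a, b, size=200_000)  # X ~ Beta(a, b)
y = rng.beta(b, a, size=200_000)  # Y ~ Beta(b, a)

# 1 - X and Y should agree in distribution; compare the first moments.
print(np.mean(1 - x), np.mean(y))  # both close to b / (a + b) = 0.375
```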

data { 
  int<lower=0> teams;
  int<lower=0> N;
  
  int<lower=0,upper=1> home_win[N];
  int<lower=1,upper=teams> home_team[N];
  int<lower=1,upper=teams> away_team[N];
  
  real<lower=0> gamma_mean;
  real<lower=0> spread;
} 

parameters {
  vector<lower=0.1>[teams] strength;
  matrix<lower=0,upper=1>[teams, teams] win_prob;
} 

model {
  for (team in 1:teams) {
    strength[team] ~ gamma(spread, spread/gamma_mean);
  }
  
  for (i in 1:teams-1) {
    for (j in i+1:teams) {
      win_prob[i, j] ~ beta(strength[i], strength[j]);
      win_prob[j, i] ~ beta(strength[j], strength[i]);
    }
  }
  
  for (n in 1:N) {
    home_win[n] ~ bernoulli(win_prob[home_team[n], away_team[n]]);
  }
}

Data would crudely look like:

home_win  home_team  away_team
1         1          7
0         1          5
0         1          26

Here team 1 beats 7 at home, but loses to 5 and 26 at home.

One of my attempts was to define a transformed parameter, det_win_prob (det for deterministic), in the following way. However, it caused the stan() call to run forever without producing output:

transformed parameters {
  matrix<lower=0,upper=1>[teams, teams] det_win_prob;
  
  for (i in 1:teams) {
    for (j in 1:teams) {
      if (i < j) {
        det_win_prob[i, j] = win_prob[i, j];
      } else if (i > j) {
        det_win_prob[j, i] = 1 - win_prob[i, j];
      }
    }
  }
}

Thank you for taking the time to read this, and I apologize if it is trivial.

Hi Liam! Welcome to the Stan forums! :)

I think there are two possible ways to improve the model. I think the first one is necessary and the second one is optional.

  1. Right now you declare win_prob as a \text{teams} \times \text{teams} matrix. In the model block, however, you never assign a distribution to the diagonal elements of win_prob. Conceptually that makes sense, but given the declared bounds Stan will treat those entries as implicitly \text{Uniform}(0, 1), which can be tricky. (Fundamentally, I think this is also the problem in the second approach.) I’d suggest declaring only the \text{teams}^2 - \text{teams} off-diagonal entries (subtracting the diagonal), or better just the \text{teams}(\text{teams} - 1)/2 below-diagonal entries, as unique parameters in the parameters block, and then building the win_prob (or det_win_prob) matrix in the transformed parameters block, padding out the diagonal with some arbitrary number.
  2. The hyperparameters are gamma_mean and spread, so you might want to simulate from specific values and see what your choices imply. Some combinations can lead to highly bimodal priors, which is not necessarily bad, but probably isn’t helping either.
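For example, a minimal prior-predictive sketch (numpy; the particular gamma_mean and spread values are only illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

def prior_win_prob(gamma_mean, spread, draws=100_000):
    """Draw win_prob for one pairing under the Gamma-then-Beta prior."""
    s_i = rng.gamma(shape=spread, scale=gamma_mean / spread, size=draws)
    s_j = rng.gamma(shape=spread, scale=gamma_mean / spread, size=draws)
    return rng.beta(s_i, s_j)

# With a small spread, the strengths often fall well below 1, so the
# implied Beta is U-shaped and win_prob piles up near 0 and 1.
p = prior_win_prob(gamma_mean=1.0, spread=0.5)
print(np.mean((p < 0.1) | (p > 0.9)))  # share of prior mass in the tails
```

Comparing this tail mass across a grid of spread values quickly shows which settings imply a bimodal prior on the win probabilities.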

Cheers!
Max


Hi Max, thanks for your quick answer.

Edit: solved my bookkeeping issue relating to indices.

Nice! Do you want to share your solution? It could be helpful for others. :)

Of course. My solution is the following.

I added a function, sum_1_to_n, to compute teams*(teams-1)/2 without the parser warnings about integer division (and the errors I got using the equivalent expression sum(1:(teams-1)) in the constraint of unique_win_prob). With that in place, I could define unique_win_prob as a vector in which order matters, since it is later unpacked into the matrix win_prob. I used the same loop scheme and running index in both places so the bookkeeping stays consistent, wrapping each set of loops in extra curly braces so I could declare idx, a temporary local index variable.
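The bookkeeping can be sanity-checked outside Stan. Here is the same column-wise unpacking scheme in plain Python (the unique_win_prob values are just stand-ins), verifying that every off-diagonal pair satisfies win_prob[i, j] + win_prob[j, i] = 1 and that exactly teams*(teams-1)/2 unique values are consumed:

```python
import numpy as np

teams = 4
n_unique = teams * (teams - 1) // 2  # what sum_1_to_n(teams - 1) computes
unique_win_prob = np.linspace(0.1, 0.9, n_unique)  # stand-in values

win_prob = np.zeros((teams, teams))  # diagonal stays 0, as in the Stan code
idx = 0                              # 0-based here; Stan's idx starts at 1
for j in range(teams):               # Stan: for (j in 1:teams)
    for i in range(j):               # Stan: for (i in 1:(j-1))
        win_prob[i, j] = unique_win_prob[idx]
        win_prob[j, i] = 1 - unique_win_prob[idx]
        idx += 1

assert idx == n_unique  # every unique pairing consumed exactly once
```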

This boosted my n_eff and makes more sense, but n_eff is still low. The Rhats are very close to 1 with enough iterations, although the Rhats of the diagonal win_prob entries are NA (they are constants), which throws a semi-annoying warning. On to more tweaking to boost n_eff!

functions {
  
  int sum_1_to_n(int n) {
    int result = 0;
    for (i in 1:n) {
      result += i;
    }
    return result;
  }
  
}

data { 
  int<lower=0> teams;
  int<lower=0> N;
  
  int<lower=0,upper=1> home_win[N];
  int<lower=1,upper=teams> home_team[N];
  int<lower=1,upper=teams> away_team[N];
  
  real<lower=0> gamma_mean;
  real<lower=0> spread;
} 

transformed data {
  int<lower=0> unique_pairings = sum_1_to_n(teams-1);
}

parameters {
  vector<lower=0.1>[teams] strength;
  vector<lower=0,upper=1>[unique_pairings] unique_win_prob;
} 

transformed parameters {
  matrix<lower=0,upper=1>[teams, teams] win_prob;
  {
    int idx = 1;
    for (j in 1:teams) {
      win_prob[j, j] = 0;
      for (i in 1:(j-1)) {
        win_prob[i, j] = unique_win_prob[idx];
        win_prob[j, i] = 1 - unique_win_prob[idx];
        idx = idx + 1;
      }
    }    
  }

}

model {
  for (team in 1:teams) {
    strength[team] ~ gamma(spread, spread/gamma_mean);
  }
  
  {
    int idx = 1;
    for (j in 1:teams) {
      for (i in 1:(j-1)) {
        unique_win_prob[idx] ~ beta(strength[i], strength[j]);
        idx = idx + 1;
      }
    }    
  }

  for (n in 1:N) {
    home_win[n] ~ bernoulli(win_prob[home_team[n], away_team[n]]);
  }
}