Dixon-Coles with over/unders

svonpfefer · January 11, 2022, 1:39pm

Hi,

I’m playing around with the Dixon-Coles football model [1] and trying to estimate its parameters given not only the final score but also the pre-match probability of over/under 2.5 goals. My hope is to eventually replace the actual score of the game with the probabilities of the home/draw/away outcomes. I’m struggling to understand why the over/under 2.5 goals input isn’t having any real effect on the outputs. Code below:

functions {
  real tau(int x, int y, real rho, real mu1, real mu2) {
    real adj;

    if (x == 0 && y == 0)
      adj = 1 - (mu1 * mu2 * rho);
    else if (x == 0 && y == 1)
      adj = 1 + (mu1 * rho);
    else if (x == 1 && y == 0)
      adj = 1 + (mu2 * rho);
    else if (x == 1 && y == 1)
      adj = 1 - rho;
    else
      adj = 1;

    return adj;
  }

  real dixon_coles_log(int[] goals, real rho, real mu1, real mu2) {
    int home;
    int away;
    real prob;

    home = goals[1];
    away = goals[2];

    prob = poisson_lpmf(home | mu1) + poisson_lpmf(away | mu2) + log(tau(home, away, rho, mu1, mu2));
    return prob;
  }
  
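  // Probability of under 2.5 goals (a total of 0, 1 or 2) under the Dixon-Coles model.
  // Note: under_prob_target is never used in the body, and the value returned
  // is a probability rather than a log density.
  real under_three_log(real under_prob_target, real mu1, real mu2, real rho) {
    real under_prob = 0;
    int goals[2];

    for (hg in 0:2) {
      for (ag in 0:2) {
        if ((hg + ag) < 3) {
          goals[1] = hg;
          goals[2] = ag;
          under_prob = under_prob + exp(dixon_coles_log(goals, rho, mu1, mu2));
        }
      }
    }
    return under_prob;
  }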
  
}

data {
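  // Pre-match probability of under 2.5 goals. Despite the _log in the name,
  // the value passed in below (0.3) is a plain probability.
  real under_prob_log;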
  int home_goals;
  int away_goals;
}

// Parameters: the home and away scoring rates (mu1, mu2)
// and the Dixon-Coles dependence parameter rho.
parameters {
  real <lower=0> mu1;
  real <lower=0> mu2;
  real rho;
}

// The model: half-normal priors on the scoring rates, a normal prior on rho,
// and the observed score modelled with the Dixon-Coles likelihood.
model {
  int score[2];
  
  mu1 ~ normal(0, 1);
  mu2 ~ normal(0, 1);
  rho ~ normal(0, 1);
  
  score[1] = home_goals;
  score[2] = away_goals;
  
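  // A sampling statement drops the _log suffix from the function name,
  // as with dixon_coles on the next line.
  under_prob_log ~ under_three(mu1, mu2, rho);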
  score ~ dixon_coles(rho, mu1, mu2);
}

generated quantities{

  real log_link;
  int score[2];
  
  score[1] = home_goals;
  score[2] = away_goals;
  
  log_link = under_three_log(under_prob_log, mu1, mu2, rho);
}

with inputs:

dat = list(under_prob_log = 0.3 , home_goals = 1, away_goals = 0)
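
As a cross-check outside Stan, here is a minimal Python sketch of the under 2.5 goals probability implied by the Dixon-Coles adjustment. The helper names (`poisson_pmf`, `dixon_coles_pmf`, `under_2_5`) and the example values of `mu1`, `mu2` and `rho` are assumptions made purely for illustration; they are not part of the model above.

```python
import math


def tau(x, y, rho, mu1, mu2):
    """Dixon-Coles low-score adjustment, mirroring the Stan tau() function."""
    if x == 0 and y == 0:
        return 1 - mu1 * mu2 * rho
    if x == 0 and y == 1:
        return 1 + mu1 * rho
    if x == 1 and y == 0:
        return 1 + mu2 * rho
    if x == 1 and y == 1:
        return 1 - rho
    return 1.0


def poisson_pmf(k, mu):
    """Poisson probability of exactly k goals with rate mu."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def dixon_coles_pmf(x, y, rho, mu1, mu2):
    """Probability of the scoreline x-y (home-away) on the natural scale."""
    return poisson_pmf(x, mu1) * poisson_pmf(y, mu2) * tau(x, y, rho, mu1, mu2)


def under_2_5(rho, mu1, mu2):
    """Sum the scorelines with at most two goals in total."""
    return sum(
        dixon_coles_pmf(hg, ag, rho, mu1, mu2)
        for hg in range(3)
        for ag in range(3)
        if hg + ag <= 2
    )


if __name__ == "__main__":
    # Hypothetical rates, chosen only to exercise the functions.
    print(under_2_5(rho=0.0, mu1=1.4, mu2=1.1))
    print(under_2_5(rho=0.1, mu1=1.4, mu2=1.1))
```

In this sketch the two printed values should agree up to floating-point error. All four adjusted scorelines (0-0, 0-1, 1-0, 1-1) total at most two goals and their adjustments cancel, so the under 2.5 probability reduces to the Poisson probability of at most two goals with rate `mu1 + mu2`, whatever the value of `rho`.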

[1] Predicting Football Results With Statistical Modelling: Dixon-Coles and Time-Weighting - dashee87.github.io

Ara_Winter · February 6, 2022, 3:54pm

Hi and welcome. Sorry for the delay. Thanks for posting the link and the data. Can you post the full model call? Please also let folks know which version of Stan you are using and whether this is being run from R, Python, etc.
