# Help with creating desired vector

Hi all,

I am working on a my own version of the paper of Baio and Blangiardo. They create a Bayesian hierarchical model on team level to determine the attack and defense parameters of each team. They use these parameters and home-advantage-parameter to forecast game scores. Visually, their layout looks as follows:

My interpretation of their model in Stan is

``````data {
int<lower=1> n_teams;
int<lower=1> n_matches;
int<lower=1, upper=n_teams> home_team[n_matches];
int<lower=1, upper=n_teams> away_team[n_matches];
int<lower=0> home_goals[n_matches];
int<lower=0> away_goals[n_matches];
}

parameters {
real home;
real mu_att;
real mu_def;
real<lower=0> tau_att;
real<lower=0> tau_def;
vector[n_teams - 1] raw_att;
vector[n_teams - 1] raw_def;
}

transformed parameters {
vector[n_teams] att;
vector[n_teams] def;
vector[n_matches] home_log_theta;
vector[n_matches] away_log_theta;

for (team in 1:(n_teams - 1)) {
att[team] = raw_att[team];
def[team] = raw_def[team];
}

att[n_teams] = -sum(raw_att);
def[n_teams] = -sum(raw_def);

home_log_theta = exp(home + att[home_team] + def[away_team]);
away_log_theta = exp(att[away_team] + def[home_team]);
}

model {
// priors
home ~ normal(0, 100);
mu_att ~ normal(0, 10);
mu_def ~ normal(0, 10);
tau_att ~ gamma(0.01, 0.01);
tau_def ~ gamma(0.01, 0.01);
raw_att ~ normal(mu_att, tau_att);
raw_def ~ normal(mu_def, tau_def);

home_goals ~ poisson(home_log_theta);
away_goals ~ poisson(away_log_theta);
}
``````

Now, I would like to make my model on a player-level. I assume that for each team:

• the attack parameter is equal to exp(sum(attack parameters of each player of team))
• the defense parameter is equal to exp(sum(defense parameters of each player of team))

Visually:

In the graph I only draw the player-lines for the home team, but the same method applies for the away team.
My code looks as follows:

``````data {
int<lower=1> n_teams;
int<lower=1> n_games;
int<lower=1> n_players
int<lower=1, upper=n_teams> home_team_id[n_games];
int<lower=1, upper=n_teams> away_team_id[n_games];
int<lower=1, upper=n_players> player_id[n_games];
int<lower=0> home_goals[n_games];
int<lower=0> away_goals[n_games];
}

parameters {
real player_att_mu;
real player_att_tau;
real player_def_mu;
real player_def_tau;
real team_att_raw[n_teams - 1];
real team_def_raw[n_teams - 1];
real player_att_raw[n_players - 1];
real player_def_raw[n_players - 1];
}

transformed parameters {
// Enforce sum-to-zero constraint
real team_att[n_teams];
real team_def[n_teams];
real player_att[n_players];
real player_def[n_players];

vector[n_games] home_expected_goals;
vector[n_games] away_expected_goals;

for (team in 1:(n_teams-1)) {
team_att[team] = team_att_raw[team];
team_def[team] = team_def_raw[team];
}

team_att[n_teams] = -sum(team_att_raw);
team_def[n_teams] = -sum(team_def_raw);

for (player in 1:(n_players-1)) {
player_att[player] = player_att_raw[player]
player_def[player] = player_def_raw[player]
}

player_att[n_players] = -sum(player_att_raw);
player_def[n_players] = -sum(player_def_raw);

for (game in 1:n_games) {
team_att[home_team_id[game]] = exp(sum(player_att[player_id[game]]));     // problem
team_def[home_team_id[game]] = exp(sum(player_def[player_id[game]]));     // problem
team_att[away_team_id[game]] = exp(sum(player_att[player_id[game]]));     // problem
team_def[away_team_id[game]] = exp(sum(player_def[player_id[game]]));     // problem

home_expected_goals[game] = exp(team_att[home_team_id[game]] + team_def[away_team_id[game]] + home_advantage);
away_expected_goals[game] = exp(team_att[away_team_id[game]] + team_def[home_team_id[game]]);
}
}

model {
// Priors (uninformative)
player_att_mu ~ normal(0, 10);
player_def_mu ~ normal(0, 10);
player_att_tau ~ gamma(0.1, 0.1);
player_def_tau ~ gamma(0.1, 0.1);

home_goals ~ poisson(home_expected_goals);
away_goals ~ poisson(away_expected_goals);
}
``````

How can I make sure that my code knows for each game which players belonged to the home team and which players belonged to the away team. I need to specify somewhere a vector with the player_ids of each team. But I donâ€™t see how this can be done. Please advice! Help is much appreciated

1 Like

I donâ€™t think that this is going to work because presumably each player typically belongs to one team, so the model canâ€™t identify their unique contributions from just the team performance.

Yes, correct I assume that each player belongs to only team (still there could be transfers in the winter period but letâ€™s ignore that). But the team selection is likely to change per match right? Due to injuries, suspensions etc. In addition, I could weight each player for % of the match he played.
This paper estimates the individual strengths from team-level results. Why would the method not work for my particular case? Thanks for brainstorming with me

Yes, the model will become more well-â€śidentifiedâ€ť the more that there is variability from game to game in which players are actually playing for a given team. Furthermore, if playing in a game is less a dichotomous played-or-didnâ€™t-play state and more a played-for-X-minutes thing, then your idea for using weights to indicate the latter is sound.

As for implementation, gimme a bit to mull on how best to achieve thatâ€¦