This makes sense, but it's not 100% clear to me how to implement it in practice.
I’ve implemented the football/soccer model by Baio and Blangiardo (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.182.8659&rep=rep1&type=pdf), and my current Stan code is:
data {
  int<lower=2> nteams;
  int<lower=1> ngames;
  array[ngames] int<lower=1, upper=nteams> home_team;
  array[ngames] int<lower=1, upper=nteams> away_team;
  array[ngames] int<lower=0> home_goals;
  array[ngames] int<lower=0> away_goals;
}
parameters {
  real home;                 // home advantage
  real mu_att;
  real mu_def;
  real<lower=0> tau_att;     // precisions, so they must be positive
  real<lower=0> tau_def;
  vector[nteams - 1] att_free;
  vector[nteams - 1] def_free;
}
transformed parameters {
  // need to make sum(att) = sum(def) = 0:
  // the last team's effect is minus the sum of the free ones
  vector[nteams] att = append_row(att_free, -sum(att_free));
  vector[nteams] def = append_row(def_free, -sum(def_free));
  vector[ngames] log_theta_home = home + att[home_team] + def[away_team];
  vector[ngames] log_theta_away = att[away_team] + def[home_team];
}
model {
  home ~ normal(0, 10000);
  mu_att ~ normal(0, 10000);
  mu_def ~ normal(0, 10000);
  tau_att ~ gamma(0.1, 0.1);
  tau_def ~ gamma(0.1, 0.1);
  // tau is a precision, but Stan's normal() takes a standard deviation,
  // so the scale is 1 / sqrt(tau), not 1 / tau
  att_free ~ normal(mu_att, inv_sqrt(tau_att));
  def_free ~ normal(mu_def, inv_sqrt(tau_def));
  home_goals ~ poisson_log(log_theta_home);
  away_goals ~ poisson_log(log_theta_away);
}
The reason I want to update the model rather than refit it is that, when testing its performance, I have simply kept growing the dataset and re-running the whole model on all of it as more games come in.
The parameters I’m specifically interested in are att, def and home.
If the posterior distribution of the parameters after seeing a first set of games X is approximately multivariate normal, how do I adjust this Stan code so that I can obtain the new posterior distribution by running it on only the next set of games X’? So far, I’ve just been re-running the code on X+X’ whenever new games come in.
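To make the question concrete, here is a rough, untested sketch of what I imagine the updated model might look like. It assumes that I take the posterior draws of (home, att_free, def_free) from the fit on X, stack them in that order, compute their sample mean and covariance outside Stan, and pass those in as data. The names prior_mean, prior_cov, prior_chol and theta are my own placeholders and do not come from the paper. The sketch also assumes the set of teams does not change between X and X’, and it drops mu_att, mu_def, tau_att and tau_def, since the multivariate normal prior replaces the hierarchical one.

```stan
data {
  int<lower=2> nteams;
  int<lower=0> ngames;                                  // games in the new batch X' only
  array[ngames] int<lower=1, upper=nteams> home_team;
  array[ngames] int<lower=1, upper=nteams> away_team;
  array[ngames] int<lower=0> home_goals;
  array[ngames] int<lower=0> away_goals;
  // Hypothetical inputs: mean and covariance of the previous posterior draws
  // of (home, att_free, def_free), stacked in that order.
  vector[2 * nteams - 1] prior_mean;
  cov_matrix[2 * nteams - 1] prior_cov;
}
transformed data {
  // Factor the covariance once so it is not re-decomposed at every iteration.
  matrix[2 * nteams - 1, 2 * nteams - 1] prior_chol = cholesky_decompose(prior_cov);
}
parameters {
  vector[2 * nteams - 1] theta;                         // (home, att_free, def_free)
}
transformed parameters {
  real home = theta[1];
  vector[nteams - 1] att_free = theta[2:nteams];
  vector[nteams - 1] def_free = theta[(nteams + 1):(2 * nteams - 1)];
  // Same sum-to-zero construction as in the original model.
  vector[nteams] att = append_row(att_free, -sum(att_free));
  vector[nteams] def = append_row(def_free, -sum(def_free));
}
model {
  // Previous posterior, approximated as multivariate normal, used as the prior.
  theta ~ multi_normal_cholesky(prior_mean, prior_chol);
  home_goals ~ poisson_log(home + att[home_team] + def[away_team]);
  away_goals ~ poisson_log(att[away_team] + def[home_team]);
}
```

I would then repeat this after each batch, feeding the mean and covariance of the new theta draws back in as prior_mean and prior_cov. I am not sure whether the approximation error from the normal assumption builds up over many such updates, or whether throwing away the hyperparameters like this is reasonable.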