# RuntimeError: Goal Soccer Model

Hello,

I’m trying to build a model that estimates the probability of scoring a goal in a soccer match. My data record who the kicker is (player_id), who the goalkeeper is (glk_id), which zone the kicker shoots from (cat_zone), the time frame in which the shot is taken (timeFrame), the current result for the kicker’s team (cat_res), and whether the kicker’s team is playing home or away (localia).

I think the best option is using a bernoulli_logit.

The model is as follows:

``````goals_model = """
data {
int<lower=0> N; // number of observations 8451
int players; // number of players 426
int glk; // number of goalkeepers 38
int zones; // number of field zones 8
int time; // number of time frames 7
int res; // types of results (winning, losing, tying)
int loc; // localia

vector[N] player_id;
vector[N] glk_id;
vector[N] cat_zone;
vector[N] timeFrame;
vector[N] cat_res;
vector[N] localia;

int goal[N]; // dependent variable
}
parameters {

real alpha; // intercept

vector[players] beta_player; // coefficient associated with each player
vector[glk] beta_glk; // coefficient associated with each goalkeeper
vector[zones] beta_zones; // coefficient associated with each zone
vector[time] beta_time; // coefficient associated with each time frame
vector[res] beta_res; // coefficient associated with each result
vector[loc] beta_loc; // coefficient associated with each type of localia

real epsilon; //Uncertainty / unexplained variance
}
model {
// priors
alpha ~ normal(0,1);
beta_player ~ normal(0,1);
beta_glk ~ normal(0,1);
beta_zones ~ normal(0,1);
beta_time ~ normal(0,1);
beta_res ~ normal(0,1);
beta_loc ~ normal(0,1);

goal ~ bernoulli_logit(alpha + beta_player .* player_id + beta_glk .* glk_id +
beta_zones .* cat_zone + beta_time .* timeFrame + beta_res .* cat_res + beta_loc .* localia);

}
"""
``````

Then, I ran the following code:

goal_reg = pystan.model.StanModel(model_code=goals_model, model_name='goal_reg')

But when I then try this:

lin_fit = goal_reg.sampling(data=datos,
iter=1000, chains=4,
warmup=500, n_jobs=-1,
seed=42)

I get this error:

"RuntimeError: Exception: elt_multiply: Rows of m1 (2) and rows of m2 (8451) must match in size (in 'unknown file name' at line 43)"

I would appreciate it if someone could help me with this. It is for my final university project.

The error points to line 43, which has expressions like `beta_player .* player_id`. That is an element-wise multiplication of two vectors with different sizes (`beta_player` has `players` elements, while `player_id` has `N`). But you shouldn’t multiply by the player ID at all. Instead, the ID should be used as an index into the `beta_player` vector.
Try this code:

``````data {
...
int<lower=1,upper=players> player_id[N];
int<lower=1,upper=glk> glk_id[N];
int<lower=1,upper=zones> cat_zone[N];
....
}
...
model {
...
goal ~ bernoulli_logit(alpha + beta_player[player_id] + beta_glk[glk_id] + ...);
}``````
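
For intuition, the same distinction can be sketched in NumPy, where indexing a coefficient array with an array of IDs picks one coefficient per observation. The values below are made up for illustration, and note that NumPy indices are 0-based, unlike Stan's.

```python
import numpy as np

# Hypothetical toy data: 3 players, 5 shots.
beta_player = np.array([0.5, -0.2, 1.0])   # one coefficient per player
player_id = np.array([1, 3, 3, 2, 1])      # 1-based IDs, as Stan expects

# Indexing (the analogue of beta_player[player_id] in Stan):
# one coefficient per shot. NumPy is 0-based, so subtract 1 first.
per_shot = beta_player[player_id - 1]

# Element-wise multiplication fails because the shapes differ (3 vs 5),
# which is the NumPy analogue of the elt_multiply error.
try:
    beta_player * player_id
    shapes_match = True
except ValueError:
    shapes_match = False

print(per_shot, shapes_match)
```

The indexed result has one entry per shot, which is what `bernoulli_logit` needs as its linear predictor.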

Thank you for your time, nhuurre.

I have now tried this, including the changes you suggested:

``````goals_model = """
data {
int<lower=0> N; // number of observations 8451
int players; // number of players 426
int glk; // number of goalkeepers 38
int zones; // number of field zones 8
int time; // number of time frames 7
int res; // types of results (winning, losing, tying)
int loc; // localia

int<lower=1,upper=players> player_id[N];
int<lower=1,upper=glk> glk_id[N];
int<lower=1,upper=zones> cat_zone[N];
int<lower=1,upper=time> time_frame[N];
int<lower=1,upper=res> cat_res[N];
int<lower=0,upper=loc> localia[N];

int goal[N]; // dependent variable
}
parameters {

real alpha; // intercept

vector[players] beta_player; // coefficient associated with each player
vector[glk] beta_glk; // coefficient associated with each goalkeeper
vector[zones] beta_zones; // coefficient associated with each zone
vector[time] beta_time; // coefficient associated with each time frame
vector[res] beta_res; // coefficient associated with each result
vector[loc] beta_loc; // coefficient associated with each type of localia

real epsilon; //Uncertainty / unexplained variance
}
model {
// priors
alpha ~ normal(0,1);
beta_player ~ normal(0,1);
beta_glk ~ normal(0,1);
beta_zones ~ normal(0,1);
beta_time ~ normal(0,1);
beta_res ~ normal(0,1);
beta_loc ~ normal(0,1);

goal ~ bernoulli_logit(alpha + beta_player[player_id] + beta_glk[glk_id] +
beta_zones[cat_zone] + beta_time[time_frame] + beta_res[cat_res] + beta_loc[localia]);

}
"""
``````

Then I run this:

goal_fit = goal_reg.sampling(data=datos, iter=1000, chains=4, warmup=500, n_jobs=-1, seed=42)

But I get this error:

"RuntimeError: Exception: vector[multi] indexing: accessing element out of range. index 0 out of range; expecting index to be between 1 and 2 (in 'unknown file name' at line 43)"

Remember that indices start at 1 in Stan (Python starts at 0).

Yes, I know, but I don’t see what I’m doing wrong.
If you can help, let me know what other information I can give you, because I have not been able to identify which index starts at 0.

Thanks

Sure. Do you have the code that creates the indices?

If an index is a NumPy array, you can shift it to 1-based with:

``x = x + 1``
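
If the IDs start out as raw labels rather than ready-made integers, one way to get contiguous 1-based indices is `np.unique` with `return_inverse=True`. This is only a sketch with made-up labels; the name `raw_localia` is hypothetical and not taken from the original data.

```python
import numpy as np

# Hypothetical raw labels for the home/away variable.
raw_localia = np.array(["home", "away", "away", "home"])

# np.unique returns the sorted distinct labels and, with return_inverse=True,
# the 0-based position of each observation's label in that sorted list.
levels, codes = np.unique(raw_localia, return_inverse=True)

localia = codes + 1   # shift to 1-based indices for Stan
loc = len(levels)     # number of categories to pass as `loc`

print(levels, localia, loc)
```

The same pattern would apply to the player, goalkeeper, zone, time-frame and result columns.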

You’re using the IDs as indices, so it should be fairly easy to find which ID is 0.
I’m guessing it’s `localia`, based on this line:

``````data {
...
int<lower=0,upper=loc> localia[N];
}
``````

The lower bound should be `lower=1`, and the `localia` values passed from Python should be recoded to run from 1 to `loc`.
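
A quick way to confirm which variable is at fault is to check the range of every index array before passing the data to Stan. The sketch below assumes the data dictionary uses the same keys as the `data` block; the helper name `find_bad_indices` and the toy values are hypothetical.

```python
import numpy as np

def find_bad_indices(data, index_to_size):
    """Return the names of index arrays with values outside 1..size."""
    bad = []
    for idx_name, size_name in index_to_size.items():
        idx = np.asarray(data[idx_name])
        if idx.min() < 1 or idx.max() > data[size_name]:
            bad.append(idx_name)
    return bad

# Toy stand-in for the real `datos` dictionary.
toy = {
    "zones": 3, "cat_zone": [1, 3, 2, 1],
    "loc": 2, "localia": [0, 1, 1, 0],   # 0-based by mistake
}
bad = find_bad_indices(toy, {"cat_zone": "zones", "localia": "loc"})
print(bad)
```

Running the same check on the real dictionary, with all six index variables mapped to their size variables, should point at any array that still contains a 0.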

Thank you, nhuurre. It now runs without any problems.
Do you know where I can find something that explains how to interpret the results of the fit?

``````
                   mean se_mean     sd   2.5%    25%    50%    75%  97.5%  n_eff   Rhat
alpha             -1.58    0.02   0.73  -3.01  -2.09  -1.59  -1.06  -0.16   1559    1.0
beta_player[1]     0.92  5.4e-3   0.35   0.18    0.7   0.94   1.14   1.58   4201    1.0
beta_player[2]     0.95  5.4e-3   0.35   0.22   0.74   0.95   1.18   1.64   4188    1.0
beta_player[3]     0.83  6.4e-3   0.45  -0.11   0.55   0.84   1.13   1.72   5086    1.0
beta_player[4]  -7.0e-3  9.3e-3   0.54  -1.15  -0.36   0.03   0.36   0.96   3357    1.0
...
lp__              -2840    0.55  15.61  -2871  -2850  -2840  -2829  -2810    809    1.0
``````
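
As a rough starting point: the `mean` and percentile columns summarize each parameter's posterior on the log-odds (logit) scale, because the model uses `bernoulli_logit`, while `n_eff` and `Rhat` are sampling diagnostics (an `Rhat` close to 1.0 suggests the chains agree). A minimal sketch of turning log-odds into probabilities, using posterior means copied from the table and assuming all other effects are zero (which will not hold for a real shot), could look like this:

```python
import math

def inv_logit(x):
    """Map log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

alpha_mean = -1.58          # posterior mean of alpha from the table
beta_player_1_mean = 0.92   # posterior mean of beta_player[1]

# Probability implied by the intercept alone.
p_baseline = inv_logit(alpha_mean)
# Probability with player 1's effect added, all other terms held at zero.
p_player_1 = inv_logit(alpha_mean + beta_player_1_mean)

print(round(p_baseline, 3), round(p_player_1, 3))
```

A full prediction for a given shot would add the goalkeeper, zone, time-frame, result and home/away terms before applying the inverse logit, and would ideally be computed per posterior draw rather than from the means.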