It means the loop runs for more iterations than there are items in actions. I guess my previous reply was somewhat ambiguous, so I’ll say it more explicitly: if the size of actions is Npairs×3, then the loop must be for (i in 1:Npairs).
@nhuurre now I got the sampling to run, and the output format is as follows:
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
beta_player[1,1] 0.1 4.6e-3 0.16 -0.22 -0.02 0.09 0.21 0.42 1305 1.0
beta_player[2,1] 0.76 4.2e-3 0.16 0.46 0.65 0.75 0.86 1.07 1419 1.0
beta_player[1,2] 0.84 0.01 0.51 -0.11 0.48 0.82 1.18 1.82 1806 1.0
beta_player[2,2] 0.31 0.01 0.51 -0.63 -0.06 0.29 0.65 1.31 1637 1.0
beta_player[1,3] 0.65 5.2e-3 0.2 0.25 0.51 0.65 0.79 1.06 1516 1.0
beta_player[2,3] -0.07 5.1e-3 0.2 -0.47 -0.21 -0.07 0.07 0.34 1590 1.0
beta_player[1,4] 0.59 8.9e-3 0.35 -0.06 0.36 0.58 0.81 1.31 1538 1.0
beta_player[2,4] -0.51 8.3e-3 0.35 -1.15 -0.74 -0.54 -0.29 0.23 1777 1.0
What does the format beta_player[a,b] mean?
I assume your parameters block looks like
parameters {
  matrix[2,Nplayers] beta_player;
  matrix[2,Nzones] beta_zone;
  matrix[2,Nloc] beta_loc;
  matrix[2,Nres] beta_res;
  matrix[2,Ntimes] beta_time;
}
and in the model block you have
for (i in 1:N) {
  vector[2] beta = beta_player[:,player_id[i]] + beta_all[:,pred_index[i]];
  actions[i] ~ multinomial(softmax(append_row(0.0, beta)));
}
and that actions[i] is {n_shots, n_passes, n_dribblings} in the i-th row of the dataset.
Then the interpretation is something like this: beta_player[1,k]=0.65 means player number k is exp(0.65)=1.9 times as likely to make a pass as a shot, and beta_player[2,k]=-0.07 means the same player is exp(-0.07)=0.93 times as likely (i.e. 7% less likely) to do a dribbling as a shot. But these probabilities are also modified by the zone and time, so it’s not quite so straightforward.
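To make the arithmetic concrete, here is a small sketch in plain Python. It plugs in the posterior means for player 3 from the table above (0.65 and -0.07) and assumes the category order shot, pass, dribbling, as in the interpretation above. It leaves out the zone, time, and other terms, so the resulting probabilities are only an illustration, not a model prediction.

```python
import math

# Posterior means for player 3 from the summary table (illustration only;
# the zone, location, time, etc. effects are deliberately left out).
beta_pass = 0.65      # beta_player[1,3]: log-odds of a pass vs. a shot
beta_dribble = -0.07  # beta_player[2,3]: log-odds of a dribbling vs. a shot

# Odds relative to the reference category (shot)
odds_pass = math.exp(beta_pass)
odds_dribble = math.exp(beta_dribble)

# softmax of (0, beta_pass, beta_dribble); the 0 belongs to the reference category
weights = [1.0, odds_pass, odds_dribble]  # exp(0) = 1
total = sum(weights)
p_shot, p_pass, p_dribble = (w / total for w in weights)

print(f"odds pass/shot = {odds_pass:.2f}, odds dribbling/shot = {odds_dribble:.2f}")
print(f"P(shot) = {p_shot:.2f}, P(pass) = {p_pass:.2f}, P(dribbling) = {p_dribble:.2f}")
```

Note that the ratio p_pass / p_shot equals exp(beta_pass) no matter what the third coefficient is, which is why each coefficient can be read as a log-odds against the reference category.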
Perfect!
So, what is the functional form of the multinomial function in Stan?
Could you tell me, or point me to where I can find information about that?
The functions reference has a page on multinomial and softmax.
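In short, the multinomial probability of counts (n_1, ..., n_K) with N = n_1 + ... + n_K is N!/(n_1! ... n_K!) times the product of theta_k^n_k, and in this model theta = softmax((0, beta_1, beta_2)). Below is a minimal sketch of that formula in plain Python; the function names are my own and the counts are made-up numbers for illustration, not values from your data.

```python
import math

def softmax(x):
    """Map real-valued scores to probabilities that sum to 1."""
    m = max(x)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in x]
    s = sum(e)
    return [v / s for v in e]

def multinomial_lpmf(counts, theta):
    """Log of N!/(n_1! ... n_K!) * prod_k theta_k^n_k."""
    n_total = sum(counts)
    # log of the multinomial coefficient, using lgamma(n + 1) = log(n!)
    log_coef = math.lgamma(n_total + 1) - sum(math.lgamma(n + 1) for n in counts)
    return log_coef + sum(n * math.log(t) for n, t in zip(counts, theta))

# Made-up example: 2 shots, 5 passes, 1 dribbling; shot is the reference category
theta = softmax([0.0, 0.65, -0.07])
print(multinomial_lpmf([2, 5, 1], theta))
```

Working on the log scale mirrors how Stan accumulates the log density, and it avoids overflow in the factorials when the counts are large.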
It may be helpful to compute some predicted probabilities to examine. For example
import numpy as np
from scipy.special import softmax  # same as Stan's softmax

# extract draws from the fit; each array has shape (N_draws, 2, N_levels)
draws = fit.extract()
beta_player = draws['beta_player']
beta_zone = draws['beta_zone']
beta_loc = draws['beta_loc']
# etc

# let's say we're interested in player number 50
# on zone 3, localia 1, ... (NB: Python indexing starts from 0,
# so Stan index k becomes k-1 here)
beta = beta_player[:, :, 49] + beta_zone[:, :, 2] + beta_loc[:, :, 0]  # + etc
# beta is an (N_draws, 2) array; prepend the zero column for the reference category
beta = np.column_stack((np.zeros(beta.shape[0]), beta))
# apply softmax to each draw, then average over draws
probs = softmax(beta, axis=1).mean(axis=0)
probs[0]  # probability of a shot
probs[1]  # probability of a pass
probs[2]  # probability of a dribbling
Btw, this thread is getting quite long. If you need more help interpreting the coefficients you could start a new thread about that. More people will see it.