Right, that’s what I thought when writing the Python code above.
The difference between categorical and multinomial is that
model {
  // three events
  1 ~ categorical_logit(beta);
  1 ~ categorical_logit(beta);
  2 ~ categorical_logit(beta);
}
is the same as
model {
  // two 1s, one 2, no 3s
  {2, 1, 0} ~ multinomial_logit(beta);
}
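To see why the two model blocks are interchangeable, here is a small numerical check in plain Python. It is only a sketch: the helper functions below are hand-written stand-ins named after the Stan distributions, not Stan's own implementation, and the `beta` values are arbitrary. It suggests that the two log-probabilities differ only by the log of the multinomial coefficient, which does not depend on `beta` and so should not change the posterior.

```python
import math

def log_softmax(beta):
    # Numerically stable log of softmax(beta).
    m = max(beta)
    log_z = m + math.log(sum(math.exp(b - m) for b in beta))
    return [b - log_z for b in beta]

def categorical_logit_lpmf(y, beta):
    # y is 1-based, as in Stan.
    return log_softmax(beta)[y - 1]

def multinomial_logit_lpmf(counts, beta):
    log_p = log_softmax(beta)
    n = sum(counts)
    # log of n! / (c_1! c_2! ... c_K!), a constant with respect to beta.
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return log_coef + sum(c * lp for c, lp in zip(counts, log_p))

events = [1, 1, 2]   # the three per-event outcomes
counts = [2, 1, 0]   # the same data as counts: two 1s, one 2, no 3s

def gap(beta):
    # Multinomial log-pmf minus the sum of the per-event categorical log-pmfs.
    per_event = sum(categorical_logit_lpmf(y, beta) for y in events)
    return multinomial_logit_lpmf(counts, beta) - per_event

# Arbitrary example values of beta; the gap is log(3!/(2! 1! 0!)) = log(3) in both cases.
print(gap([0.3, -1.2, 0.5]), gap([2.0, 0.0, -3.0]), math.log(3))
```

The gap is the same for every `beta`, so both formulations lead to the same inference; the count version simply needs one statement instead of three.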
The predictor `beta` depends only on columns 2-6. Structure your dataset so that each row corresponds to a unique `beta`, and instead of `event_id` you have three columns that count the number of shots, passes, and dribbles for that `beta`. The number of rows to loop over in this format is at most $515 \times 12 \times 7 \times 2 \times 3 = 259560$, which is about half as many as in the per-event format.
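The restructuring could look roughly like the sketch below, using only the Python standard library. The row layout is an assumption: `event_id` first, then the five predictor columns (columns 2-6, whose real names are not given here, so the toy values are placeholders), then the outcome as one of the hypothetical labels `"shot"`, `"pass"`, `"dribble"`. The sample rows are made up for illustration.

```python
import math
from collections import Counter

OUTCOMES = ("shot", "pass", "dribble")  # hypothetical outcome labels

# Made-up per-event rows: (event_id, five predictor columns, outcome).
rows = [
    (1, 10, 3, 2, 0, 1, "shot"),
    (2, 10, 3, 2, 0, 1, "pass"),
    (3, 10, 3, 2, 0, 1, "pass"),
    (4, 11, 5, 1, 1, 2, "dribble"),
    (5, 10, 3, 2, 0, 1, "shot"),
]

def aggregate(rows):
    """Collapse per-event rows into one row per unique predictor combination."""
    counts = {}
    for _event_id, *predictors, outcome in rows:
        key = tuple(predictors)  # event_id is dropped; only columns 2-6 matter
        counts.setdefault(key, Counter())[outcome] += 1
    # Each output row: the five predictors followed by (shots, passes, dribbles).
    return [key + tuple(c[o] for o in OUTCOMES) for key, c in sorted(counts.items())]

for row in aggregate(rows):
    print(row)

# Upper bound on the number of aggregated rows: the product of the number of
# levels in each of the five predictor columns.
max_rows = math.prod([515, 12, 7, 2, 3])
print(max_rows)  # 259560
```

Each aggregated row then supplies one count vector for a single `multinomial_logit` statement, in place of one `categorical_logit` statement per event.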