World Cup model


Hey all,

at this link you may find a multinomial model for the soccer World Cup 2018 in Russia:

A preliminary simulation has been obtained with R, but further updates will be implemented with Stan. Hopefully, I will include some Stan code.

Glad to receive feedback!
Looking forward to seeing you soon.



You might be interested in Andrew’s world-cup example, which we’ve been using for teaching purposes, or Milad’s case study on the Premier League, which has an accompanying video presentation.


The ball is round and the game lasts for 90 minutes.
After the game is before the game.
Sepp Herberger.

m_n\{1,2,X\} is a softmax, I suppose. The expression above it is not.

My concern is:

Can we extrapolate? Can we say, because yesterday it was raining, today it will rain?
Can we just say, because Germany won the last championship, it will win this one as well?
Clearly our past data say so, but we know it's not so. I claim, if we are fitting a model it does


To be a softmax, it’d have to be \exp(\eta_{nj}) in the denominator. It’s probably just a typo that it’s not.
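A minimal sketch of the difference, in case it helps. The predictor values are made up for illustration and `not_softmax` is just a hypothetical name for the typo'd form; neither comes from the linked model.

```python
import math

def softmax(eta):
    """Softmax: exp(eta_j) / sum_k exp(eta_k)."""
    m = max(eta)  # subtract the max for numerical stability
    exps = [math.exp(e - m) for e in eta]
    total = sum(exps)
    return [e / total for e in exps]

def not_softmax(eta):
    """Typo'd form: exp(eta_j) / sum_k eta_k (denominator not exponentiated)."""
    total = sum(eta)
    return [math.exp(e) / total for e in eta]

# Hypothetical linear predictors for the outcomes 1, 2, X.
eta = [0.5, 0.2, 0.1]

p = softmax(eta)       # valid probabilities
q = not_softmax(eta)   # not probabilities

print(sum(p))  # sums to 1 up to floating-point error
print(sum(q))  # does not sum to 1
```

Only the first version returns values that are guaranteed to lie in (0, 1) and sum to 1.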



Thanks Bob!
I already knew both Andrew's World Cup model and Milad's model for the Premier League. In fact, I was largely inspired by these models and enjoyed reading them.


Actually, instead of the softmax parametrization, I used the alternative multinomial logistic parametrization here, modeling K-1 = 2 probabilities and the K-th (the draw in this case) as:

1 / (1 + \sum_{k=1}^{K-1} \exp(\beta_k x))

However, I now realize there is a typo: I did not exponentiate the etas in the denominators, and the sum runs from 1 to K. Thanks!


As I explained in the comments section of Andrew's blog, this table only represents the estimated probabilities obtained by simulating the World Cup 10000 times before each game is played. Thus, the reason why Germany is favored is mainly its high FIFA ranking, rather than past historical results.
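To make the "simulate the tournament many times" idea concrete, here is a toy sketch. Everything in it is an assumption for illustration: the four teams, their `strength` values, the bracket, and the logistic win rule are hypothetical and are not the model or the estimates discussed here.

```python
import math
import random
from collections import Counter

# Hypothetical team strengths (NOT the model's estimates); higher is stronger.
strength = {"A": 1.0, "B": 0.5, "C": 0.0, "D": -0.5}

def play(team1, team2, rng):
    """Return the winner; the win probability is an assumed logistic in the strength gap."""
    p1 = 1.0 / (1.0 + math.exp(-(strength[team1] - strength[team2])))
    return team1 if rng.random() < p1 else team2

def simulate_knockout(rng):
    """One run of a toy four-team bracket: two semifinals, then a final."""
    finalist1 = play("A", "D", rng)
    finalist2 = play("B", "C", rng)
    return play(finalist1, finalist2, rng)

def champion_probabilities(n_sims=10000, seed=2018):
    """Estimate each team's title probability as its share of simulated wins."""
    rng = random.Random(seed)
    counts = Counter(simulate_knockout(rng) for _ in range(n_sims))
    return {team: counts[team] / n_sims for team in strength}

probs = champion_probabilities()
for team, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(team, round(p, 3))
```

In a setup like this the favorite is simply the team with the largest input strength, which mirrors the point that the table reflects the ranking-driven inputs rather than past titles.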


The nomenclature around all this is very inconsistent and confusing. What you’re calling “multinomial logistic” is just softmax with one of the inputs pinned to 0. The 0 in the version you’re using (1 after exp(0)) identifies the model, but comes with the disadvantage that priors become asymmetric. There’s a discussion in the manual around K vs. K - 1 parameter parameterizations of multinomial logistic regression.
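A small numerical check of that equivalence, as a sketch. The function names and the two predictor values are hypothetical; the draw is taken as the reference category whose input is pinned to 0.

```python
import math

def softmax(eta):
    """K-input softmax: exp(eta_j) / sum_k exp(eta_k)."""
    m = max(eta)  # subtract the max for numerical stability
    exps = [math.exp(e - m) for e in eta]
    total = sum(exps)
    return [e / total for e in exps]

def multinomial_logit(eta_free):
    """K-1 parameterization: the last category is the reference, with its input pinned to 0."""
    denom = 1.0 + sum(math.exp(e) for e in eta_free)  # the 1 is exp(0)
    probs = [math.exp(e) / denom for e in eta_free]
    probs.append(1.0 / denom)  # reference category (here: the draw)
    return probs

# Hypothetical predictors for home win and away win; the draw is the reference.
eta_free = [0.4, -0.3]

p_ref = multinomial_logit(eta_free)
p_soft = softmax(eta_free + [0.0])  # same model written as a softmax

print(p_ref)
print(p_soft)
```

The two lists agree up to floating-point error. The asymmetry in the priors comes from the fact that the free inputs are measured relative to the pinned category, so a prior centered at 0 on them treats the reference outcome differently from the others.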


Yeah, I got your point and I agree, the nomenclature I used above was confusing. Anyway, thanks for the suggestion about the priors.

I fixed the typos highlighted in your previous comment.


Took a second look at the model:

  1. \eta_{n.} does not have an intercept or a home-advantage parameter. Is there any reason for that?
    At the same time you have \mu_{att} in att_t, and the same for defense. This is a constant for all t, so
    it gets added to both \eta_{n.}. Could this cause an identifiability problem?

  2. The model uses a mixture. What about using sensor fusion instead?
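On point 1, a rough sketch of why a shared constant can matter. This is only an illustration with made-up numbers, not the linked model: in a full K-input softmax a constant added to every input cancels, so it cannot be identified from the likelihood, whereas with one input pinned to 0 a constant added to the free inputs acts like an intercept and does change the probabilities.

```python
import math

def softmax(eta):
    """K-input softmax: exp(eta_j) / sum_k exp(eta_k)."""
    m = max(eta)  # subtract the max for numerical stability
    exps = [math.exp(e - m) for e in eta]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical predictors for outcomes 1, 2, X and a hypothetical shared constant.
eta = [0.4, -0.3, 0.1]
shift = 2.5

# Case 1: the constant enters every input, so it cancels in the softmax.
p_full = softmax(eta)
p_full_shifted = softmax([e + shift for e in eta])

# Case 2: the last input is pinned to 0 and only the free inputs are shifted.
eta_free = [0.4, -0.3]
p_pinned = softmax(eta_free + [0.0])
p_pinned_shifted = softmax([e + shift for e in eta_free] + [0.0])

print(p_full, p_full_shifted)      # identical up to floating-point error
print(p_pinned, p_pinned_shifted)  # different: the draw probability shrinks
```

Which case applies depends on exactly how the constants enter each \eta_{n.}, so this does not settle the question for the actual model.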


Mmh, I still have to think about it. Anyway, for the time being, mu_att and mu_def no longer appear in the model.
See my website for model and prediction updates on the quarter-finals starting today!

I had no idea what sensor fusion was, thanks for the suggestion!