Hi all, I am trying to predict finishing position (1st to 10th place) from race time using historical data on elite marathoners. My question is: “Conditional on a 2:05 marathon (2 h 5 min), what is the probability of a medal-winning position (i.e. 3rd or better)?”
To put that in perspective, the first sub-2:05 marathon was run by the winner of the Berlin marathon back in 2003 (a world record at the time); some 20 years later, 9 of the top 10 finishers at the 2023 Berlin marathon ran sub-2:05. So there is a natural progression in marathon performance over time, and there are also mean differences between venues: some attract the best elite runners, some attract “slower” elites, and of course some venues are inherently slower than others due to course, environmental or other external factors.
My data set contains runner id, venue id, year id, race time and top-10 ranking for 15 selected popular marathon races for the years 2001-2023. My DV is final race position (coded as an ordered variable) and the predictors are year and race time. My model is specified as:
library(brms)

fit <- brm(
  position_ordered ~ 1 + year + racetime +
    (1 | athleteid) + (1 | venueid) + (1 | year_as_factor),
  data = Dataset,
  family = sratio("cloglog"),
  iter = 4000, warmup = 1500,
  control = list(adapt_delta = 0.95, max_treedepth = 15),
  seed = 12345
)
I included year both as a population-level and as a group-level effect, given that there is a natural progression in marathon performance across years on average, but also random variation around that trend. In addition, I added athlete and venue group-level effects because of the repeated measures within both athletes and venues.
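For reference, here is a sketch of the likelihood this specification implies, assuming brms's documented stopping-ratio parameterisation (thresholds minus linear predictor) and the cloglog link. The symbols $\tau_k$ (thresholds), $\beta$ (population-level coefficients) and $u$ (group-level intercepts) are notation introduced here for illustration, not taken from the model output.

```latex
\begin{aligned}
\eta_i &= \beta_{\text{year}}\,\text{year}_i + \beta_{\text{time}}\,\text{racetime}_i
          + u_{\text{athlete}[i]} + u_{\text{venue}[i]} + u_{\text{year}[i]} \\
\Pr(Y_i = k \mid Y_i \ge k) &= F(\tau_k - \eta_i),
          \qquad F(x) = 1 - \exp\bigl(-\exp(x)\bigr) \\
\Pr(Y_i = k) &= F(\tau_k - \eta_i) \prod_{j=1}^{k-1} \bigl(1 - F(\tau_j - \eta_i)\bigr) \\
\Pr(Y_i \le 3) &= 1 - \exp\Bigl(-\sum_{j=1}^{3} \exp(\tau_j - \eta_i)\Bigr)
\end{aligned}
```

Under this reading, a positive coefficient raises $\eta_i$, lowers the conditional probability of "stopping" at each position $k$, and so shifts probability mass towards worse finishing positions.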
The first question is whether sequential ordinal regression is appropriate for this type of research. In addition, I am working through the Bürkner and Vuorre ordinal tutorial on the interpretation of the coefficients, but I cannot wrap my head around a satisfactory natural interpretation. In the above model I get 0.1 (95% CI: 0.07 to 0.12) for year and 0.42 (95% CI: 0.39 to 0.45) for racetime.
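One way to approach the medal question directly, rather than through the raw coefficients, is to turn a linear predictor into category probabilities and sum the first three. Below is a minimal base-R sketch, assuming the stopping-ratio form P(Y = k | Y >= k) = F(thres_k - eta) with the cloglog link. The function name `sratio_cloglog_probs` and the threshold and `eta` values are hypothetical and for illustration only; they are not estimates from the model above.

```r
# Category probabilities under a stopping-ratio model with cloglog link.
# eta:   linear predictor (a single number)
# thres: vector of K - 1 thresholds for K ordered categories
sratio_cloglog_probs <- function(eta, thres) {
  # Conditional probability of stopping at category k, given it was reached
  stop_prob <- 1 - exp(-exp(thres - eta))
  # Probability of reaching category k (i.e. not stopping at any earlier one)
  reach_prob <- cumprod(c(1, 1 - stop_prob))
  # The last category absorbs whatever probability is left
  c(stop_prob, 1) * reach_prob
}

# Hypothetical values: 9 thresholds for positions 1 to 10
thres <- seq(-2, 2, length.out = 9)
eta <- 0.5

probs <- sratio_cloglog_probs(eta, thres)
p_medal <- sum(probs[1:3])  # probability of finishing 3rd or better
```

With the fitted model itself, `posterior_epred(fit, newdata = ..., re_formula = NA)` should return a draws x observations x categories array for ordinal families, so summing the first three category slices gives posterior draws of the medal probability. Note that `newdata` also needs a value for year, and that `re_formula = NA` averages over neither athletes nor venues but sets their effects to zero, which may or may not be the quantity of interest.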
Any help/input would be greatly appreciated!!!
PS: I haven’t touched the priors yet and am just using the defaults. The model does seem to converge without any problems, but first things first: I would like a solid understanding of the interpretation before experimenting with priors or with more complex models.