hey @martinmodrak
Thanks for replying. After thinking more about it, the event I want to capture is the knock-out of the opponent by a fighter. In addition, I want to use the fight's round, instead of the number of seconds, as the time unit. So, there are basically three different possible outcomes per fight:
- Fighter KO's opponent in round r
  - event is observed in round r
- Opponent KO's fighter in round r
  - event is right-censored in round r
- Fighter or opponent wins by jury decision
  - event is right-censored in round r = 3 (normal fight) or r = 5 (title fight)
Visually, for a randomly picked fighter, this looks as follows:

The table below shows the summarized data:

| round | KO | censored | fighters removed | fighters at risk | hazard rate |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 27 | - |
| 1 | 7 | 1 | 8 | 27 | 0.26 |
| 2 | 4 | 0 | 4 | 19 | 0.21 |
| 3 | 3 | 2 | 5 | 15 | 0.20 |
| 4 | 2 | 0 | 2 | 10 | 0.20 |
| 5 | 0 | 8 | 8 | 8 | - |
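The hazard-rate column is the number of KO's divided by the number of fighters at risk in that round. A minimal Python sketch that rebuilds the risk set and the hazard from the KO and censored counts in the table (the variable names are only illustrative):

```python
# Counts per round 1..5, copied from the summary table above.
ko = [7, 4, 3, 2, 0]
censored = [1, 0, 2, 0, 8]

at_risk = []
n = 27  # fighters at risk at the start of round 1
for k, c in zip(ko, censored):
    at_risk.append(n)
    n -= k + c  # KO'd and censored fighters both leave the risk set

# Empirical hazard: share of fighters at risk who are KO'd in that round.
hazard = [round(k / a, 2) for k, a in zip(ko, at_risk)]

print(at_risk)  # [27, 19, 15, 10, 8]
print(hazard)   # [0.26, 0.21, 0.2, 0.2, 0.0]
```

Round 5 comes out as 0.0 here because no KO was observed among the 8 fighters at risk; the table shows it as "-".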
I want to calculate the probability that a KO happens in round r, given that it hasn't happened by the end of round r - 1. So, I want to create a discrete-time survival model.
Let KO_o be a discrete random variable that indicates the round when the KO occurs for a randomly selected opponent o.
Next, we define the discrete-time hazard as the conditional probability of opponent o getting KO'ed in round r, given that they have survived until that round.
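
In symbols, writing λ_{r,o} for this hazard (the symbol used for it further down), the definition would presumably read:

$$
\lambda_{r,o} = \Pr\{KO_o = r \mid KO_o \ge r\}
$$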

Cox (1972) proposed that, because the hazard rates are probabilities, they can be reparametrized so that they have a logistic dependence on the time periods.
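
A sketch of that logistic form, assuming one coefficient per round named α_1, ..., α_5 (the names used in the Stan code further down) and no other covariates:

$$
\lambda_{r,o} = \frac{1}{1 + \exp\big(-(\alpha_1 R_1 + \alpha_2 R_2 + \alpha_3 R_3 + \alpha_4 R_4 + \alpha_5 R_5)\big)}
$$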

Where [R_1, R_2, R_3, R_4, R_5] is a sequence of dummy variables. If an opponent is KO'ed or censored in round 3, R_3 = 1 and the rest are 0. Taking the log of the odds, we obtain a linear model for the logit of the hazard rate.
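
Assuming coefficients α_1, ..., α_5 on the round dummies (names taken from the Stan code further down), the logit form would be:

$$
\operatorname{logit}(\lambda_{r,o}) = \log\frac{\lambda_{r,o}}{1 - \lambda_{r,o}} = \alpha_1 R_1 + \alpha_2 R_2 + \alpha_3 R_3 + \alpha_4 R_4 + \alpha_5 R_5
$$

With this coding, each α_r is simply the logit of the hazard in round r.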

Next, for every opponent o, we record for each round r whether a KO was observed, using a sequence of dummy variables Y_r,o that take the values y_r,o.
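
Using the index order r,o that the rest of the post uses, the indicator would presumably be defined as:

$$
y_{r,o} =
\begin{cases}
1 & \text{if opponent } o \text{ is KO'ed in round } r \\
0 & \text{otherwise}
\end{cases}
$$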

If opponent o does not get KO'ed during the match, Y_r,o will be equal to 0 in every round that was observed during the fight. If the fight was a title fight that went the full five rounds, Y_r,o is equal to {0, 0, 0, 0, 0}. For an opponent that gets KO'ed in the third round, Y_r,o is equal to {0, 0, 1}.
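
The two examples above can be reproduced with a small Python sketch. The helper name `ko_indicators` and its arguments are hypothetical, introduced only for illustration:

```python
def ko_indicators(last_round, knocked_out):
    """Build the sequence y_1..y_last_round for one opponent.

    Every observed round gets a 0, except the last one, which gets a 1
    only when the opponent was KO'ed in that round.
    """
    y = [0] * last_round
    if knocked_out:
        y[-1] = 1
    return y


print(ko_indicators(5, False))  # title fight without a KO: [0, 0, 0, 0, 0]
print(ko_indicators(3, True))   # KO in the third round:    [0, 0, 1]
```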
In addition, we want to check whether opponent o is censored or not.
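
Going by the convention used in the likelihood further down (c_o = 0 for uncensored, c_o = 1 for censored opponents), the censoring indicator would be:

$$
c_o =
\begin{cases}
0 & \text{if the KO of opponent } o \text{ is observed} \\
1 & \text{if opponent } o \text{ is censored}
\end{cases}
$$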

The probability that an uncensored opponent o will get KO'ed in round r is equal to
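
Assuming λ_{j,o} denotes the hazard of opponent o in round j, this presumably takes the standard discrete-time form, surviving rounds 1 to r - 1 and then being KO'ed in round r:

$$
\Pr\{T_{ko} = t_r\} = \lambda_{r,o} \prod_{j=1}^{r-1} (1 - \lambda_{j,o})
$$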

The probability that a censored opponent o will get KO'ed after round r is
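
Assuming λ_{j,o} denotes the hazard of opponent o in round j, this is presumably the probability of surviving every round up to and including round r:

$$
\Pr\{T_{ko} > t_r\} = \prod_{j=1}^{r} (1 - \lambda_{j,o})
$$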

The likelihood function is the product of the probabilities of observing the data, Pr{T_ko = t_r}, in the case of uncensored opponents (c_o = 0), and Pr{T_ko > t_r}, in the case of the censored opponents (c_o = 1):
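
A sketch of that product, assuming n opponents (n is a symbol introduced here) and with r standing for the last round observed for opponent o:

$$
L = \prod_{o=1}^{n} \Pr\{T_{ko} = t_r\}^{\,1 - c_o} \; \Pr\{T_{ko} > t_r\}^{\,c_o}
$$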

According to Singer and Willett (1993) we can rewrite the above as
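
Assuming n opponents and writing r_o for the last round observed for opponent o (both symbols introduced here), the rewritten likelihood would presumably be:

$$
L = \prod_{o=1}^{n} \prod_{j=1}^{r_o} \lambda_{j,o}^{\,y_{j,o}} \, (1 - \lambda_{j,o})^{\,1 - y_{j,o}}
$$

For an uncensored opponent only the last y_{j,o} equals 1, so the inner product is λ_{r_o,o} times the survival terms of the earlier rounds; for a censored opponent all y_{j,o} are 0 and only survival terms remain.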

Now the likelihood function of the discrete-time hazard model is equal to the likelihood function for N (t_1, t_2, …, t_r) independent Bernoulli trials with parameter λ_r,o. So, we can treat the N dichotomous observed values y_r,o as the values of the outcome variable in a logistic regression analysis of the time-period indicators R.
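
This equivalence can be checked numerically. The Python sketch below compares the survival form of each opponent's likelihood contribution with the product of Bernoulli terms over that opponent's rounds. The hazard values and the example opponents are made up for illustration, and the function names are hypothetical:

```python
import math


def survival_likelihood(hazards, last_round, censored):
    """Pr{T = r} for an observed KO, Pr{T > r} for a censored opponent."""
    survived_before = math.prod(1 - h for h in hazards[:last_round - 1])
    last = hazards[last_round - 1]
    return survived_before * ((1 - last) if censored else last)


def bernoulli_likelihood(hazards, last_round, censored):
    """Product of Bernoulli terms over the opponent's person-period rows."""
    y = [0] * last_round
    if not censored:
        y[-1] = 1  # KO observed in the last round
    # zip stops after last_round entries, i.e. the rounds actually observed
    return math.prod(h if yi == 1 else 1 - h for h, yi in zip(hazards, y))


hazards = [0.26, 0.21, 0.20, 0.20, 0.10]  # illustrative values only
opponents = [(1, False), (3, True), (3, False), (5, True)]  # (last round, censored)

for last_round, censored in opponents:
    a = survival_likelihood(hazards, last_round, censored)
    b = bernoulli_likelihood(hazards, last_round, censored)
    print(last_round, censored, math.isclose(a, b))
```

Each line prints `True` for the final field, since both functions multiply the same factors.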
I transform my data to person-period format, rounds_jon_jones.csv (2.4 KB),
and write the following Stan code:
> data {
>   int<lower=0> n_rounds;                      // number of person-period rows
>   int<lower=0, upper=1> knockouts[n_rounds];  // 1 if a KO was observed in that row, else 0
>   matrix[n_rounds, 5] rounds;                 // round dummies R_1..R_5, one column per round
> }
>
> parameters {
>   // one logit-hazard coefficient per round
>   real alpha_1;
>   real alpha_2;
>   real alpha_3;
>   real alpha_4;
>   real alpha_5;
> }
>
> model {
>   // priors
>   alpha_1 ~ normal(0, 1);
>   alpha_2 ~ normal(0, 1);
>   alpha_3 ~ normal(0, 1);
>   alpha_4 ~ normal(0, 1);
>   alpha_5 ~ normal(0, 1);
>
>   // likelihood: logistic regression of the KO indicator on the round dummies
>   knockouts ~ bernoulli_logit(alpha_1 * rounds[,1] + alpha_2 * rounds[,2] + alpha_3 * rounds[,3] + alpha_4 * rounds[,4] + alpha_5 * rounds[,5]);
> }
Running the code gives the following output

I calculate the hazard rate of the first round as follows:
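
Under the logit model above, converting a coefficient into a hazard amounts to applying the inverse logit. A minimal Python sketch, where the value of `alpha_1` is an illustrative assumption (the logit of the empirical 7/27 from the first table) and not a number taken from the fitted output:

```python
import math


def inv_logit(x):
    """Map a logit-scale coefficient back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


# Illustrative coefficient only: the logit of the empirical round-1 hazard 7/27.
alpha_1 = logit(7 / 27)

print(round(inv_logit(alpha_1), 2))  # 0.26
```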

The table below shows the hazard rate for every round:

| round | hazard |
|---|---|
| 1 | 0.752 |
| 2 | 0.505 |
| 3 | 0.257 |
| 4 | 0.330 |
| 5 | 0.061 |
These hazard rates are quite different from the hazard rates in the first table, where we divide the number of KO's by the number of fighters at risk per round. Can someone tell me if I did something wrong? Any other feedback is much appreciated.
Thank you