ROPE range specification for binary models

Dear Stan Community,

I would like to use ROPE for logistic multilevel models, but I am having trouble understanding the guidelines for setting up a sensible ROPE range. The term ‘standardized parameters’ is used often (e.g., Kruschke, 2018; Makowski et al., 2019). I wanted to ask whether this suggests that the posterior distribution should be standardized. On a side note, in the formula suggested by Kruschke (2018) to calculate the ROPE, there is a ‘divided by four’ addition for binary models, which, unfortunately, is not explained further in the text. Any explanations about this would be greatly appreciated!


Sorry for taking so long to respond.

I admit I don’t fully understand the guidelines and the terms you use either, but I think one can get quite far from first principles.

In logistic regression, the model coefficients represent additive changes in log-odds. For binary/categorical predictors the coefficient directly represents the change in log-odds relative to the baseline (which is IMHO easy to grasp), while for continuous predictors it is the change in log-odds per unit change in the predictor.

In both cases, one can often make sensible considerations about ROPE just with that. A change in log-odds is the same as multiplying the odds, so if, say, the baseline odds are 1:1 and 11:10 is the maximal odds you consider practically equivalent for a treatment, you can compute the relevant ROPE bound for the coefficient as \log \left(\frac{\frac{11}{10}}{\frac{1}{1}} \right) = \log(\frac{11}{10}) - \log(1) \simeq 0.095. Note however that this amounts to speaking about relative risk, which is not always the best target. In other words, you are saying that multiplying the odds by 1.1 is practically equivalent regardless of the baseline odds, so with the same ROPE, if the baseline odds are 1:100 the practically equivalent range would end at 11:1000.
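A minimal sketch of that calculation (the 11:10 odds ratio is the example from the text; the baseline odds drop out because \log(1) = 0):

```python
import math

# ROPE bound on the log-odds scale, derived from an odds ratio that
# we consider practically equivalent to no effect (11:10 vs. 1:1).
baseline_odds = 1 / 1
equivalent_odds = 11 / 10

rope_bound = math.log(equivalent_odds / baseline_odds)
print(round(rope_bound, 3))  # ~0.095, matching the value in the text
```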

Additionally, you may work with absolute risk by making predictions on the absolute scale (i.e. computing the posterior distribution for the mean of the outcome). You can then compare those absolute risks between the groups.
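As a sketch of what that comparison looks like: apply the inverse logit to each posterior draw to get absolute risk per group, then take the difference per draw. The draws below are made-up placeholders; in practice they would be the intercept and treatment-coefficient samples extracted from your fitted model.

```python
import math

def inv_logit(x):
    """Map log-odds to probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical posterior draws of (intercept, treatment coefficient);
# real draws would come from the fitted Stan model.
draws = [(-2.1, 0.4), (-1.9, 0.5), (-2.0, 0.3)]

# Absolute risk in treatment vs. control for each draw, then the
# posterior distribution of the risk difference.
risk_diffs = [
    inv_logit(alpha + beta) - inv_logit(alpha) for alpha, beta in draws
]
print(risk_diffs)
```

The resulting `risk_diffs` samples can then be compared against a ROPE stated directly on the probability scale.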

Does that make sense?

Also, it helps if you give links to papers or other sources instead of just citations, as those might be ambiguous for people who don’t follow the field closely.


The “divide by four” rule is a simple ad-hoc way to interpret the coefficients of a logistic regression model, which live on the log-odds scale, on the probability scale.

The largest derivative of the logistic (inverse link) function is 0.25 (at x = 0), which means that a one unit change in one of the inputs corresponds to at most a 0.25 change in the output probability. This means that you can simply divide logistic regression coefficients by 4 to get a reasonable upper bound for the change on the probability scale, e.g. a logistic regression coefficient of 2 corresponds to an increase in the probability p(y = 1) of at most 50 percentage points.
This approximation is best when you have roughly as many 1's as 0's in your data set, because otherwise you overestimate the expected change in probability (the logistic function is much less steep at the extremes).
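The rule can be checked with a few lines of code. This sketch compares the exact change in probability for a one-unit change centered at x = 0 (where the curve is steepest) against the divide-by-4 value, showing that the latter is an upper bound:

```python
import math

def inv_logit(x):
    """Map log-odds to probability."""
    return 1.0 / (1.0 + math.exp(-x))

coef = 2.0

# Exact probability change for a one-unit step centered at x = 0,
# where the logistic curve is steepest.
exact = inv_logit(coef / 2) - inv_logit(-coef / 2)

# Divide-by-4 approximation (an upper bound on the change).
approx = coef / 4

print(round(exact, 3), approx)  # exact ~0.462, approx 0.5
```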

The whole thing is explained in the new Regression and Other Stories book by Gelman, Hill, and Vehtari, and I think it was originally mentioned in “Data Analysis Using Regression and Multilevel/Hierarchical Models” by Gelman and Hill. If I remember correctly it’s also somewhere in BDA3, which is available online for free.

Whether or not you want to standardize the inputs depends on your desired interpretation. Standardizing them has the advantage that you can reason on a scale of standard deviations (e.g. an increase in 1 standard deviation is associated with an increase in the expected probability by x%), but maybe that’s not what you want and the original units are easier to interpret.
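To illustrate the relationship between the two scales (the predictor values and coefficient below are made up): a coefficient of b per original unit corresponds to b times the predictor’s standard deviation per one-SD change, so standardizing the input rescales the coefficient accordingly.

```python
# Hypothetical predictor values and coefficient, for illustration only.
x = [1.0, 3.0, 5.0, 7.0]
b = 0.2  # log-odds change per original unit

# Sample standard deviation of the predictor.
mean_x = sum(x) / len(x)
sd_x = (sum((v - mean_x) ** 2 for v in x) / (len(x) - 1)) ** 0.5

# Log-odds change per one standard deviation of the predictor.
b_per_sd = b * sd_x
print(round(b_per_sd, 3))
```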

One paper that might interest you explains how to interpret coefficients of logistic regression models on the probability scale in a less ad-hoc way, but for just calculating ROPE values the divide-by-4 rule or calculating the derivative at the mean should already be good enough.