Interpretation of Sequential Ordinal Regression with "sratio" family and "probit" link

Hi everyone!

I’m new to this community and I mostly work in neurodegenerative diseases. I’m using this tutorial to work my way through ordinal regression using brms. My research question is whether we can predict the amount of pathological protein found in the brain post mortem from the clinico-cognitive presentation while the participant is still alive. My dependent variable is the amount of protein, with the ordinal categories absent, rare, occasional, moderate, and numerous, and the independent variable (predictor) is cognitive impairment, with the categories yes/no.

I chose sequential modelling with the stopping ratio family because participants can only reach “numerous” amounts of protein after they have first passed through the lower categories, from “rare” up to “moderate”. The predictor’s reference category is “Cognition Impaired”, so the coefficient is for “unimpaired”.

library(brms)

Model1 <- brm(
  formula = Protein ~ 1 + CognitionImpaired,
  data = Analysis,
  family = sratio("probit"),  # stopping-ratio sequential model, probit link
  save_pars = save_pars(all = TRUE),
  sample_prior = TRUE,
  seed = 6545,
  iter = 10000,
  warmup = 5000,
  chains = 4,
  cores = 16,
  control = list(adapt_delta = 0.99999),
  init_r = 0.05
)

This model results in a coefficient of -0.96 with a CI from -2.08 to 0.08 for the coefficient called CognitionImpaired_unimpaired. I’ve worked through the tutorial paper a few times and read through some of Tutz’ work that is referenced but I can’t wrap my head around how to interpret my coefficient properly. How can I write this nicely in my manuscript so that my clinician colleagues who aren’t that into stats can grasp the meaning easily? My initial understanding was this, and I’m fairly sure I got that wrong:
“The probability of finding protein in the superior frontal gyrus of participants without cognitive impairment was 0.96 SD lower than in participants whose cognition was impaired, with the 95% CI including zero (-2.08 to 0.08).” I now think this interpretation applies to cumulative models, not sequential ones (if I understood that right).
I understood that for sequential models, the coefficient represents a threshold for moving from one protein rating to the next. But what would be a nice, legible way of phrasing this for myself and my clinician colleagues? I would like to have a written-out version of the results to accompany the plot produced by conditional_effects().

In case that’s relevant: my dataset is tiny (n = 15 to 20, depending on the brain area and cognitive function under investigation) because the disease I’m working on is rare.

Thank you very much for helping me out!

Welcome to the Stan community. Indeed, the formulation you chose is closer to how one would describe a cumulative model, although it wouldn’t be fully correct there either, because you did not specify which SD you are referring to (that of the assumed latent continuous response variable). For sequential models (the stopping-ratio variant here), perhaps you could write that the probability of transitioning to higher response categories was lower for predictor category X than for predictor category Y, with a probit coefficient for a single transition between adjacent categories of X (CI X to X). I know this is not super easy to understand either, but perhaps it helps you move in the right direction.
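
To make the "transition" reading a bit more concrete, here is a minimal sketch in base R. It assumes the stopping-ratio parameterization described in the brms families vignette, where the probability of stopping at category k, given that k was reached, is the normal CDF of (threshold k minus the linear predictor). The coefficient -0.96 is the one reported above; the four thresholds in `tau` are hypothetical values picked only for illustration and should be replaced by the Intercept estimates from summary(Model1). The helper `sratio_probs()` is likewise just an illustrative name, not a brms function.

```r
# Posterior mean of the coefficient reported above (unimpaired vs. impaired).
b_unimpaired <- -0.96

# HYPOTHETICAL thresholds, for illustration only. Replace them with the
# Intercept[1] to Intercept[4] estimates from summary(Model1).
tau <- c(-1.0, -0.3, 0.2, 0.8)

categories <- c("absent", "rare", "occasional", "moderate", "numerous")

# Category probabilities under a stopping-ratio model with probit link:
#   P(stop at k | reached k) = pnorm(tau[k] - eta)
sratio_probs <- function(eta, tau) {
  p_stop  <- pnorm(tau - eta)              # one stopping probability per threshold
  p_reach <- cumprod(c(1, 1 - p_stop))     # probability of reaching each category
  p_reach * c(p_stop, 1)                   # last category absorbs what remains
}

# Linear predictor: 0 for the reference group (impaired), b for unimpaired.
p_stop <- rbind(
  impaired   = pnorm(tau - 0),
  unimpaired = pnorm(tau - b_unimpaired)
)

probs <- rbind(
  impaired   = sratio_probs(eta = 0, tau = tau),
  unimpaired = sratio_probs(eta = b_unimpaired, tau = tau)
)
colnames(probs) <- categories

print(round(p_stop, 2))  # chance of stopping at each step, by group
print(round(probs, 2))   # implied probability of each protein rating, by group
```

Under these assumptions, a negative coefficient raises the stopping probability at every step, so the unimpaired group is less likely to progress to the higher protein ratings. Note that this uses point estimates only and ignores posterior uncertainty; for the manuscript, something like conditional_effects(Model1, categorical = TRUE) or posterior_epred(Model1) should give the per-category probabilities with credible intervals directly from the fitted model.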

Hi Paul!

Thank you for the warm welcome! I’ll go with the write up you suggested and see if I can make it flow nicely.

Kind regards,
Anna