Get and interpret thresholds in a cumulative probit model: the case of an ordinal predicted variable and a metric predictor

Hi, I am very new to Bayesian statistics and brms. I have questions about getting the ‘actual thresholds’ from a cumulative probit model.

My data structure is similar to the ‘Happiness and money’ example, in which Happiness is Y (an ordinal predicted variable with 4 categories) and Money is X. The example is from:
23 Ordinal Predicted Variable | Doing Bayesian Data Analysis in brms and the tidyverse.

But I am interested in finding the ‘real thresholds’ of Happiness in terms of the amount of Money. For example: people with less than 2000 yuan rate their happiness as 1 (not happy); people with between 2000 and 5000 rate it as 2 (not very happy); people with between 5000 and 8000 rate it as 3 (so-so); people with more than 8000 rate it as 4 (very happy). (Of course, these thresholds are made up.)

I first standardized the Money variable, then fitted a model and got the model output (intercepts). My question is: how can I compute the ‘real thresholds’ (like “2000”, “5000” and “8000”) from the intercept estimates (or from other estimates)? I know the intercept values in the model output (negative values) cannot be the ‘real threshold values’, because the Money variable ranges from 5 to 10000. Should I invert the probit link function? Should I unstandardize the intercept estimates? What code should I use? Thanks!

Another minor question is about interpreting the ‘disc’ parameter under Family Specific Parameters in the output. I think it refers to ‘discrimination’, and it is fixed to 1 in my model. What does it mean in my model, and how should I report it?

Below is my code.

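library(dplyr)  # mutate() and %>%
library(brms)   # brm() and cumulative()

# standardize X: Money (mean 0, sd 1)
my_data <-
  my_data %>%
  mutate(Money_standarized = (Money - mean(Money)) / sd(Money))

# fit a cumulative probit model
fit1 <-
  brm(data = my_data,
      family = cumulative(probit),
      Happy ~ 1 + Money_standarized
      )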

Output

Family: cumulative 
  Links: mu = probit; disc = identity 
Formula: Happy ~ 1 + Money_standarized 
   Data: my_data (Number of observations: 1000) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Population-Level Effects: 
                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1]         -2.28      0.08    -2.44    -2.12 1.00     1855     2173
Intercept[2]         -1.32      0.07    -1.45    -1.18 1.00     2290     2718
Intercept[3]         -0.34      0.06    -0.46    -0.23 1.00     2866     2947
Money_standarized     2.68      0.15     2.38     2.97 1.00     2299     2461

Family Specific Parameters: 
     Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
disc     1.00      0.00     1.00     1.00   NA       NA       NA

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

There aren’t “real thresholds” of happiness in terms of an amount of money: some rich people are very unhappy. The thresholds in a cumulative probit regression live on the latent standard-normal scale, so they are not interpretable as “amounts of money”. Instead, they divide the total probability among the response categories according to how much the normal CDF rises between adjacent thresholds. Because Money was standardized, the intercepts tell you “what is the distribution of what somebody will say about their happiness if they have an average amount of money?” In this model, money is a covariate that shifts the position of the thresholds relative to the latent mean, and the thresholds chop up the normal PDF into pieces whose integrals give the probability that somebody will be in a given happiness category.
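To make that concrete, here is a minimal base-R sketch that plugs in the posterior means from the summary above. It assumes the usual brms parameterization for this family, P(Happy <= k) = pnorm(Intercept[k] - b * Money_standarized). The helper `category_probs()` and the values `money_mean` and `money_sd` are hypothetical, made up for illustration; replace the latter two with mean(my_data$Money) and sd(my_data$Money). Working from point estimates ignores posterior uncertainty, so a real analysis would repeat the calculation for every posterior draw.

```r
# Posterior means copied from the model summary (point estimates only).
tau  <- c(-2.28, -1.32, -0.34)  # Intercept[1], Intercept[2], Intercept[3]
beta <- 2.68                    # slope for Money_standarized

# Hypothetical helper: probability of each Happy category (1..4)
# for a person whose standardized money is z.
category_probs <- function(z, tau, beta) {
  cum <- c(pnorm(tau - beta * z), 1)  # P(Happy <= k) for k = 1..4
  diff(c(0, cum))                     # P(Happy == k)
}

# Someone with an average amount of money (z = 0).
p_avg <- category_probs(0, tau, beta)
print(round(p_avg, 3))

# Standardized money at which P(Happy <= k) is exactly 0.5.
z_half <- tau / beta

# HYPOTHETICAL mean and sd of the raw Money variable; these are NOT
# from the original post and must be replaced with the real values.
money_mean <- 5000
money_sd   <- 2500

# Undo the standardization to express those points in yuan.
money_half <- money_mean + money_sd * z_half
print(round(money_half))
```

The values in `money_half` are not cutoffs that sort people into categories. Each one is only the amount of money at which a person is equally likely to answer at or below category k as above it, and people on either side of it still give every rating with some probability.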


Thank you so much for your reply and clarification, jsocolar! I think I understand it a bit better now! :)
