# Understanding brms model code: Ordered Probit with Ordered Predictor (monotonic effects)

Hi all,

I need some help interpreting some of the model code from a brms model. I have an ordered outcome y with 12 categories (11 thresholds) and an ordered predictor x with 4 ordinal categories. At the most basic, the model can be coded as follows:

``````dat <- list(x = x, y = y)

mod <- brm(y ~ (mo(x), data = dat, family = cumulative("probit"))
``````

The model from brms `mod\$model` is output as follows:

``````functions {
/* cumulative-probit log-PDF for a single response
* Args:
*   y: response category
*   mu: latent mean parameter
*   disc: discrimination parameter
*   thres: ordinal thresholds
* Returns:
*   a scalar to be added to the log posterior
*/
real cumulative_probit_lpmf(int y, real mu, real disc, vector thres) {
int nthres = num_elements(thres);
real p;
if (y == 1) {
p = Phi(disc * (thres[1] - mu));
} else if (y == nthres + 1) {
p = 1 - Phi(disc * (thres[nthres] - mu));
} else {
p = Phi(disc * (thres[y] - mu)) -
Phi(disc * (thres[y - 1] - mu));
}
return log(p);
}
/* cumulative-probit log-PDF for a single response and merged thresholds
* Args:
*   y: response category
*   mu: latent mean parameter
*   disc: discrimination parameter
*   thres: vector of merged ordinal thresholds
*   j: start and end index for the applid threshold within 'thres'
* Returns:
*   a scalar to be added to the log posterior
*/
real cumulative_probit_merged_lpmf(int y, real mu, real disc, vector thres, int[] j) {
return cumulative_probit_lpmf(y | mu, disc, thres[j[1]:j[2]]);
}
/* compute monotonic effects
* Args:
*   scale: a simplex parameter
*   i: index to sum over the simplex
* Returns:
*   a scalar between 0 and rows(scale)
*/
real mo(vector scale, int i) {
if (i == 0) {
return 0;
} else {
return rows(scale) * sum(scale[1:i]);
}
}
}
data {
int<lower=1> N;  // total number of observations
int Y[N];  // response variable
int<lower=2> nthres;  // number of thresholds
int<lower=1> Ksp;  // number of special effects terms
int<lower=1> Imo;  // number of monotonic variables
int<lower=1> Jmo[Imo];  // length of simplexes
int Xmo_1[N];  // monotonic variable
vector[Jmo[1]] con_simo_1;  // prior concentration of monotonic simplex
int prior_only;  // should the likelihood be ignored?
}
transformed data {
}
parameters {
ordered[nthres] Intercept;  // temporary thresholds for centered predictors
simplex[Jmo[1]] simo_1;  // monotonic simplex
vector[Ksp] bsp;  // special effects coefficients
}
transformed parameters {
real disc = 1;  // discrimination parameters
real lprior = 0;  // prior contributions to the log posterior
lprior += student_t_lpdf(Intercept | 3, 0, 2.5);
lprior += dirichlet_lpdf(simo_1 | con_simo_1);
}
model {
// likelihood including constants
if (!prior_only) {
// initialize linear predictor term
vector[N] mu = rep_vector(0.0, N);
for (n in 1:N) {
// add more terms to the linear predictor
mu[n] += (bsp[1]) * mo(simo_1, Xmo_1[n]);
}
for (n in 1:N) {
target += cumulative_probit_lpmf(Y[n] | mu[n], disc, Intercept);
}
}
// priors including constants
target += lprior;
}
generated quantities {
// compute actual thresholds
vector[nthres] b_Intercept = Intercept;
}
``````

I am interested in the model code because I am using this in a more bespoke modeling scenario in cmdstanr. However, I am struggling trying to understand the code. I have questions about the following:

• In the data block, I am confused by variables that were not initially āinputā by myself. I know `Y`, `N`, `nthresh`, and `Xmo` all come from my data declarations. What about ākspā, `Imo`, `Jmo`, and `con_simo_1`? I think `Jmo` is the number of ordered categories of `X` (4), but I am unsureā¦ Is `Ksp` then 1 in my case because I only have 1 predictor? But, `Imo` is also 1?

Admittedly this is my first attempt at working with simplexes outside of the brms framework, so I want to make sure I am understanding what is going on in the stan code.

Thank you all!

1 Like

You can use the function `standata(mod)` at it will print the data list that was passed to Stan. This way you can figure out what data is being used in the actual Stan model behind brms.

2 Likes