PyStan model specification with multiple monotonic predictors

I can’t share the data I’m working with, and I don’t have a working model yet, because I’m struggling to think through how to specify the model, as described below. I hope these limitations won’t be a problem.

Using pystan, I am modeling perceptual judgments of pathological voice samples. There are multiple raters (i.e., people giving perceptual judgments), multiple patients (i.e., people who produced the voice samples), and multiple materials (i.e., the patients produced voice samples in a baseline condition and after swallowing materials with different viscosity).

On some trials, there are either secretions or prandial material (i.e., material that the patients failed to swallow) in the airway. There are four regions we have defined (medial vocal folds, lateral vocal folds, ventricular folds, and the laryngeal rim). For each region, two judges determined if there was nothing (no secretions or prandial material), a small amount of either secretions or prandial material, or a large amount of either secretions or prandial material.

So, I have an overall intercept and “random intercepts” for rater, patient, and material. (As I write this out, it occurs to me that dropping the intercepts for material is probably a good idea.) In a working version of the model, I have slopes for each region for each of secretions and prandial material. In the data prepared for this model, there are dummy coded absence (0) vs presence (1) predictors for secretions and prandial material for each region (i.e., “small amount” and “large amount” are collapsed into “presence”).

Here is a model for binary judgments (the rater either thinks there is material in the airway or not; there is an analogous model for visual analog scale judgments with the same data set):

data {
  int nobs; // number of observations
  int nreg; // number of regions
  int npat; // number of patients
  int nmat; // number of materials
  int nrat; // number of raters
  int<lower=0,upper=1> y[nobs]; // material-in-airway (mia) judgment: present = 1, absent = 0
  int r[nobs]; // rater index
  int p[nobs]; // patient index
  int m[nobs]; // material index
  row_vector[nreg] Xs[nobs]; // presence/absence secretion for each region
  row_vector[nreg] Xp[nobs]; // presence/absence prand mat for each region
}
parameters {
  real a_o; // overall intercept
  real a_p[npat]; // patient intercept
  real a_m[nmat]; // material intercept
  real a_r[nrat]; // rater intercept
  vector[nreg] b_s; // rater slopes by region
  vector[nreg] b_p; // rater slopes by region
  real<lower=0> sd_p; // patient int sd
}
model {
  // data model
  for(i in 1:nobs)
    y[i] ~ bernoulli_logit(a_o + a_p[p[i]] + a_m[m[i]] + a_r[r[i]] + Xs[i]*b_s + Xp[i]*b_p);
  // priors
  a_o ~ normal(0,2);
  a_p ~ normal(0,sd_p);
  a_m ~ normal(0,2);
  a_r ~ normal(0,2);
  b_s ~ normal(0,2);
  b_p ~ normal(0,2);
  sd_p ~ gamma(3,2);
}

I would like to incorporate the amount variable into this model, rather than just having absence vs presence, but I’m not sure how to do it. My intuition is that there will be identifiability issues with both of the two possible solutions I have come up with so far.

On the one hand, I could get rid of the absence/presence predictors and, for each region for each of secretions and prandial material, use indices for nothing, small amount, and large amount, with a simplex for each region and material type (implementing the monotonic predictor as in brms). If I do this, I am not sure what will happen with the “nothing” level of each simplex with respect to the overall intercept and the “random intercepts”. I (think I) want the “small amount” and “large amount” levels to indicate a deviation from the reference condition of “nothing”, but the intercepts and the “nothing” level seem to encode (partially) redundant information: depending on the sign of the slope parameters, an upward shift in the overall intercept could be offset by a reduction in the “nothing” level of the simplexes and/or the magnitude of the amount slope parameters, and vice versa.

On the other hand, I could keep the absence/presence variable in the model and multiply it by the amount simplexes and slope parameters, thereby forcing the “nothing” level of each simplex to contribute nothing. This doesn’t seem quite right, either, though I don’t have any strong intuitions about what might happen with this approach.
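To make these two options concrete, here is how the secretions contribution to the linear predictor might change under each. This is an untested sketch: `As[i, k]` is a hypothetical integer amount code (0 = nothing, 1 = small, 2 = large) for region k on observation i, `w_s[k]` is a hypothetical 3-element simplex per region, and `eta` stands for the sum of the intercept terms in the model above.

```stan
// Option 1: drop the 0/1 predictors; the 3-element simplex includes a
// "nothing" element w_s[k][1], which always contributes when b_s[k] != 0 --
// this is the redundancy with the intercepts described above.
for (k in 1:nreg)
  eta += b_s[k] * sum(w_s[k][1:(As[i, k] + 1)]);

// Option 2: keep the presence indicator Xs[i, k] and use it as a gate,
// forcing the contribution at the "nothing" level to be exactly zero.
for (k in 1:nreg)
  eta += Xs[i, k] * b_s[k] * sum(w_s[k][1:(As[i, k] + 1)]);
```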

I would love to hear any thoughts folks have about how to incorporate the amount variables into these models (and any other thoughts people have about any of this). Thanks.


Hi, sorry that your question was left unanswered - it is a good question!

What brms does is that for the first level of the monotonic covariate, the contribution to the total linear predictor is zero. In other words, if the monotonic covariate has N levels, then the related simplex has N - 1 elements, so you have N - 1 effective parameters: N - 2 to describe the simplex and one for the overall effect.

Here’s an excerpt from code generated by brms for a monotonic effect:

functions {
  /* compute monotonic effects
   * Args:
   *   scale: a simplex parameter
   *   i: index to sum over the simplex
   * Returns:
   *   a scalar between 0 and rows(scale)
   */
  real mo(vector scale, int i) {
    if (i == 0) {
      return 0;
    } else {
      return rows(scale) * sum(scale[1:i]);
    }
  }
}

parameters {
  vector[Ksp] bsp;  // special effects coefficients
  // simplexes of monotonic effects
  simplex[Jmo[1]] simo_1;
}

model {
  // (inside a loop over observations n)
  mu[n] += (bsp[1]) * mo(simo_1, Xmo_1[n]) + ...
}
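Applied to the model in your original post, a sketch might look like this. It is untested, and it assumes integer amount codes `As`/`Ap` (0 = nothing, 1 = small, 2 = large) for each region, replacing the 0/1 `Xs`/`Xp`, together with the `mo()` function from the excerpt above:

```stan
data {
  // ...existing declarations...
  int<lower=0,upper=2> As[nobs, nreg]; // secretion amount per region
  int<lower=0,upper=2> Ap[nobs, nreg]; // prandial material amount per region
}
parameters {
  // ...existing declarations...
  simplex[2] simo_s[nreg]; // one 2-element simplex per region for secretions
  simplex[2] simo_p[nreg]; // one per region for prandial material
}
model {
  for (i in 1:nobs) {
    real eta = a_o + a_p[p[i]] + a_m[m[i]] + a_r[r[i]];
    for (k in 1:nreg)
      eta += b_s[k] * mo(simo_s[k], As[i, k])
           + b_p[k] * mo(simo_p[k], Ap[i, k]);
    y[i] ~ bernoulli_logit(eta);
  }
  // priors as before, plus e.g. simo_s ~ dirichlet(rep_vector(1, 2));
}
```

Because `mo(scale, 0)` returns zero, the “nothing” level contributes nothing to the linear predictor, so there is no redundancy with the intercepts and no presence/absence gating is needed; each 2-element simplex has a single free parameter, the fraction of the total effect reached at “small amount”.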

Does that answer your question?

Best of luck with your model!