Monotonic effect sampling error

  • Operating System: OS X Catalina
  • brms Version: latest GitHub version

Question 1:
When I run this data.csv (18.1 KB), and convert f4f_rls

df$f4f_rls_oc ← factor(df$f4f_rls, levels=c(“1”, “2”), ordered=TRUE)

running this model doesn’t work:

brm(outcome ~ mo(f4f_rls_oc), family = bernoulli, data = df)

Compiling the C++ model
Start sampling
Error in new_CppObject_xp(fields$.module, fields$.pointer, …) :
Exception: model11ee364d8d835_49a9c3ccd6faf61c3570f864000a672c_namespace::model11ee364d8d835_49a9c3ccd6faf61c3570f864000a672c: Jmo[i_0__] is 1, but must be greater than or equal to 2 (in ‘model11ee364d8d835_49a9c3ccd6faf61c3570f864000a672c’ at line 26)
failed to create the sampler; sampling not done

The predictor f4f_rls_oc should be an ordered categorical variable "1" < "2". I’m a bit baffled so the question is where I’ve made a mistake?

Question 2
a) If I have ordered categorical predictors from 1-5 and 4 has not been used should I still use ordered categorical in five levels, i.e., factor(x, "1", "2", "3", "4", "5", ordered=TRUE)? My question is due to the rls variable above which is really 1-8, but the data has recorded only levels 1 and 2.
b) If yes, does the same apply when 5 has not been used above in a), i.e., should I still use factor(x, "1", "2", "3", "4", "5", ordered=TRUE)?

I’ve read Modeling monotonic effects of ordinal predictors in Bayesian regression models, and I can’t really see that i) it has to be >2 levels and, ii) how to deal with the cases of no responses on certain levels (i.e., it’s not missing data).

Sorry, I’m a bit new to this whole mo() stuff :)

The code brms generates is

// generated with brms 2.9.0
functions {

  /* compute monotonic effects
   * Args:
   *   scale: a simplex parameter
   *   i: index to sum over the simplex
   * Returns:
   *   a scalar between 0 and 1
   */
  real mo(vector scale, int i) {
    if (i == 0) {
      return 0;
    } else {
      return rows(scale) * sum(scale[1:i]);
    }
  }
}
data {
  int<lower=1> N;  // number of observations
  int Y[N];  // response variable
  int<lower=1> Ksp;  // number of special effects terms
  int<lower=1> Imo;  // number of monotonic variables
  int<lower=2> Jmo[Imo];  // length of simplexes
  // monotonic variables
  int Xmo_1[N];
  // prior concentration of monotonic simplexes
  vector[Jmo[1]] con_simo_1;
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
}
parameters {
  real temp_Intercept;  // temporary intercept
  // special effects coefficients
  vector[Ksp] bsp;
  // simplexes of monotonic effects
  simplex[Jmo[1]] simo_1;
}
transformed parameters {
}
model {
  vector[N] mu = temp_Intercept + rep_vector(0, N);
  for (n in 1:N) {
    mu[n] += (bsp[1]) * mo(simo_1, Xmo_1[n]);
  }
  // priors including all constants
  target += student_t_lpdf(temp_Intercept | 3, 0, 10);
  target += dirichlet_lpdf(simo_1 | con_simo_1);
  // likelihood including all constants
  if (!prior_only) {
    target += bernoulli_logit_lpmf(Y | mu);
  }
}
generated quantities {
  // actual population-level intercept
  real b_Intercept = temp_Intercept;
}

The error points to the line 24

  int<lower=2> Jmo[Imo];  // length of simplexes

The problem is that Jmo=1 but constraint says lower=2. This looks like oversight in brms; there is no reason to forbid a length 1 simplex.

Monotone effect does not mean increasing. It means either_ always increasing or always decreasing.
If there are three levels that means whenever the second level is higher than first then third level must be higher than second; whenever the second level is lower than first the third is lower than second.
But if there are only two levels then…it makes no difference.

I don’t know much about this either but I think missing levels are just a matter of priors. A priori the prediction is that every gap between levels is about the same. Adding an empty level between two levels makes the expected gap between the nonempty levels twice as big. The priors are probably quite weak though, so posterior shouldn’t depend too much on it.

So, a bug in brms?

Yes, I think so.

I’ll add it to GitHub then so @paul.buerkner is aware of this.

Thank you! It should be fixed now on github.

1 Like

Dear Paul, I had the same problem.

For a model I’ve been running with ordinal predictors, one of the ordinal predictor had zero cases for level 1 and the model error message suggests that I should use a dirichlet prior with one less parameters (Error: Invalid Dirichlet prior for the simplex of coefficient ‘moPM_feedback1’. Expected input of length 3).

This thread suggests that you have fixed the bug in your github version of brms. Is this also fixed on cran or would you recommend we use your latest github version?

Pls let me know if there was a systematic way for me to change priors or the dataset to handle ordinal variables that have no cases for some levels.

Thanks for all your effort in creating brms. It’s an immensely helpful package for folk like me that are new to bayesian data analyses.

Look forward to your response. Thanks.

Warm rgds
Sid

I would recommend using the latest github version. If that fixes your problem, I cannot tell exactly.