Yesterday I upgraded brms and rstan, and my ordinal regression models stopped to work.
Here a reproducible example:
data("stemcell", package = "brglm2")
fit_sc1 = brm(formula = research ~ 1 + religion, data = stemcell, family = cumulative("probit"))
Here’s the output:
Compiling Stan program...
Start sampling
SAMPLING FOR MODEL '7c133df9b554e766be6089f9fd09b46b' NOW (CHAIN 1).
Chain 1:
Chain 1: Gradient evaluation took 3.5e-05 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.35 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1:
Chain 1:
Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
Selection:
and that is: “Selection” stands for when R crashes, but the classical options are not printed…
No debug, no hints… I tried on another machine without the upgrade and all runs smoothly.
Any help is appreciated
Thanks for reporting this. It seems to run fine on my mac with the latest CRAN versions of brms and rstan, but I see you’re on linux so maybe this is OS specific.
Thanks @jonah!
Is there AFAYK a way to have some log? Or, are you so gentle, to paste here the stancode(fit_sc1) produced by brms, so I can try it with rstan or cmdstan and see if I get more hints on what is happening?
Yeah no problem. Below I’ve pasted the Stan code brms generates.
Another thing you can try is specifying brm(..., backend = "cmdstanr") and, if you have CmdStanR installed, it will try running the model using that instead of RStan. (Of course you can also just use the Stan code directly with CmdStanR yourself, but the new backend argument that @paul.buerkner added is pretty nice!)
Here’s the Stan code:
// generated with brms 2.13.3
functions {
/* cumulative-probit log-PDF for a single response
* Args:
* y: response category
* mu: latent mean parameter
* disc: discrimination parameter
* thres: ordinal thresholds
* Returns:
* a scalar to be added to the log posterior
*/
real cumulative_probit_lpmf(int y, real mu, real disc, vector thres) {
int nthres = num_elements(thres);
real p;
if (y == 1) {
p = Phi(disc * (thres[1] - mu));
} else if (y == nthres + 1) {
p = 1 - Phi(disc * (thres[nthres] - mu));
} else {
p = Phi(disc * (thres[y] - mu)) -
Phi(disc * (thres[y - 1] - mu));
}
return log(p);
}
/* cumulative-probit log-PDF for a single response and merged thresholds
* Args:
* y: response category
* mu: latent mean parameter
* disc: discrimination parameter
* thres: vector of merged ordinal thresholds
* j: start and end index for the applid threshold within 'thres'
* Returns:
* a scalar to be added to the log posterior
*/
real cumulative_probit_merged_lpmf(int y, real mu, real disc, vector thres, int[] j) {
return cumulative_probit_lpmf(y | mu, disc, thres[j[1]:j[2]]);
}
}
data {
int<lower=1> N; // number of observations
int Y[N]; // response variable
int<lower=2> nthres; // number of thresholds
int<lower=1> K; // number of population-level effects
matrix[N, K] X; // population-level design matrix
int prior_only; // should the likelihood be ignored?
}
transformed data {
int Kc = K;
matrix[N, Kc] Xc; // centered version of X
vector[Kc] means_X; // column means of X before centering
for (i in 1:K) {
means_X[i] = mean(X[, i]);
Xc[, i] = X[, i] - means_X[i];
}
}
parameters {
vector[Kc] b; // population-level effects
ordered[nthres] Intercept; // temporary thresholds for centered predictors
}
transformed parameters {
real<lower=0> disc = 1; // discrimination parameters
}
model {
// initialize linear predictor term
vector[N] mu = Xc * b;
// priors including all constants
target += student_t_lpdf(Intercept | 3, 0, 2.5);
// likelihood including all constants
if (!prior_only) {
for (n in 1:N) {
target += cumulative_probit_lpmf(Y[n] | mu[n], disc, Intercept);
}
}
}
generated quantities {
// compute actual thresholds
vector[nthres] b_Intercept = Intercept + dot_product(means_X, b);
}
I place here the solution for those who will have my same problem, or for myself, given the decline of my short-term memory.
The solution is to use cmdstanr as a backend (so the problem, i think is rstan):
data("stemcell", package = "brglm2")
fit_sc1 = brm(formula = research ~ 1 + religion, data = stemcell, family = cumulative("probit"), backend = "cmdstanr")
@jonah, @paul.buerkner if you think it’s a good idea to get our boots dirty and understand the origin of the problem, I’m happy to help.
Thanks! I think it’s worth looking into this further but maybe not right away since I think there will be another rstan release soon and maybe that will fix this. If it’s still broken after that then we should investigate this further.
@bgoodri not sure if it’s helpful but tagging you in case it’s useful for you to have an example of rstan crashing R on linux.