Sampling problems in ordinal model

In brms, I have recently implemented a feature to model different thresholds for different groups of observations. However, the model is not only terribly slow; it often doesn't run at all, soaks up multiple GB of RAM (effectively all of my RAM, even for small data), or crashes the R session right away. It is likely a stupid mistake on my part, but I don't see it. Also, I would expect that memory problems should not occur even if I had made a coding mistake. In any case, I would be grateful for feedback and ideas.

The simplified Stan code looks as follows:

functions {
  /* cumulative-logit log-PDF for a single response and merged thresholds
   * Args:
   *   y: response category
   *   mu: latent mean parameter
   *   thres: vector of merged ordinal thresholds
   *   j: start and end index for the applied threshold within 'thres'
   * Returns:
   *   a scalar to be added to the log posterior
   */
   real cumulative_logit_merged_lpmf(int y, real mu, vector thres, int[] j) {
     return ordered_logistic_lpmf(y | mu, thres[j[1]:j[2]]);
   }
}
data {
  int<lower=1> N;  // number of observations
  int Y[N];  // response variable
  int<lower=1> ngrthres;  // number of threshold groups
  int<lower=1> nthres[ngrthres];  // number of thresholds
  int<lower=1> Jthres[N, 2];  // threshold indices
}
transformed data {
  int<lower=1> nmthres = prod(nthres);  // total number of thresholds
  int<lower=1> Kthres_start[ngrthres];  // start index per threshold group
  int<lower=1> Kthres_end[ngrthres];  // end index per threshold group
  Kthres_start[1] = 1;
  Kthres_end[1] = nthres[1];
  for (i in 2:ngrthres) {
    Kthres_start[i] = Kthres_end[i-1] + 1;
    Kthres_end[i] = Kthres_end[i-1] + nthres[i];
  }
}
parameters {
  ordered[nthres[1]] Intercept_1;  // temporary thresholds for centered predictors
  ordered[nthres[2]] Intercept_2;  // temporary thresholds for centered predictors
  ordered[nthres[3]] Intercept_3;  // temporary thresholds for centered predictors
  ordered[nthres[4]] Intercept_4;  // temporary thresholds for centered predictors
  ordered[nthres[5]] Intercept_5;  // temporary thresholds for centered predictors
  ordered[nthres[6]] Intercept_6;  // temporary thresholds for centered predictors
  ordered[nthres[7]] Intercept_7;  // temporary thresholds for centered predictors
  ordered[nthres[8]] Intercept_8;  // temporary thresholds for centered predictors
  ordered[nthres[9]] Intercept_9;  // temporary thresholds for centered predictors
  ordered[nthres[10]] Intercept_10;  // temporary thresholds for centered predictors
  ordered[nthres[11]] Intercept_11;  // temporary thresholds for centered predictors
  ordered[nthres[12]] Intercept_12;  // temporary thresholds for centered predictors
  ordered[nthres[13]] Intercept_13;  // temporary thresholds for centered predictors
  ordered[nthres[14]] Intercept_14;  // temporary thresholds for centered predictors
  ordered[nthres[15]] Intercept_15;  // temporary thresholds for centered predictors
  ordered[nthres[16]] Intercept_16;  // temporary thresholds for centered predictors
  ordered[nthres[17]] Intercept_17;  // temporary thresholds for centered predictors
  ordered[nthres[18]] Intercept_18;  // temporary thresholds for centered predictors
  ordered[nthres[19]] Intercept_19;  // temporary thresholds for centered predictors
  ordered[nthres[20]] Intercept_20;  // temporary thresholds for centered predictors
}
transformed parameters {
}
model {
  vector[nmthres] merged_Intercept;  // merged thresholds
  // initialize linear predictor term
  vector[N] mu = rep_vector(0, N);
  merged_Intercept[Kthres_start[1]:Kthres_end[1]] = Intercept_1;
  merged_Intercept[Kthres_start[2]:Kthres_end[2]] = Intercept_2;
  merged_Intercept[Kthres_start[3]:Kthres_end[3]] = Intercept_3;
  merged_Intercept[Kthres_start[4]:Kthres_end[4]] = Intercept_4;
  merged_Intercept[Kthres_start[5]:Kthres_end[5]] = Intercept_5;
  merged_Intercept[Kthres_start[6]:Kthres_end[6]] = Intercept_6;
  merged_Intercept[Kthres_start[7]:Kthres_end[7]] = Intercept_7;
  merged_Intercept[Kthres_start[8]:Kthres_end[8]] = Intercept_8;
  merged_Intercept[Kthres_start[9]:Kthres_end[9]] = Intercept_9;
  merged_Intercept[Kthres_start[10]:Kthres_end[10]] = Intercept_10;
  merged_Intercept[Kthres_start[11]:Kthres_end[11]] = Intercept_11;
  merged_Intercept[Kthres_start[12]:Kthres_end[12]] = Intercept_12;
  merged_Intercept[Kthres_start[13]:Kthres_end[13]] = Intercept_13;
  merged_Intercept[Kthres_start[14]:Kthres_end[14]] = Intercept_14;
  merged_Intercept[Kthres_start[15]:Kthres_end[15]] = Intercept_15;
  merged_Intercept[Kthres_start[16]:Kthres_end[16]] = Intercept_16;
  merged_Intercept[Kthres_start[17]:Kthres_end[17]] = Intercept_17;
  merged_Intercept[Kthres_start[18]:Kthres_end[18]] = Intercept_18;
  merged_Intercept[Kthres_start[19]:Kthres_end[19]] = Intercept_19;
  merged_Intercept[Kthres_start[20]:Kthres_end[20]] = Intercept_20;
  // priors including all constants
  target += normal_lpdf(Intercept_1 | 0, 1);
  target += normal_lpdf(Intercept_2 | 0, 1);
  target += normal_lpdf(Intercept_3 | 0, 1);
  target += normal_lpdf(Intercept_4 | 0, 1);
  target += normal_lpdf(Intercept_5 | 0, 1);
  target += normal_lpdf(Intercept_6 | 0, 1);
  target += normal_lpdf(Intercept_7 | 0, 1);
  target += normal_lpdf(Intercept_8 | 0, 1);
  target += normal_lpdf(Intercept_9 | 0, 1);
  target += normal_lpdf(Intercept_10 | 0, 1);
  target += normal_lpdf(Intercept_11 | 0, 1);
  target += normal_lpdf(Intercept_12 | 0, 1);
  target += normal_lpdf(Intercept_13 | 0, 1);
  target += normal_lpdf(Intercept_14 | 0, 1);
  target += normal_lpdf(Intercept_15 | 0, 1);
  target += normal_lpdf(Intercept_16 | 0, 1);
  target += normal_lpdf(Intercept_17 | 0, 1);
  target += normal_lpdf(Intercept_18 | 0, 1);
  target += normal_lpdf(Intercept_19 | 0, 1);
  target += normal_lpdf(Intercept_20 | 0, 1);
  // likelihood including all constants
  for (n in 1:N) {
    target += cumulative_logit_merged_lpmf(Y[n] | mu[n], merged_Intercept, Jthres[n]);
  }
}
generated quantities {
  // compute actual thresholds
  vector[nthres[1]] b_Intercept_1 = Intercept_1;
  vector[nthres[2]] b_Intercept_2 = Intercept_2;
  vector[nthres[3]] b_Intercept_3 = Intercept_3;
  vector[nthres[4]] b_Intercept_4 = Intercept_4;
  vector[nthres[5]] b_Intercept_5 = Intercept_5;
  vector[nthres[6]] b_Intercept_6 = Intercept_6;
  vector[nthres[7]] b_Intercept_7 = Intercept_7;
  vector[nthres[8]] b_Intercept_8 = Intercept_8;
  vector[nthres[9]] b_Intercept_9 = Intercept_9;
  vector[nthres[10]] b_Intercept_10 = Intercept_10;
  vector[nthres[11]] b_Intercept_11 = Intercept_11;
  vector[nthres[12]] b_Intercept_12 = Intercept_12;
  vector[nthres[13]] b_Intercept_13 = Intercept_13;
  vector[nthres[14]] b_Intercept_14 = Intercept_14;
  vector[nthres[15]] b_Intercept_15 = Intercept_15;
  vector[nthres[16]] b_Intercept_16 = Intercept_16;
  vector[nthres[17]] b_Intercept_17 = Intercept_17;
  vector[nthres[18]] b_Intercept_18 = Intercept_18;
  vector[nthres[19]] b_Intercept_19 = Intercept_19;
  vector[nthres[20]] b_Intercept_20 = Intercept_20;
}
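
The index bookkeeping in `transformed data` amounts to a running cumulative sum over the group sizes. A quick Python sketch with made-up group sizes (the values are illustrative, not taken from the model):

```python
# Mimic the Kthres_start / Kthres_end bookkeeping from transformed data,
# using hypothetical group sizes for illustration.
nthres = [5, 3, 4]           # thresholds per group (made-up example)
starts, ends = [], []
pos = 1                      # Stan arrays are 1-based
for n in nthres:
    starts.append(pos)       # first index of this group's slice
    ends.append(pos + n - 1) # last index of this group's slice
    pos += n
print(starts)  # [1, 6, 9]
print(ends)    # [5, 8, 12]
```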

The data can be generated and the model run as follows:

library(tidyverse)
library(rstan)
library(brms)
set.seed(12345)
nyears <- 20
norm <- rep(rnorm(100, mean = 0, sd = 2), nyears)
year <- rep(1:nyears, each = 100)
trend <- rep(cos(unique(year)/3), each = 100)
trust <- norm + trend
plot(year, trust)
plot(year, trend)
data <- cbind(norm, year, trend, trust) %>%
  data.frame() %>%
  mutate(
    trust_ord = cut(trust, c(-20,-2,-1,0,1,2,20), ordered_result = TRUE),
    year_f = factor(year)
  )

sdata <- make_standata(
  trust_ord | thres(5, year_f) ~ 1,
  data = data,
  family = cumulative(),
  prior = prior(normal(0, 1), class = "Intercept"),
  chains = 1
)

scode <- "
functions {
  /* cumulative-logit log-PDF for a single response and merged thresholds
   * Args:
   *   y: response category
   *   mu: latent mean parameter
   *   thres: vector of merged ordinal thresholds
   *   j: start and end index for the applied threshold within 'thres'
   * Returns:
   *   a scalar to be added to the log posterior
   */
   real cumulative_logit_merged_lpmf(int y, real mu, vector thres, int[] j) {
     return ordered_logistic_lpmf(y | mu, thres[j[1]:j[2]]);
   }
}
data {
  int<lower=1> N;  // number of observations
  int Y[N];  // response variable
  int<lower=1> ngrthres;  // number of threshold groups
  int<lower=1> nthres[ngrthres];  // number of thresholds
  int<lower=1> Jthres[N, 2];  // threshold indices
}
transformed data {
  int<lower=1> nmthres = prod(nthres);  // total number of thresholds
  int<lower=1> Kthres_start[ngrthres];  // start index per threshold group
  int<lower=1> Kthres_end[ngrthres];  // end index per threshold group
  Kthres_start[1] = 1;
  Kthres_end[1] = nthres[1];
  for (i in 2:ngrthres) {
    Kthres_start[i] = Kthres_end[i-1] + 1;
    Kthres_end[i] = Kthres_end[i-1] + nthres[i];
  }
}
parameters {
  ordered[nthres[1]] Intercept_1;  // temporary thresholds for centered predictors
  ordered[nthres[2]] Intercept_2;  // temporary thresholds for centered predictors
  ordered[nthres[3]] Intercept_3;  // temporary thresholds for centered predictors
  ordered[nthres[4]] Intercept_4;  // temporary thresholds for centered predictors
  ordered[nthres[5]] Intercept_5;  // temporary thresholds for centered predictors
  ordered[nthres[6]] Intercept_6;  // temporary thresholds for centered predictors
  ordered[nthres[7]] Intercept_7;  // temporary thresholds for centered predictors
  ordered[nthres[8]] Intercept_8;  // temporary thresholds for centered predictors
  ordered[nthres[9]] Intercept_9;  // temporary thresholds for centered predictors
  ordered[nthres[10]] Intercept_10;  // temporary thresholds for centered predictors
  ordered[nthres[11]] Intercept_11;  // temporary thresholds for centered predictors
  ordered[nthres[12]] Intercept_12;  // temporary thresholds for centered predictors
  ordered[nthres[13]] Intercept_13;  // temporary thresholds for centered predictors
  ordered[nthres[14]] Intercept_14;  // temporary thresholds for centered predictors
  ordered[nthres[15]] Intercept_15;  // temporary thresholds for centered predictors
  ordered[nthres[16]] Intercept_16;  // temporary thresholds for centered predictors
  ordered[nthres[17]] Intercept_17;  // temporary thresholds for centered predictors
  ordered[nthres[18]] Intercept_18;  // temporary thresholds for centered predictors
  ordered[nthres[19]] Intercept_19;  // temporary thresholds for centered predictors
  ordered[nthres[20]] Intercept_20;  // temporary thresholds for centered predictors
}
transformed parameters {
}
model {
  vector[nmthres] merged_Intercept;  // merged thresholds
  // initialize linear predictor term
  vector[N] mu = rep_vector(0, N);
  merged_Intercept[Kthres_start[1]:Kthres_end[1]] = Intercept_1;
  merged_Intercept[Kthres_start[2]:Kthres_end[2]] = Intercept_2;
  merged_Intercept[Kthres_start[3]:Kthres_end[3]] = Intercept_3;
  merged_Intercept[Kthres_start[4]:Kthres_end[4]] = Intercept_4;
  merged_Intercept[Kthres_start[5]:Kthres_end[5]] = Intercept_5;
  merged_Intercept[Kthres_start[6]:Kthres_end[6]] = Intercept_6;
  merged_Intercept[Kthres_start[7]:Kthres_end[7]] = Intercept_7;
  merged_Intercept[Kthres_start[8]:Kthres_end[8]] = Intercept_8;
  merged_Intercept[Kthres_start[9]:Kthres_end[9]] = Intercept_9;
  merged_Intercept[Kthres_start[10]:Kthres_end[10]] = Intercept_10;
  merged_Intercept[Kthres_start[11]:Kthres_end[11]] = Intercept_11;
  merged_Intercept[Kthres_start[12]:Kthres_end[12]] = Intercept_12;
  merged_Intercept[Kthres_start[13]:Kthres_end[13]] = Intercept_13;
  merged_Intercept[Kthres_start[14]:Kthres_end[14]] = Intercept_14;
  merged_Intercept[Kthres_start[15]:Kthres_end[15]] = Intercept_15;
  merged_Intercept[Kthres_start[16]:Kthres_end[16]] = Intercept_16;
  merged_Intercept[Kthres_start[17]:Kthres_end[17]] = Intercept_17;
  merged_Intercept[Kthres_start[18]:Kthres_end[18]] = Intercept_18;
  merged_Intercept[Kthres_start[19]:Kthres_end[19]] = Intercept_19;
  merged_Intercept[Kthres_start[20]:Kthres_end[20]] = Intercept_20;
  // priors including all constants
  target += normal_lpdf(Intercept_1 | 0, 1);
  target += normal_lpdf(Intercept_2 | 0, 1);
  target += normal_lpdf(Intercept_3 | 0, 1);
  target += normal_lpdf(Intercept_4 | 0, 1);
  target += normal_lpdf(Intercept_5 | 0, 1);
  target += normal_lpdf(Intercept_6 | 0, 1);
  target += normal_lpdf(Intercept_7 | 0, 1);
  target += normal_lpdf(Intercept_8 | 0, 1);
  target += normal_lpdf(Intercept_9 | 0, 1);
  target += normal_lpdf(Intercept_10 | 0, 1);
  target += normal_lpdf(Intercept_11 | 0, 1);
  target += normal_lpdf(Intercept_12 | 0, 1);
  target += normal_lpdf(Intercept_13 | 0, 1);
  target += normal_lpdf(Intercept_14 | 0, 1);
  target += normal_lpdf(Intercept_15 | 0, 1);
  target += normal_lpdf(Intercept_16 | 0, 1);
  target += normal_lpdf(Intercept_17 | 0, 1);
  target += normal_lpdf(Intercept_18 | 0, 1);
  target += normal_lpdf(Intercept_19 | 0, 1);
  target += normal_lpdf(Intercept_20 | 0, 1);
  // likelihood including all constants
  for (n in 1:N) {
    target += cumulative_logit_merged_lpmf(Y[n] | mu[n], merged_Intercept, Jthres[n]);
  }
}
generated quantities {
  // compute actual thresholds
  vector[nthres[1]] b_Intercept_1 = Intercept_1;
  vector[nthres[2]] b_Intercept_2 = Intercept_2;
  vector[nthres[3]] b_Intercept_3 = Intercept_3;
  vector[nthres[4]] b_Intercept_4 = Intercept_4;
  vector[nthres[5]] b_Intercept_5 = Intercept_5;
  vector[nthres[6]] b_Intercept_6 = Intercept_6;
  vector[nthres[7]] b_Intercept_7 = Intercept_7;
  vector[nthres[8]] b_Intercept_8 = Intercept_8;
  vector[nthres[9]] b_Intercept_9 = Intercept_9;
  vector[nthres[10]] b_Intercept_10 = Intercept_10;
  vector[nthres[11]] b_Intercept_11 = Intercept_11;
  vector[nthres[12]] b_Intercept_12 = Intercept_12;
  vector[nthres[13]] b_Intercept_13 = Intercept_13;
  vector[nthres[14]] b_Intercept_14 = Intercept_14;
  vector[nthres[15]] b_Intercept_15 = Intercept_15;
  vector[nthres[16]] b_Intercept_16 = Intercept_16;
  vector[nthres[17]] b_Intercept_17 = Intercept_17;
  vector[nthres[18]] b_Intercept_18 = Intercept_18;
  vector[nthres[19]] b_Intercept_19 = Intercept_19;
  vector[nthres[20]] b_Intercept_20 = Intercept_20;
}
"

fit <- stan(model_code = scode, data = sdata)

Are you running rstan 2.19.2?

That version has a bug in the vectorized ordered_logistic_lpmf. There is a temporary solution in the second post.

That seems like the first thing to look at, though I am not sure whether it could be the culprit for the RAM usage.

Just to add more information: @paul.buerkner and I are working on this problem together. When I run the model Paul posted, I get the following output:

SAMPLING FOR MODEL '6f84b05902aa94352b63e6276a5d1f49' NOW (CHAIN 1).
Chain 1: Unrecoverable error evaluating the log probability at the initial value.
Chain 1: Exception: std::bad_alloc  (in 'model24f4832c8f_6f84b05902aa94352b63e6276a5d1f49' at line 59)
 [origin: bad_alloc]
[1] "Error in sampler$call_sampler(args_list[[i]]) : "                                               
[2] "  Exception: std::bad_alloc  (in 'model24f4832c8f_6f84b05902aa94352b63e6276a5d1f49' at line 59)"
[3] " [origin: bad_alloc]"                                                                           
[1] "error occurred during calling the sampler; sampling not done"

We have rstan 2.19.3, and the error also occurs with a hand-coded cumulative_logit_lpmf, so the vectorized ordered_logistic_lpmf is not the problem.

// generated with brms 2.12.1
functions {
  /* cumulative-logit log-PDF for a single response
   * Args:
   *   y: response category
   *   mu: latent mean parameter
   *   disc: discrimination parameter
   *   thres: ordinal thresholds
   * Returns:
   *   a scalar to be added to the log posterior
   */
   real cumulative_logit_lpmf(int y, real mu, real disc, vector thres) {
     int nthres = num_elements(thres);
     real p;
     if (y == 1) {
       p = inv_logit(disc * (thres[1] - mu));
     } else if (y == nthres + 1) {
       p = 1 - inv_logit(disc * (thres[nthres] - mu));
     } else {
       p = inv_logit(disc * (thres[y] - mu)) -
           inv_logit(disc * (thres[y - 1] - mu));
     }
     return log(p);
   }
  /* cumulative-logit log-PDF for a single response and merged thresholds
   * Args:
   *   y: response category
   *   mu: latent mean parameter
   *   disc: discrimination parameter
   *   thres: vector of merged ordinal thresholds
   *   j: start and end index for the applied threshold within 'thres'
   * Returns:
   *   a scalar to be added to the log posterior
   */
   real cumulative_logit_merged_lpmf(int y, real mu, real disc, vector thres, int[] j) {
     return cumulative_logit_lpmf(y | mu, disc, thres[j[1]:j[2]]);
   }
}
data {
  int<lower=1> N;  // number of observations
  int Y[N];  // response variable
  int<lower=1> ngrthres;  // number of threshold groups
  int<lower=1> nthres[ngrthres];  // number of thresholds
  int<lower=1> Jthres[N, 2];  // threshold indices
  real<lower=0> disc;  // discrimination parameters
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
  int<lower=1> nmthres = prod(nthres);  // total number of thresholds
  int<lower=1> Kthres_start[ngrthres];  // start index per threshold group
  int<lower=1> Kthres_end[ngrthres];  // end index per threshold group
  Kthres_start[1] = 1;
  Kthres_end[1] = nthres[1];
  for (i in 2:ngrthres) {
    Kthres_start[i] = Kthres_end[i-1] + 1;
    Kthres_end[i] = Kthres_end[i-1] + nthres[i];
  }
}
parameters {
  ordered[nthres[1]] Intercept_1;  // temporary thresholds for centered predictors
  ordered[nthres[2]] Intercept_2;  // temporary thresholds for centered predictors
  ordered[nthres[3]] Intercept_3;  // temporary thresholds for centered predictors
  ordered[nthres[4]] Intercept_4;  // temporary thresholds for centered predictors
  ordered[nthres[5]] Intercept_5;  // temporary thresholds for centered predictors
  ordered[nthres[6]] Intercept_6;  // temporary thresholds for centered predictors
  ordered[nthres[7]] Intercept_7;  // temporary thresholds for centered predictors
  ordered[nthres[8]] Intercept_8;  // temporary thresholds for centered predictors
  ordered[nthres[9]] Intercept_9;  // temporary thresholds for centered predictors
  ordered[nthres[10]] Intercept_10;  // temporary thresholds for centered predictors
  ordered[nthres[11]] Intercept_11;  // temporary thresholds for centered predictors
  ordered[nthres[12]] Intercept_12;  // temporary thresholds for centered predictors
  ordered[nthres[13]] Intercept_13;  // temporary thresholds for centered predictors
  ordered[nthres[14]] Intercept_14;  // temporary thresholds for centered predictors
  ordered[nthres[15]] Intercept_15;  // temporary thresholds for centered predictors
  ordered[nthres[16]] Intercept_16;  // temporary thresholds for centered predictors
  ordered[nthres[17]] Intercept_17;  // temporary thresholds for centered predictors
  ordered[nthres[18]] Intercept_18;  // temporary thresholds for centered predictors
  ordered[nthres[19]] Intercept_19;  // temporary thresholds for centered predictors
  ordered[nthres[20]] Intercept_20;  // temporary thresholds for centered predictors
}
transformed parameters {
}
model {
  vector[nmthres] merged_Intercept;  // merged thresholds
  // initialize linear predictor term
  vector[N] mu = rep_vector(0, N);
  merged_Intercept[Kthres_start[1]:Kthres_end[1]] = Intercept_1;
  merged_Intercept[Kthres_start[2]:Kthres_end[2]] = Intercept_2;
  merged_Intercept[Kthres_start[3]:Kthres_end[3]] = Intercept_3;
  merged_Intercept[Kthres_start[4]:Kthres_end[4]] = Intercept_4;
  merged_Intercept[Kthres_start[5]:Kthres_end[5]] = Intercept_5;
  merged_Intercept[Kthres_start[6]:Kthres_end[6]] = Intercept_6;
  merged_Intercept[Kthres_start[7]:Kthres_end[7]] = Intercept_7;
  merged_Intercept[Kthres_start[8]:Kthres_end[8]] = Intercept_8;
  merged_Intercept[Kthres_start[9]:Kthres_end[9]] = Intercept_9;
  merged_Intercept[Kthres_start[10]:Kthres_end[10]] = Intercept_10;
  merged_Intercept[Kthres_start[11]:Kthres_end[11]] = Intercept_11;
  merged_Intercept[Kthres_start[12]:Kthres_end[12]] = Intercept_12;
  merged_Intercept[Kthres_start[13]:Kthres_end[13]] = Intercept_13;
  merged_Intercept[Kthres_start[14]:Kthres_end[14]] = Intercept_14;
  merged_Intercept[Kthres_start[15]:Kthres_end[15]] = Intercept_15;
  merged_Intercept[Kthres_start[16]:Kthres_end[16]] = Intercept_16;
  merged_Intercept[Kthres_start[17]:Kthres_end[17]] = Intercept_17;
  merged_Intercept[Kthres_start[18]:Kthres_end[18]] = Intercept_18;
  merged_Intercept[Kthres_start[19]:Kthres_end[19]] = Intercept_19;
  merged_Intercept[Kthres_start[20]:Kthres_end[20]] = Intercept_20;
  // priors including all constants
  target += normal_lpdf(Intercept_1 | 0, 1);
  target += normal_lpdf(Intercept_2 | 0, 1);
  target += normal_lpdf(Intercept_3 | 0, 1);
  target += normal_lpdf(Intercept_4 | 0, 1);
  target += normal_lpdf(Intercept_5 | 0, 1);
  target += normal_lpdf(Intercept_6 | 0, 1);
  target += normal_lpdf(Intercept_7 | 0, 1);
  target += normal_lpdf(Intercept_8 | 0, 1);
  target += normal_lpdf(Intercept_9 | 0, 1);
  target += normal_lpdf(Intercept_10 | 0, 1);
  target += normal_lpdf(Intercept_11 | 0, 1);
  target += normal_lpdf(Intercept_12 | 0, 1);
  target += normal_lpdf(Intercept_13 | 0, 1);
  target += normal_lpdf(Intercept_14 | 0, 1);
  target += normal_lpdf(Intercept_15 | 0, 1);
  target += normal_lpdf(Intercept_16 | 0, 1);
  target += normal_lpdf(Intercept_17 | 0, 1);
  target += normal_lpdf(Intercept_18 | 0, 1);
  target += normal_lpdf(Intercept_19 | 0, 1);
  target += normal_lpdf(Intercept_20 | 0, 1);
  // likelihood including all constants
  if (!prior_only) {
    for (n in 1:N) {
      target += cumulative_logit_merged_lpmf(Y[n] | mu[n], disc, merged_Intercept, Jthres[n]);
    }
  }
}
generated quantities {
  // compute actual thresholds
  vector[nthres[1]] b_Intercept_1 = Intercept_1;
  vector[nthres[2]] b_Intercept_2 = Intercept_2;
  vector[nthres[3]] b_Intercept_3 = Intercept_3;
  vector[nthres[4]] b_Intercept_4 = Intercept_4;
  vector[nthres[5]] b_Intercept_5 = Intercept_5;
  vector[nthres[6]] b_Intercept_6 = Intercept_6;
  vector[nthres[7]] b_Intercept_7 = Intercept_7;
  vector[nthres[8]] b_Intercept_8 = Intercept_8;
  vector[nthres[9]] b_Intercept_9 = Intercept_9;
  vector[nthres[10]] b_Intercept_10 = Intercept_10;
  vector[nthres[11]] b_Intercept_11 = Intercept_11;
  vector[nthres[12]] b_Intercept_12 = Intercept_12;
  vector[nthres[13]] b_Intercept_13 = Intercept_13;
  vector[nthres[14]] b_Intercept_14 = Intercept_14;
  vector[nthres[15]] b_Intercept_15 = Intercept_15;
  vector[nthres[16]] b_Intercept_16 = Intercept_16;
  vector[nthres[17]] b_Intercept_17 = Intercept_17;
  vector[nthres[18]] b_Intercept_18 = Intercept_18;
  vector[nthres[19]] b_Intercept_19 = Intercept_19;
  vector[nthres[20]] b_Intercept_20 = Intercept_20;
}
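
For reference, the hand-coded cumulative_logit_lpmf above computes each category probability as a difference of inverse-logit CDF values. A Python sketch of the same probability computation, with illustrative inputs (the thresholds and mu here are made up):

```python
import math

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def cumulative_logit_pmf(y, mu, disc, thres):
    """P(Y == y) under a cumulative-logit model; mirrors the Stan helper above."""
    K = len(thres)               # K thresholds define K + 1 categories
    if y == 1:
        return inv_logit(disc * (thres[0] - mu))
    if y == K + 1:
        return 1.0 - inv_logit(disc * (thres[K - 1] - mu))
    return (inv_logit(disc * (thres[y - 1] - mu))
            - inv_logit(disc * (thres[y - 2] - mu)))

# Hypothetical inputs: three thresholds, latent mean 0, discrimination 1.
probs = [cumulative_logit_pmf(y, 0.0, 1.0, [-1.0, 0.0, 1.0]) for y in (1, 2, 3, 4)]
print(probs)
print(sum(probs))  # sums to 1 (up to rounding)
```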

I am more and more convinced that this is likely a bug in rstan or at a deeper level.
Tagging @bgoodri and @mitzimorris.

Our machines use R 3.6.x with the latest CRAN versions of rstan and StanHeaders on Windows 10 with Rtools 3.5.

It's definitely not an rstan issue, as the model uses up all of my 16 GB of RAM within 5 seconds even with CmdStan via cmdstanr:

library(cmdstanr)
set_cmdstan_path("~/Desktop/cmdstan/")
file <- file.path("ordinal_issue.stan")
mod <- cmdstan_model(file)
d = list(
  N = sdata$N,
  Y = sdata$Y,
  ngrthres = sdata$ngrthres,
  nthres = sdata$nthres,
  Jthres = sdata$Jthres
)
fit <- mod$sample(
  data = d,
  num_chains = 1
)

Actually, it's an issue with the model.

int<lower=1> nmthres = prod(nthres);

nthres is [5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5], so prod(nthres) is 5^20 ≈ 9.536743e+13, which doesn't fit in a 32-bit int and overflows to 1977800241, and a vector of that size fills up the RAM.
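
The overflow can be verified directly. A Python sketch emulating Stan's 32-bit signed int (assuming simple wrap-around on overflow):

```python
# 5^20 does not fit in a signed 32-bit integer; emulate the wrap-around.
total = 5 ** 20
print(total)            # 95367431640625
wrapped = total % 2 ** 32
if wrapped >= 2 ** 31:  # map into the signed 32-bit range
    wrapped -= 2 ** 32
print(wrapped)          # 1977800241, the value the model actually ends up with
print(sum([5] * 20))    # 100, the total the "total number of thresholds" comment describes
```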


Oh boy, you are right. This is an embarrassing error. Thanks for catching it! I will fix it ASAP.