Passing a stan formula to brm() using cmdstanr as backend

I have a brms implementation of a weighted bayes model (integrating information from different sources) relying on a stan function, which runs without any issue using the default backend.
However, when I try to change backend to cmdstanr, I get an error: "Error in formula.default(object, env = baseenv()) : invalid formula.

Data: https://www.dropbox.com/s/nekgayjpugs8z51/d.csv?dl=0

library(brms)

F_stancode = "
real F(real a_raw, real L_raw, real w_raw) {
    real a;
    real L;
    real w;
    a = exp(a_raw);
    L = exp(L_raw * a);
    w = 0.5 + inv_logit(w_raw)/2;
    return log((w * L + 1 - w)./((1 - w) * L + w));
}
"

m_f <- bf(RedChoice ~ bias + 
                    F(0, lSelf, wSelf) + 
                    F(0, lOthers1, wOthers + bConfidence * Confidence1) + 
                    F(0, lOthers2, wOthers + bConfidence * Confidence2) + 
                    F(0, lOthers3, wOthers + bConfidence * Confidence3) + 
                    F(0, lOthers4, wOthers + bConfidence * Confidence4),
                wSelf + wOthers + bConfidence + bias ~ 1,
                nl = TRUE)

m_prior <- c(prior(normal(2,1), nlpar = "bConfidence"),
                   prior(normal(0,1), nlpar = "wSelf", coef = "Intercept"),
                   prior(normal(0,1), nlpar = "wOthers", coef = "Intercept"),
                   prior(normal(0,.1), nlpar = "bias"))

m <- brm(m_f,
                               d,
                               stan_funs = F_stancode,
                               prior = m_prior,
                               sample_prior = TRUE,
                               family = "bernoulli",
                               chains = 2, 
                               cores = 2,
                               #backend = "cmdstanr", # uncomment to see the error
                               #threads = threading(2),
                               control = list(adapt_delta = 0.99,
                                              max_treedepth=20))

The issue might be in me using stan_fun(), instead of stanvar(), but I get an error when trying to convert to stanvar() even when using the default rstan backend.

  • Operating System: MacOS Catalina
  • brms Version: 2.14.2

Update: thanks to Pavel Logacev I’ve converted the code to use stanvars. This works with the default backend, but not with cmdstanr:

m <- brm(m_f,
                               d,
                               stanvars = stanvar(scode = F_stancode, block = "functions"),
                               prior = m_prior,
                               sample_prior = TRUE,
                               family = "bernoulli",
                               chains = 2, 
                               cores = 2,
                               #backend = "cmdstanr", # uncomment to see the error
                               #threads = threading(2),
                               control = list(adapt_delta = 0.99,
                                              max_treedepth=20))

If I change the backend, I get the following error:

Semantic error in ‘/var/folders/3m/f039n0x549vfxhdtj55yykzhfjr0d6/T/RtmpaKPu7Y/model-16ad237a85ad8.stan’, line 11, column 15 to column 49:
9: L = exp(L_raw * a);
10: w = 0.5 + inv_logit(w_raw)/2;
11: return log((w * L + 1 - w)./((1 - w) * L + w));
^
12: }
13:

Ill-typed arguments supplied to infix operator ./. Available signatures:
(real, vector) => vector
(vector, real) => vector
(vector, vector) => vector
(real, row_vector) => row_vector
(row_vector, real) => row_vector
(row_vector, row_vector) => row_vector
(real, matrix) => matrix
(matrix, real) => matrix
(matrix, matrix) => matrix
Instead supplied arguments of incompatible type: real, real.

make: *** [/var/folders/3m/f039n0x549vfxhdtj55yykzhfjr0d6/T/RtmpaKPu7Y/model-16ad237a85ad8.hpp] Error 1
Error: An error occured during compilation! See the message above for more information.

Hey @fusaroli,

the problem is that element division operator ./ was missing the implementation for scalar ./ scalar in cmdstan 2.24. That was fixed here: https://github.com/stan-dev/stanc3/pull/703

Make sure to run cmdstanr::install_cmdstan() to install the latest version (2.25) that has this fix. Or change the code to use / instead of ./.

1 Like

This solved it. Thank you very much.

1 Like

Re-opening this as the issue pops up again when I try threading over two cores:

Semantic error in ‘/var/folders/3m/f039n0x549vfxhdtj55yykzhfjr0d6/T/RtmpaKPu7Y/model-16ad23146c532.stan’, line 32, column 14 to column 44:
30: int nn = n + start - 1;
31: // compute non-linear predictor values
32: mu[n] = F(0 , C_1[nn] , nlp_wSelf[nn]) + F(0 , C_2[nn] , nlp_wOthers[nn] + nlp_bConfidence[nn] * C_3[nn]) + F(0 , C_4[nn] , nlp_wOthers[nn] + nlp_bConfidence[nn] * C_5[nn]) + F(0 , C_6[nn] , nlp_wOthers[nn] + nlp_bConfidence[nn] * C_7[nn]) + F(0 , C_8[nn] , nlp_wOthers[nn] + nlp_bConfidence[nn] * C_9[nn]);
^
33: }
34: ptarget += bernoulli_logit_lpmf(Y[start:end] | mu);

A returning function was expected but an undeclared identifier ‘F’ was supplied.
make: *** [/var/folders/3m/f039n0x549vfxhdtj55yykzhfjr0d6/T/RtmpaKPu7Y/model-16ad23146c532.hpp] Error 1
Error: An error occured during compilation! See the message above for more information.

Can you post the entire Stan code?

// generated with brms 2.14.2
functions {
  /* integer sequence of values
   * Args: 
   *   start: starting integer
   *   end: ending integer
   * Returns: 
   *   an integer sequence from start to end
   */ 
  int[] sequence(int start, int end) { 
    int seq[end - start + 1];
    for (n in 1:num_elements(seq)) {
      seq[n] = n + start - 1;
    }
    return seq; 
  } 
  // compute partial sums of the log-likelihood
  real partial_log_lik(int[] seq, int start, int end, int[] Y, matrix X_wSelf, vector b_wSelf, matrix X_wOthers, vector b_wOthers, matrix X_bConfidence, vector b_bConfidence, vector C_1, vector C_2, vector C_3, vector C_4, vector C_5, vector C_6, vector C_7, vector C_8, vector C_9) {
    real ptarget = 0;
    int N = end - start + 1;
    // initialize linear predictor term
    vector[N] nlp_wSelf = X_wSelf[start:end] * b_wSelf;
    // initialize linear predictor term
    vector[N] nlp_wOthers = X_wOthers[start:end] * b_wOthers;
    // initialize linear predictor term
    vector[N] nlp_bConfidence = X_bConfidence[start:end] * b_bConfidence;
    // initialize non-linear predictor term
    vector[N] mu;
    for (n in 1:N) {
      int nn = n + start - 1;
      // compute non-linear predictor values
      mu[n] = F(0 , C_1[nn] , nlp_wSelf[nn]) + F(0 , C_2[nn] , nlp_wOthers[nn] + nlp_bConfidence[nn] * C_3[nn]) + F(0 , C_4[nn] , nlp_wOthers[nn] + nlp_bConfidence[nn] * C_5[nn]) + F(0 , C_6[nn] , nlp_wOthers[nn] + nlp_bConfidence[nn] * C_7[nn]) + F(0 , C_8[nn] , nlp_wOthers[nn] + nlp_bConfidence[nn] * C_9[nn]);
    }
    ptarget += bernoulli_logit_lpmf(Y[start:end] | mu);
    return ptarget;
  }

real F(real a_raw, real L_raw, real w_raw) {
    real a;
    real L;
    real w;
    a = exp(a_raw);
    L = exp(L_raw * a);
    w = 0.5 + inv_logit(w_raw)/2;
    return log((w * L + 1 - w)./((1 - w) * L + w));
}

}
data {
  int<lower=1> N;  // total number of observations
  int Y[N];  // response variable
  int<lower=1> K_wSelf;  // number of population-level effects
  matrix[N, K_wSelf] X_wSelf;  // population-level design matrix
  int<lower=1> K_wOthers;  // number of population-level effects
  matrix[N, K_wOthers] X_wOthers;  // population-level design matrix
  int<lower=1> K_bConfidence;  // number of population-level effects
  matrix[N, K_bConfidence] X_bConfidence;  // population-level design matrix
  // covariate vectors for non-linear functions
  vector[N] C_1;
  vector[N] C_2;
  vector[N] C_3;
  vector[N] C_4;
  vector[N] C_5;
  vector[N] C_6;
  vector[N] C_7;
  vector[N] C_8;
  vector[N] C_9;
  int grainsize;  // grainsize for threading
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
  int seq[N] = sequence(1, N);
}
parameters {
  vector[K_wSelf] b_wSelf;  // population-level effects
  vector[K_wOthers] b_wOthers;  // population-level effects
  vector[K_bConfidence] b_bConfidence;  // population-level effects
}
transformed parameters {
}
model {
  // likelihood including all constants
  if (!prior_only) {
    target += reduce_sum(partial_log_lik, seq, grainsize, Y, X_wSelf, b_wSelf, X_wOthers, b_wOthers, X_bConfidence, b_bConfidence, C_1, C_2, C_3, C_4, C_5, C_6, C_7, C_8, C_9);
  }
  // priors including all constants
  target += normal_lpdf(b_wSelf[1] | 0, 1);
  target += normal_lpdf(b_wOthers[1] | 0, 1);
  target += normal_lpdf(b_bConfidence | 2, 1);
}
generated quantities {
  // additionally draw samples from priors
  real prior_b_wSelf_1 = normal_rng(0,1);
  real prior_b_wOthers_1 = normal_rng(0,1);
  real prior_b_bConfidence = normal_rng(2,1);
}

This might be a code-generation problem. Tagging @paul.buerkner

Can you post the Stan code when not using threading?

// generated with brms 2.14.2
functions {

real F(real a_raw, real L_raw, real w_raw) {
    real a;
    real L;
    real w;
    a = exp(a_raw);
    L = exp(L_raw * a);
    w = 0.5 + inv_logit(w_raw)/2;
    return log((w * L + 1 - w)./((1 - w) * L + w));
}

}
data {
  int<lower=1> N;  // total number of observations
  int Y[N];  // response variable
  int<lower=1> K_wSelf;  // number of population-level effects
  matrix[N, K_wSelf] X_wSelf;  // population-level design matrix
  int<lower=1> K_wOthers;  // number of population-level effects
  matrix[N, K_wOthers] X_wOthers;  // population-level design matrix
  int<lower=1> K_bConfidence;  // number of population-level effects
  matrix[N, K_bConfidence] X_bConfidence;  // population-level design matrix
  // covariate vectors for non-linear functions
  vector[N] C_1;
  vector[N] C_2;
  vector[N] C_3;
  vector[N] C_4;
  vector[N] C_5;
  vector[N] C_6;
  vector[N] C_7;
  vector[N] C_8;
  vector[N] C_9;
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
}
parameters {
  vector[K_wSelf] b_wSelf;  // population-level effects
  vector[K_wOthers] b_wOthers;  // population-level effects
  vector[K_bConfidence] b_bConfidence;  // population-level effects
}
transformed parameters {
}
model {
  // likelihood including all constants
  if (!prior_only) {
    // initialize linear predictor term
    vector[N] nlp_wSelf = X_wSelf * b_wSelf;
    // initialize linear predictor term
    vector[N] nlp_wOthers = X_wOthers * b_wOthers;
    // initialize linear predictor term
    vector[N] nlp_bConfidence = X_bConfidence * b_bConfidence;
    // initialize non-linear predictor term
    vector[N] mu;
    for (n in 1:N) {
      // compute non-linear predictor values
      mu[n] = F(0 , C_1[n] , nlp_wSelf[n]) + F(0 , C_2[n] , nlp_wOthers[n] + nlp_bConfidence[n] * C_3[n]) + F(0 , C_4[n] , nlp_wOthers[n] + nlp_bConfidence[n] * C_5[n]) + F(0 , C_6[n] , nlp_wOthers[n] + nlp_bConfidence[n] * C_7[n]) + F(0 , C_8[n] , nlp_wOthers[n] + nlp_bConfidence[n] * C_9[n]);
    }
    target += bernoulli_logit_lpmf(Y | mu);
  }
  // priors including all constants
  target += normal_lpdf(b_wSelf[1] | 0, 1);
  target += normal_lpdf(b_wOthers[1] | 0, 1);
  target += normal_lpdf(b_bConfidence | 2, 1);
}
generated quantities {
  // additionally draw samples from priors
  real prior_b_wSelf_1 = normal_rng(0,1);
  real prior_b_wOthers_1 = normal_rng(0,1);
  real prior_b_bConfidence = normal_rng(2,1);
}

Possibly a long shot, but can you try updating brms to the latest version (2.14.4) and try again?

1 Like

no, no luck running it on the computer with the updated version. Exactly same error.

I’ve tracked this down to bug in the brms code generation. The problem is that the user-defined functions were being inserted into the code after the definition of the partial_log_lik function used by brms's threading. The consequence of this is that the F function was getting called before it was defined.

I’ll open an issue on the brms github and let you know when there’s a fixed version available

2 Likes

Thank you. This is much appreciated

@fusaroli I think this was just fixed by @paul.buerkner
Try installing the Github version again.

Thanks @andrjohns for tracking this down!

2 Likes

yeps, it all works now! thank you.

3 Likes