Brms supporting expose_functions from models fit using cmdstanr backend?

Not sure how complicated this would be to implement (or if it actually is supported already and I’m missing something), but is it possible for brms to support the expose_functions(..., vectorize = TRUE) for models fit using a cmdstanr backend? Recognizing the underlying expose_stan_functions() appears to be rstan functionality.

Trying it anyway, I get: Error in if (model_code != "" && is.character(model_code)) { : missing value where TRUE/FALSE needed.

There should be a better error message, but unfortunately that’s not going to work with CmdStanR. In theory something like this should be possible but CmdStan isn’t set up for that. Maybe in the future though!

Good to know. Thank you!

I improved the error message to be more informative.

1 Like

Is there a way to use custom distributions with threading (backend=cmdstanr) without using expose_functions?

1 Like

Not yet. This requires a new feature in cmdstan from what I understand.

2 Likes

is there a way to use a custom distribution with threading outside of brms?

Sure. Extract the stan code via make_stancode() and data via make_standata()

1 Like

and then…

Use cmdstan® directly. Sorry I didnt think it through when writing.

1 Like

is this cmdstanr?

See https://github.com/stan-dev/cmdstanr for all the details

1 Like

Great! Thanks!

I was able to create the model successful but I can’t get pp_check to work on the stanfit object. “could not find gamma_rng”.

To get the succesful model, however, I had to move the custom gamma2 function in the function block to the top of the body to remove parsing errors saying gamma2 does not exist. This same error is marked in my stancode file for partial_log_lik and shows up when trying to call expose_stan_functions().

Variable "partial_log_lik" does not exist.
 error in 'model45ad79d125e8_User_defined_functions' at line 77, column 40
  -------------------------------------------------
    75:   // likelihood including all constants
    76:   if (!prior_only) {
    77:     target += reduce_sum(partial_log_lik, seq, grainsize, Y, Xc, b, Intercept, Xc_v, b_v, Intercept_v);
                                               ^
    78:   }
  -------------------------------------------------

Error in stanc(model_code = mc, model_name = "User-defined functions",  : 
  failed to parse Stan model 'User-defined functions' due to the above error. 
// generated with brms 2.13.11
functions {
  real gamma2_lpdf(real y, real mu, real v) {
    return gamma_lpdf(y | mu * mu / v, mu / v);
  } 
  real gamma2_rng(real mu, real v) {
    return gamma_rng(mu * mu / v, mu / v);
  }
  int[] sequence(int start, int end) { 
    int seq[end - start + 1];
    for (n in 1:num_elements(seq)) {
      seq[n] = n + start - 1;
    }
    return seq; 
  } 
  // compute partial sums of the log-likelihood
  real partial_log_lik(int[] seq, int start, int end, vector Y, matrix Xc, vector b, real Intercept, matrix Xc_v, vector b_v, real Intercept_v) {
    real ptarget = 0;
    int N = end - start + 1;
    // initialize linear predictor term
    vector[N] mu = Intercept + Xc[start:end] * b;
    // initialize linear predictor term
    vector[N] v = Intercept_v + Xc_v[start:end] * b_v;
    for (n in 1:N) {
      // apply the inverse link function
      v[n] = exp(v[n]);
    }
    for (n in 1:N) {
      // apply the inverse link function
      mu[n] = exp(mu[n]);
    }
    for (n in 1:N) {
      int nn = n + start - 1;
      ptarget += gamma2_lpdf(Y[nn] | mu[n], v[n]);
    }
    return ptarget;
  }
}
data {
  int<lower=1> N;  // total number of observations
  vector[N] Y;  // response variable
  int<lower=1> K;  // number of population-level effects
  matrix[N, K] X;  // population-level design matrix
  int<lower=1> K_v;  // number of population-level effects
  matrix[N, K_v] X_v;  // population-level design matrix
  int grainsize;  // grainsize for threading
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
  int Kc = K - 1;
  matrix[N, Kc] Xc;  // centered version of X without an intercept
  vector[Kc] means_X;  // column means of X before centering
  int Kc_v = K_v - 1;
  matrix[N, Kc_v] Xc_v;  // centered version of X_v without an intercept
  vector[Kc_v] means_X_v;  // column means of X_v before centering
  int seq[N] = sequence(1, N);
  for (i in 2:K) {
    means_X[i - 1] = mean(X[, i]);
    Xc[, i - 1] = X[, i] - means_X[i - 1];
  }
  for (i in 2:K_v) {
    means_X_v[i - 1] = mean(X_v[, i]);
    Xc_v[, i - 1] = X_v[, i] - means_X_v[i - 1];
  }
}
parameters {
  vector[Kc] b;  // population-level effects
  real Intercept;  // temporary intercept for centered predictors
  vector[Kc_v] b_v;  // population-level effects
  real Intercept_v;  // temporary intercept for centered predictors
}
transformed parameters {
}
model {
  // likelihood including all constants
  if (!prior_only) {
    target += reduce_sum(partial_log_lik, seq, grainsize, Y, Xc, b, Intercept, Xc_v, b_v, Intercept_v);
  }
  // priors including all constants
  target += normal_lpdf(b | 0, 3);
  target += normal_lpdf(Intercept | 3, 1);
  target += normal_lpdf(b_v | 0, 3);
  target += normal_lpdf(Intercept_v | 3, 1);
}
generated quantities {
  // actual population-level intercept
  real b_Intercept = Intercept - dot_product(means_X, b);
  // actual population-level intercept
  real b_v_Intercept = Intercept_v - dot_product(means_X_v, b_v);
  // additionally draw samples from priors
  real prior_b = normal_rng(0,3);
  real prior_Intercept = normal_rng(3,1);
  real prior_b_v = normal_rng(0,3);
  real prior_Intercept_v = normal_rng(3,1);
}

image

is there a workaround? a Cpp workaround?

So brms code generation yields a parsing error? If so can you provide a minimal reproducible example?

in order to use this in expose_stan_functions, this code would have to work in rstan, which is currently still does not (at least not until the next rstan release).

Rstudio syntax highlighter also uses rstan and the syntax rstan supports. Which does not recognize reduce_sum. Meaning that it does not know that partial_log_lik was supposed to be a UDF and instead treats it as a variable, hence the error variable partial_log_lik does not exist

2 Likes

Once 2.25 is out (Thursday ETA) the cmdstan side of things is actually ready. Once we finish some of the important cmdstanr issues, we can start looking into that.

1 Like

Thats great news!

will you post here when its ready?

Of course.

2 Likes