K Fold Cross Validation with Logistic Regression Model

I’m following along with this vignette to perform K-fold cross validation for a logistic regression model. I’ve included my model code at the end of this post. At one point, the author does this:

  fit <- sampling(stanmodel, data = data_train, seed = seed, refresh = 0)
  gen_test <- gqs(stanmodel, draws = as.matrix(fit), data= data_test)

I do something similar, except I use stan instead of sampling:

  fit <-  stan(model_code = logisticRegressionModel, data = trainList, 
  gen_test <- gqs(logisticRegressionModel, draws = as.matrix(fit), data= data_test)

When I do this, I get:

Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘gqs’ for signature ‘"character"’

I don’t quite understand what’s going wrong. Essentially, I’m trying to compute the accuracy that I defined in generated quantities on the test set, using the model I fit on the training set.

  logisticRegressionModel <- "
  data {
    int<lower=0> N;   // number of data items
    int<lower=0> K;   // number of predictors
    matrix[N, K] X;   // predictor matrix
    int y[N];      // outcome vector
  }
  parameters {
    real alpha;       // intercept
    vector[K] gamma;
    vector<lower=0>[K] tau;
    vector[K] beta;   // coefficients for predictors
  }
  model {
    // Priors:
    gamma ~ normal(0, 5);
    tau ~ cauchy(0, 2.5);
    alpha ~ normal(gamma, tau);
    beta ~ normal(gamma, tau);
    // Likelihood
    y ~ bernoulli_logit(alpha + X * beta);
  }
  generated quantities {
    vector[N] y_preds;
    real correct = 0;
    real accuracy;
    for (n in 1:N) {
      y_preds[n] = bernoulli_logit_rng(alpha + X[n] * beta);
      correct += logical_eq(y_preds[n], y[n]);
    }
    accuracy = correct / N;
  }
  "

You might have better luck saving your model to a .stan file and calling it that way.

For some reason I can’t find documentation on how to do this. Could you point me to it?
EDIT: Actually, I was able to find it here. I’ll try it out and let you know what happens.

It seems like it works now when I do:

  fit <-  stan(file = "logisticRegressionModel.stan", data = trainList, 
  gen_test <- gqs(fit@stanmodel, draws = as.matrix(fit), 
                  data = valList)
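The thread does not show how `trainList` and `valList` were built, so the following is only a sketch of one way to prepare them for a single fold. The simulated `X` and `y`, the choice of 5 folds, and the helper name `make_stan_data` are assumptions for illustration; the list element names `N`, `K`, `X`, and `y` match the data block of the model above.

```r
set.seed(123)

# Simulated stand-in data (assumption; replace with the real predictors and outcome).
n_obs <- 100
X <- matrix(rnorm(n_obs * 3), nrow = n_obs, ncol = 3)
y <- rbinom(n_obs, size = 1, prob = 0.5)

# Randomly assign each row to one of 5 folds of equal size.
n_folds <- 5
fold_id <- sample(rep(seq_len(n_folds), length.out = n_obs))

# Hypothetical helper: package a subset of rows as the list the data block expects.
make_stan_data <- function(idx) {
  list(N = length(idx), K = ncol(X), X = X[idx, , drop = FALSE], y = y[idx])
}

# Hold out fold 1 for validation and train on the rest.
fold <- 1
trainList <- make_stan_data(which(fold_id != fold))
valList   <- make_stan_data(which(fold_id == fold))
```

Looping `fold` over `seq_len(n_folds)` and refitting each time would give the full K-fold procedure; `drop = FALSE` keeps `X` a matrix even if a fold ends up with a single row.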

Thank you! I’m still curious as to why this happens, though.

There shouldn’t be any difference between passing the model code and passing a file containing the code. I’m guessing there may be some stray characters somewhere that made them different. If you are sure you have the exact same text both ways, that’s a bug in RStan and we’d really appreciate it if you could file a bug report. Thanks!
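Comparing the two calls again, something other than file versus string also changed: the failing call passed the character string `logisticRegressionModel` as the first argument to `gqs`, while the working call passed `fit@stanmodel`. The error message says there is no `gqs` method for class "character", which points at the type of that first argument. If that reading is right, the string version should work too once it is compiled with `stan_model()`. A minimal sketch, using a deliberately tiny stand-in model and simulated data (both assumptions, not the model or data from this thread):

```r
library(rstan)

# Tiny stand-in model (assumption): intercept-only logistic regression.
# The array declaration follows the style used in this thread; newer Stan
# releases spell it `array[N] int<lower=0, upper=1> y;`.
toy_code <- "
data {
  int<lower=0> N;
  int<lower=0, upper=1> y[N];
}
parameters {
  real alpha;
}
model {
  alpha ~ normal(0, 5);
  y ~ bernoulli_logit(alpha);
}
generated quantities {
  real correct = 0;
  real accuracy;
  for (n in 1:N) {
    correct += (bernoulli_logit_rng(alpha) == y[n]);
  }
  accuracy = correct / N;
}
"

# stan_model() compiles the string into a stanmodel object, the class that
# sampling() and gqs() have methods for.
stanmodel <- stan_model(model_code = toy_code)

# Simulated train/test data (assumption).
set.seed(1)
train <- list(N = 80, y = rbinom(80, 1, 0.7))
test  <- list(N = 20, y = rbinom(20, 1, 0.7))

fit <- sampling(stanmodel, data = train, seed = 1, refresh = 0)

# Pass the compiled model, not the code string, as the first argument.
gen_test <- gqs(stanmodel, draws = as.matrix(fit), data = test)

# Posterior draws of test-set accuracy, one per posterior draw of alpha.
acc <- rstan::extract(gen_test, pars = "accuracy")$accuracy
mean(acc)
```

`fit@stanmodel`, as used in the working call above, is the same kind of object, so either route should satisfy `gqs`.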

See also the rstan::gqs help page.