Strange error requiring generated quantities

I get this strange error when trying to compile the code below. I haven’t seen an error like this before: it asks for a generated quantities block and points at the line where the model block starts.

error:

Code:

functions {
  real partial_sum(int[] y_slice,
                   int start, 
                   int end,
                   matrix x1,
                   matrix x2,
                   vector beta,
                   matrix rbeta,
                   int J, 
                   int[] ll) {

    return bernoulli_logit_lpmf(y_slice[start:end] |x1[start:end,:] * beta + (x2[start:end,:] .* rbeta[ll[start:end]]) * rep_vector(1,J));
  }
}
data {
  int<lower=0> N;//Number of observations
  int<lower=1> J;//Number of predictors with random slope
  int<lower=1> K;//Number of predictors with non-random slope
  int<lower=1> L;//Number of customers/groups
  int<lower=0,upper=1> y[N];//Binary response variable
  int<lower=1,upper=L> ll[N];//Number of observations in groups
  matrix[N,K] x1;
  matrix[N,J] x2;
}
parameters {
  vector[J] rbeta_mu; //mean of distribution of beta parameters
  vector<lower=0>[J] rbeta_sigma; //variance of distribution of beta parameters
  vector[J] beta_raw[L]; //group-specific parameters beta
  vector[K] beta;
}
transformed parameters {
  vector[J] rbeta[L];
  for (l in 1:L)
    rbeta[l] = rbeta_mu + rbeta_sigma .* beta_raw[l]; // coefficients on x
}
model {
  rbeta_mu ~ normal(0,10);
  rbeta_sigma ~ cauchy(0,5);
  beta~normal(0,5);
  
  for (l in 1:L)
    beta_raw[l] ~ normal(0,1);

  target += reduce_sum(partial_sum,y,1,x1,x2,beta,rbeta,J,ll);
}

This is my first time trying within chain parallelization.

Can anyone tell me what is the error about and how I can correct it?

Look for an unmatched “{“ or “}” somewhere.

I already have. They are all balanced: every “{” has a corresponding “}”, and every “(” has a corresponding “)”.

Huh. Very strange. Because that message is saying it thought it reached the end of the model block (by seeing a } to match the block’s opening {), and then encountered text that was not the only valid text allowed after the end of the model block (generated quantities {).

Any chance you can try with cmdstanr?

I’ll try.

OK, it compiles with cmdstanr, so why won’t it with rstan? Any clues?

My bad: it does not compile with cmdstan either. I forgot to turn on multithreading when compiling earlier.

Any suggestions on what’s going wrong?

If I copy-paste that model into a Stan file and then try to compile it with cmdstanr, I get this error:

Ill-typed arguments supplied to function 'reduce_sum'. Expected arguments:
(array[] int, int, int, matrix, matrix, vector, matrix, int, array[] int) => real, array[] int, int, matrix, matrix, vector, matrix, int, array[] int

Instead supplied arguments of incompatible type: (array[] int, int, int, matrix, matrix, vector, matrix, int, array[] int) => real, array[] int, int, matrix, matrix, vector, array[] vector, int, array[] int
make: *** [/var/folders/j6/dg5l3gl11xb9v8w61w99ngh80000gn/T/Rtmp5yrh3D/model-372f58bfb7ab.hpp] Error 1

That’s because partial_sum expects rbeta as a matrix, but you pass rbeta as an array of vectors. Maybe that’s the whole problem? The error message you’re seeing certainly isn’t a helpful one. Maybe a later version of rstan or cmdstan would fix that?

Wait, if you were able to compile this without turning on multithreading, it strongly suggests to me that the Stan program you included above is not identical to the one you’re actually using. There is no way this one should compile, with or without multithreading, because of the mis-typed argument in partial_sum.

@jscocolar you are right. I got that error, then converted rbeta to a matrix, and it compiled in the single-threaded version.

This is the code that compiled:

functions {
  real partial_sum(int[] y_slice,
                   int start, 
                   int end,
                   matrix x1,
                   matrix x2,
                   vector beta,
                   matrix rbeta,
                   int J, 
                   int[] ll) {

    return bernoulli_logit_lpmf(y_slice[start:end] |x1[start:end,:] * beta + (x2[start:end,:] .* rbeta[ll[start:end]]) * rep_vector(1,J));
  }
}
data {
  int<lower=0> N;//Number of observations
  int<lower=1> J;//Number of predictors with random slope
  int<lower=1> K;//Number of predictors with non-random slope
  int<lower=1> L;//Number of customers/groups
  int<lower=0,upper=1> y[N];//Binary response variable
  int<lower=1,upper=L> ll[N];//Number of observations in groups
  matrix[N,K] x1;
  matrix[N,J] x2;
}
parameters {
  row_vector[J] rbeta_mu; //mean of distribution of beta parameters
  row_vector<lower=0>[J] rbeta_sigma; //variance of distribution of beta parameters
  row_vector[J] beta_raw[L]; //group-specific parameters beta
  vector[K] beta;
}
transformed parameters {
  matrix[L,J] rbeta;
  for (l in 1:L)
    rbeta[l] = rbeta_mu + rbeta_sigma .* beta_raw[l]; // coefficients on x
}
model {
  rbeta_mu ~ normal(0,10);
  rbeta_sigma ~ cauchy(0,5);
  beta~normal(0,5);
  
  for (l in 1:L){
    beta_raw[l] ~ normal(0,1);
  }

  target += reduce_sum(partial_sum,y,1,x1,x2,beta,rbeta,J,ll);
}


But this code does not compile in multithreaded mode with either cmdstan or rstan.

Thanks for providing the updated model! This compiles just fine for me in single- or multithreaded mode with cmdstan 2.27.0:

cmdstanr::cmdstan_model("/Users/JacobSocolar/Desktop/testmod.stan")
cmdstanr::cmdstan_model("/Users/JacobSocolar/Desktop/testmod.stan", 
                           cpp_options = list(stan_threads = TRUE))

Interestingly, I was able to reproduce your original error with rstan 2.26. Updating to 2.27 eliminated the issue. I cannot be certain whether the issue was a bug in 2.26 or whether it was a problem with my rstan installation that was fixed by re-installing.

So either there is a bug in earlier versions that is fixed in 2.27 or there’s a problem (apparently a common one) with your rstan installation. In either case, updating to the latest Stan seems to be a fix. To update to the latest rstan, do

To update to the latest cmdstan, do

cmdstanr::install_cmdstan()

I am on the latest version of rstan and cmdstan. Yet it does not compile for me.

This is what happens when I try to compile it:

I have already uninstalled R, RStudio, and cmdstan and done a fresh install of everything.

I am using the experimental version of rstan and have installed the latest cmdstan as well.

So if it compiled for you, can you tell me what changes you made, if any?

If I try to sample from this file, I get these errors:

I have no issues compiling your model copy-pasted from your discourse post. I don’t have access to your data, so I cannot attempt to sample the compiled model. In a very cursory glance at the images of the output from the compiler that you posted above I didn’t see an error message, so maybe compiling is working fine on your end as well.

Perhaps there’s a problem in your data. One useful check might be: can you compile and sample the example reduce_sum model here:
https://mc-stan.org/users/documentation/case-studies/reduce_sum_tutorial.html

If you’re able to share your data, I’d be happy to check whether or not I can sample from your model+data.

I was able to run the example in the link you shared without issues, so I am assuming the problem has to do with the data.

Here is a generated data set based on my actual data. I cannot share my actual data, but this is the closest thing.

The variables X1 and X2 are the fixed effects, which correspond to x1 in the Stan model. The rest, X3:X1, are the random effects, which correspond to x2 in the Stan model.

customer_no would be ll in Stan.
sample_data.csv (69.7 KB)
All your help is greatly appreciated.
test_reduce_sum.R (752 Bytes)

After some experimenting, I have realised that the problem has to do with the random effects. I was able to compile and sample from the model with all effects fixed, but when I introduced even a single random effect, such as a random intercept only, I got errors like before:

I can’t run this code. It contains

y=data$bought[1:500]

but there is no column bought in data.

Edit: if I replace $bought with $y, I can run it and reproduce your error.

Edit 2: Never mind, I still can’t reproduce it. I get an error because y, L, and ll are all the wrong size in the data. If I replace the associated [1:500]s with [1:1000]s, I progress to a new set of errors that still aren’t your error. These new errors are probably related to the fact that in your partial_sum function, you are indexing into rbeta, a matrix, using a single index instead of two indices.

My sense here, and this is consistent with all of the various error messages that you’ve reported, is that your problems all have to do with your indexing. I suggest you go over the indexing with a fine-toothed comb using synthetic, shareable data. If you can’t get it working, please re-post exactly the data file, Stan file, and R script that reproduce the error, with no changes from the versions that you are running when you see the error, and then we can take a look and try to troubleshoot the issue.
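
As a starting point for that indexing review, here is a minimal sketch of how partial_sum might look with the indexing tightened up. It is not a confirmed fix for the errors in this thread, since those were never reproduced. It rests on two assumptions. First, reduce_sum hands partial_sum a y_slice that already contains only observations start through end, renumbered from 1, so the slice is used whole rather than subset again with start:end. Second, the rows of the rbeta matrix are selected with an explicit two-index form. The argument names and the declaration syntax are kept as in the model posted above.

```stan
functions {
  // Declaration syntax matches the model in this thread (Stan 2.27);
  // more recent Stan releases write these arguments as array[] int.
  real partial_sum(int[] y_slice,
                   int start,
                   int end,
                   matrix x1,
                   matrix x2,
                   vector beta,
                   matrix rbeta,
                   int J,
                   int[] ll) {
    // y_slice is already the sub-array for observations start..end,
    // so it is not indexed again. The shared arguments (x1, x2, ll)
    // are passed whole and therefore still need start:end.
    return bernoulli_logit_lpmf(y_slice |
             x1[start:end, :] * beta
             + (x2[start:end, :] .* rbeta[ll[start:end], :]) * rep_vector(1, J));
  }
}
```

With the original y_slice[start:end], any slice whose start is greater than 1 indexes past the end of y_slice, which would only show up at sampling time and only once the data are split into more than one slice.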