Ill-formed phrase. "{" should be followed by a statement, variable declaration or expression

I am still new to the rstan library and am trying to do Bayesian inference with it. I wrote the following Stan file, but compiling it gives the error shown below.

functions {
  real matern_kernel_2_5(vector x1, vector x2, real length_scale) {
    real nu = 2.5;
    real kappa = sqrt(5 * nu);
    real dist = sqrt(dot_self(x1 - x2));
    real term1 = 1 + kappa * dist / length_scale;
    real term2 = exp(-kappa * dist / length_scale);
    real covariance = term1 * term2;
    return covariance;
  }
}

data {
  int<lower=1> N;              // Number of data points
  vector[N] size;              // Size variable
  vector[N] strain;            // Strain variable
  matrix[N, N] x;              // Input features for covariance matrix
  vector[N] stress;            // Observed stress data
}

parameters {
  real<lower=0> sigma;         // Noise parameter
  real a;                      // Intercept term
  real bsize;                  // Coefficient of size variable
  real bstrain;                // Coefficient of strain variable
}

model {
  // Priors for hyperparameters
  a ~ normal(0, 1);            // Prior for intercept term
  bsize ~ normal(0, 1);        // Prior for coefficient of size variable
  bstrain ~ normal(0, 1);      // Prior for coefficient of strain variable
  
  matrix[N, N] cov_matrix;
  
  // Construct the covariance matrix using the Matérn kernel
  for (i in 1:N) {
    for (j in 1:N) {
      cov_matrix[i, j] = matern_kernel_2_5(x[i], x[j], 1.0);  // Use 1.0 as length_scale for simplicity
      if (i == j) cov_matrix[i, j] = cov_matrix[i, j] + sigma^2;
    }
  }
  
  // Model likelihood
  stress ~ multi_normal(a + bsize * size + bstrain * strain, cov_matrix);
}

I received the following error:

stan_model <- stan_model(file = "C:/Users/GTS/Desktop/R file/multivariate_matern_priors.stan")
Error in stanc(file = file, model_code = model_code, model_name = model_name,  : 
  0

Syntax error in 'string', line 36, column 6 to column 16, parsing error:

Ill-formed phrase. "{" should be followed by a statement, variable declaration or expression.


In addition: Warning message:
In readLines(file, warn = TRUE) :
  incomplete final line found on 'C:\Users\GTS\Desktop\R file\multivariate_matern_priors.stan'

I really appreciate your help.

It did not work. I modified the code as follows:

functions {
  real matern_kernel_2_5(vector x1, vector x2, real length_scale) {
    real nu = 2.5;
    real kappa = sqrt(5 * nu);
    real dist = sqrt(dot_self(x1 - x2));
    real term1 = 1 + kappa * dist / length_scale;
    real term2 = exp(-kappa * dist / length_scale);
    real covariance = term1 * term2;
    return covariance;
  }
}

data {
  int<lower=1> N;              // Number of data points
  vector[N] size;              // Size variable
  vector[N] strain;            // Strain variable
  matrix[N, N] x;              // Input features for covariance matrix
  vector[N] stress;            // Observed stress data
}

parameters {
  real<lower=0> sigma;         // Noise parameter
  real a;                      // Intercept term
  real bsize;                  // Coefficient of size variable
  real bstrain;                // Coefficient of strain variable
}

model {
  // Priors for hyperparameters
  a ~ normal(0, 1);            // Prior for intercept term
  bsize ~ normal(0, 1);        // Prior for coefficient of size variable
  bstrain ~ normal(0, 1);      // Prior for coefficient of strain variable
  matrix[N, N] cov_matrix;
  for (i in 1:N) {
    for (j in 1:N) {
      cov_matrix[i, j] = matern_kernel_2_5(x[i], x[j], 1.0);  // Use 1.0 as length_scale for simplicity
      if (i == j) cov_matrix[i, j] = cov_matrix[i, j] + sigma^2;
    }
  }
  
  // Model likelihood
  stress ~ multi_normal(a + bsize * size + bstrain * strain, cov_matrix);
}

And I am still getting the following error:

stan_model <- stan_model(file = "C:/Users/GTS/Desktop/R file/multivariate_matern_priors.stan")
Error in stanc(file = file, model_code = model_code, model_name = model_name,  : 
  0

Syntax error in 'string', line 33, column 6 to column 16, parsing error:

Ill-formed phrase. "{" should be followed by a statement, variable declaration or expression.


In addition: Warning message:
In readLines(file, warn = TRUE) :
  incomplete final line found on 'C:\Users\GTS\Desktop\R file\multivariate_matern_priors.stan'

I think the error is because cov_matrix is also the name of a type in Stan. Can you try changing all uses of cov_matrix to cov_mat or something else and see if that fixes the error (if it does then @mitzimorris I think we need a better error message here… well actually either way we need a better error message).

If that fixes the error, I think you’ll get another error, though, because in the call matern_kernel_2_5(x[i], x[j], 1.0), x[i] and x[j] are rows of the x matrix, and I think matern_kernel_2_5 will want vectors and not row vectors. If that’s the case, then you can transpose them using the postfix ' operator:

matern_kernel_2_5(x[i]', x[j]', 1.0)
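Putting the two suggestions together (rename the variable, transpose the rows), the program might look like the sketch below. This is only an illustration, not the poster's final code: the name K is a placeholder chosen here, any identifier that is not a reserved word would do, and the kernel function body is copied unchanged from the original post.

```stan
functions {
  // Kernel copied unchanged from the original post.
  real matern_kernel_2_5(vector x1, vector x2, real length_scale) {
    real nu = 2.5;
    real kappa = sqrt(5 * nu);
    real dist = sqrt(dot_self(x1 - x2));
    real term1 = 1 + kappa * dist / length_scale;
    real term2 = exp(-kappa * dist / length_scale);
    return term1 * term2;
  }
}

data {
  int<lower=1> N;              // Number of data points
  vector[N] size;              // Size variable
  vector[N] strain;            // Strain variable
  matrix[N, N] x;              // Input features for covariance matrix
  vector[N] stress;            // Observed stress data
}

parameters {
  real<lower=0> sigma;         // Noise parameter
  real a;                      // Intercept term
  real bsize;                  // Coefficient of size variable
  real bstrain;                // Coefficient of strain variable
}

model {
  // "K" (placeholder name) replaces "cov_matrix", which is a reserved type name.
  matrix[N, N] K;

  for (i in 1:N) {
    for (j in 1:N) {
      // x[i] is a row_vector; the postfix ' transposes it to a vector.
      K[i, j] = matern_kernel_2_5(x[i]', x[j]', 1.0);
      if (i == j) K[i, j] += sigma^2;
    }
  }

  a ~ normal(0, 1);
  bsize ~ normal(0, 1);
  bstrain ~ normal(0, 1);

  stress ~ multi_normal(a + bsize * size + bstrain * strain, K);
}
```

The local declaration is moved to the top of the model block here only so that the sketch also compiles with older Stan versions that require declarations first.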

Here’s the list of reserved words: Stan Reference Manual, section 6.2, “Variables”.

regarding error messages - @WardBrian - can the parser detect reserved words used for a variable name?

Thanks, so yeah that’s definitely the issue here. cov_matrix is in that list. But there would be no way to know to check that from reading the error message, so it would be great if we could detect this and have a more informative message.

It can, but only if it succeeds in parsing. If you comment out everything below the declaration, the error message is

Semantic error in 'string', line 34, column 15 to column 25:
   -------------------------------------------------
    32:    bstrain ~ normal(0, 1);      // Prior for coefficient of strain variable
    33:    
    34:    matrix[N, N] cov_matrix;
                        ^
    35:    
    36:    // Construct the covariance matrix using the Matérn kernel
   -------------------------------------------------

Identifier 'cov_matrix' clashes with reserved keyword.

But we never get there, because the line cov_matrix[i, j] = matern_kernel_2_5(x[i], x[j], 1.0); fails to parse (in this case with a not-so-great error).

We have a note in the parsing code that says “Keywords cannot be identifiers but semantic check produces a better error message.” I suspect this is not (or at least is no longer) true, so we can probably fix this.

Looking at the pull request that introduced this, I believe it was intended to get rid of the “UNREACHABLE token” hack, but the author forgot that of course parsing could fail before semantic check anyway and there are no added tests whatsoever. Bringing back UNREACHABLE is the best option here.

I think I found an option I like better than UNREACHABLE. I will open a PR later today.

Thank you, it worked fine.
However, I am wondering which sampling method Stan uses when I run sampling(stan_model, data = stan_data, chains = 4, iter = 1000). Is it HMC (Hamiltonian Monte Carlo)?

Yes. By default, Stan samples with NUTS (the No-U-Turn Sampler), an adaptive form of HMC.

Thank you for your prompt response. I noticed that it takes too long to sample. It took about 13974.6 seconds to finish chain 1, and it's now at the start of chain 2. Do you think PyStan is faster? I have about 670 lines of data.

I always recommend CmdStanR or CmdStanPy.
They will use the latest version of Stan, which may be faster.
However, the slowness is probably due to the model, specifically the number of gradient evaluations, not the interface.
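As one possible way to reduce the cost of each gradient evaluation, here is a hedged sketch of the same model with two changes. It assumes, as in the posted code, that the length scale stays fixed at 1.0 and that x is data, so the kernel matrix does not depend on any parameter and can be built once in transformed data. It also swaps multi_normal for multi_normal_cholesky applied to an explicit Cholesky factor. The names K_base and L_K are placeholders introduced here. Whether this helps noticeably for about 670 observations has not been measured; the Cholesky factorization is still done on every evaluation.

```stan
functions {
  // Kernel copied unchanged from the original post.
  real matern_kernel_2_5(vector x1, vector x2, real length_scale) {
    real nu = 2.5;
    real kappa = sqrt(5 * nu);
    real dist = sqrt(dot_self(x1 - x2));
    real term1 = 1 + kappa * dist / length_scale;
    real term2 = exp(-kappa * dist / length_scale);
    return term1 * term2;
  }
}

data {
  int<lower=1> N;
  vector[N] size;
  vector[N] strain;
  matrix[N, N] x;
  vector[N] stress;
}

transformed data {
  // Built once: with a fixed length scale, the kernel depends only on data.
  matrix[N, N] K_base;
  for (i in 1:N) {
    for (j in i:N) {
      K_base[i, j] = matern_kernel_2_5(x[i]', x[j]', 1.0);
      K_base[j, i] = K_base[i, j];  // fill the lower triangle by symmetry
    }
  }
}

parameters {
  real<lower=0> sigma;
  real a;
  real bsize;
  real bstrain;
}

model {
  // Only the noise variance on the diagonal depends on a parameter.
  matrix[N, N] L_K = cholesky_decompose(add_diag(K_base, square(sigma)));

  a ~ normal(0, 1);
  bsize ~ normal(0, 1);
  bstrain ~ normal(0, 1);

  stress ~ multi_normal_cholesky(a + bsize * size + bstrain * strain, L_K);
}
```

If the length scale were later made a parameter, the kernel would have to move back into the model block, and this shortcut would no longer apply.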