I am still new to the rstan library and am trying to apply Bayesian inference with it. I wrote the following Stan file, but I am receiving the following error.
stan_model <- stan_model(file = "C:/Users/GTS/Desktop/R file/multivariate_matern_priors.stan")
Error in stanc(file = file, model_code = model_code, model_name = model_name, :
0
Syntax error in 'string', line 36, column 6 to column 16, parsing error:
Ill-formed phrase. "{" should be followed by a statement, variable declaration or expression.
In addition: Warning message:
In readLines(file, warn = TRUE) :
incomplete final line found on 'C:\Users\GTS\Desktop\R file\multivariate_matern_priors.stan'
functions {
  real matern_kernel_2_5(vector x1, vector x2, real length_scale) {
    real nu = 2.5;
    real kappa = sqrt(5 * nu);
    real dist = sqrt(dot_self(x1 - x2));
    real term1 = 1 + kappa * dist / length_scale;
    real term2 = exp(-kappa * dist / length_scale);
    real covariance = term1 * term2;
    return covariance;
  }
}
data {
  int<lower=1> N; // Number of data points
  vector[N] size; // Size variable
  vector[N] strain; // Strain variable
  matrix[N, N] x; // Input features for covariance matrix
  vector[N] stress; // Observed stress data
}
parameters {
  real<lower=0> sigma; // Noise parameter
  real a; // Intercept term
  real bsize; // Coefficient of size variable
  real bstrain; // Coefficient of strain variable
}
model {
  // Priors for hyperparameters
  a ~ normal(0, 1); // Prior for intercept term
  bsize ~ normal(0, 1); // Prior for coefficient of size variable
  bstrain ~ normal(0, 1); // Prior for coefficient of strain variable
  matrix[N, N] cov_matrix;
  for (i in 1:N) {
    for (j in 1:N) {
      cov_matrix[i, j] = matern_kernel_2_5(x[i], x[j], 1.0); // Use 1.0 as length_scale for simplicity
      if (i == j) cov_matrix[i, j] = cov_matrix[i, j] + sigma^2;
    }
  }
  // Model likelihood
  stress ~ multi_normal(a + bsize * size + bstrain * strain, cov_matrix);
}
And I am still getting the following error:
stan_model <- stan_model(file = "C:/Users/GTS/Desktop/R file/multivariate_matern_priors.stan")
Error in stanc(file = file, model_code = model_code, model_name = model_name, :
0
Syntax error in 'string', line 33, column 6 to column 16, parsing error:
Ill-formed phrase. "{" should be followed by a statement, variable declaration or expression.
In addition: Warning message:
In readLines(file, warn = TRUE) :
incomplete final line found on 'C:\Users\GTS\Desktop\R file\multivariate_matern_priors.stan'
I think the error is because cov_matrix is also the name of a type in Stan. Can you try changing all uses of cov_matrix to cov_mat or something else and see if that fixes the error? (If it does, then @mitzimorris, I think we need a better error message here… well, actually, either way we need a better error message.)
If that fixes the error, I think you’ll get another error, though, because in the call matern_kernel_2_5(x[i], x[j], 1.0), x[i] and x[j] are rows of the x matrix, and I think matern_kernel_2_5 will want vectors and not row vectors. If that’s the case, then you can transpose them using the ' operator.
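The two changes suggested above could look roughly like the following model block. This is only a sketch: the name cov_mat is just one possible replacement, the rest of the program is assumed to be unchanged from the original post, and the local declaration is moved to the top of the block because some older Stan versions require that.

```stan
model {
  // Renamed from cov_matrix, which is a reserved type name in Stan.
  // Declared first so that older Stan versions also accept it.
  matrix[N, N] cov_mat;

  // Priors, unchanged from the original post
  a ~ normal(0, 1);
  bsize ~ normal(0, 1);
  bstrain ~ normal(0, 1);

  for (i in 1:N) {
    for (j in 1:N) {
      // x[i] is a row_vector; the postfix ' transposes it into the
      // vector that matern_kernel_2_5 declares as its argument type.
      cov_mat[i, j] = matern_kernel_2_5(x[i]', x[j]', 1.0);
      if (i == j) cov_mat[i, j] = cov_mat[i, j] + sigma^2;
    }
  }

  stress ~ multi_normal(a + bsize * size + bstrain * strain, cov_mat);
}
```

The kernel function itself is left as posted. The textbook Matérn 5/2 kernel uses sqrt(5) as the scale factor and has an extra 5 * dist^2 / (3 * length_scale^2) term, so the function may be worth checking against the intended definition.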
Thanks, so yeah that’s definitely the issue here. cov_matrix is in that list. But there would be no way to know to check that from reading the error message, so it would be great if we could detect this and have a more informative message.
But we never get there because the line cov_matrix[i, j] = matern_kernel_2_5(x[i], x[j], 1.0); // Use 1.0 as length_scale for simplicity fails to parse (in this case with not a great error).
We have a note in the parsing code that says “Keywords cannot be identifiers but semantic check produces a better error message.” I suspect this is not (or at least is no longer) true, so we can probably fix this.
Looking at the pull request that introduced this, I believe it was intended to get rid of the “UNREACHABLE token” hack, but the author forgot that of course parsing could fail before semantic check anyway and there are no added tests whatsoever. Bringing back UNREACHABLE is the best option here.
Thank you, it worked fine.
However, I am wondering what sampling method Stan uses.
When I run sampling(stan_model, data = stan_data, chains = 4, iter = 1000), is it HMC (Hamiltonian Monte Carlo)?
Thank you for your prompt response. I noticed that sampling takes a very long time. It took about 13974.6 seconds to finish chain 1, and it's now at the start of chain 2. Do you think PyStan is faster? I have about 670 lines of data.
I always recommend CmdStanR or CmdStanPy.
They will use the latest version of Stan, which may be faster.
However, the slowness is probably due to the model (specifically, the number of gradient evaluations), not the interface.
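As one illustration of a model-side change, here is a rough sketch that builds the kernel matrix only once. It assumes length_scale stays fixed at 1.0 as in the original post, so the kernel values depend only on data. The names K and L are hypothetical, the functions, data, and parameters blocks are assumed to be unchanged, and whether this helps enough for N around 670 would need to be checked by timing it.

```stan
transformed data {
  // The kernel depends only on x and the fixed length scale, so it can be
  // computed once here rather than on every gradient evaluation.
  matrix[N, N] K;
  for (i in 1:N) {
    for (j in 1:N) {
      K[i, j] = matern_kernel_2_5(x[i]', x[j]', 1.0);
    }
  }
}
model {
  // Only the diagonal noise term involves a parameter.
  matrix[N, N] L = cholesky_decompose(add_diag(K, square(sigma)));

  a ~ normal(0, 1);
  bsize ~ normal(0, 1);
  bstrain ~ normal(0, 1);

  // Same likelihood as multi_normal, parameterised by the Cholesky factor.
  stress ~ multi_normal_cholesky(a + bsize * size + bstrain * strain, L);
}
```

The Cholesky factorisation of an N by N matrix still runs at every gradient evaluation, so this removes the repeated kernel computation but not the cubic cost of the decomposition.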