Is there a difference or advantage to declaring a data vector in one of the following two ways in the data block of a Stan program?
data {
  vector[N] y;
}
versus
data {
  real y[N];
}
I am not seeing a difference in a simple example. From what I'm seeing in the documentation, it seems that either is acceptable.
Mike A
It’s generally better to use vector types:
data {
  vector[N] y;
}
Vectors are more directly compatible with other matrix/vector operations, and they can be used more efficiently in the underlying C++.
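As a rough illustration of the compatibility point, the sketch below standardizes the same data declared both ways. With a vector, the arithmetic can be written directly; elementwise arithmetic is not defined for real arrays, so the array is converted with to_vector first. The names y_vec, y_arr, y_vec_std and y_arr_std are made up for this example.

```stan
data {
  int<lower=2> N;
  vector[N] y_vec;   // vector declaration
  real y_arr[N];     // array declaration (written as array[N] real y_arr in recent Stan releases)
}
transformed data {
  // Vector arithmetic applies directly to a vector.
  vector[N] y_vec_std = (y_vec - mean(y_vec)) / sd(y_vec);

  // Subtraction and division are not defined for arrays,
  // so convert to a vector before doing the arithmetic.
  vector[N] y_arr_std = (to_vector(y_arr) - mean(y_arr)) / sd(y_arr);
}
```

Both declarations hold the same numbers; the difference shows up only in which operations can be applied without a conversion.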
Thanks for your answer. I was wondering the same thing.
So in this model (from Aki Vehtari's case study "Gaussian process demonstration with Stan"), should real xn[N] = to_array_1d((x - xmean)/xsd); be changed to vector[N] xn = (x - xmean)/xsd;? Or do you think there is a reason why xn and x2n are arrays but yn is a vector here?
data {
  int<lower=1> N;      // number of observations
  vector[N] x;         // univariate covariate
  vector[N] y;         // target variable
  int<lower=1> N2;     // number of test points
  vector[N2] x2;       // univariate test points
}
transformed data {
  // Normalize data
  real xmean = mean(x);
  real ymean = mean(y);
  real xsd = sd(x);
  real ysd = sd(y);
  real xn[N] = to_array_1d((x - xmean)/xsd);
  real x2n[N2] = to_array_1d((x2 - xmean)/xsd);
  vector[N] yn = (y - ymean)/ysd;
  real sigma_intercept = 0.1;
  vector[N] zeros = rep_vector(0, N);
}
In that case study, xn and x2n are declared as arrays because they are passed to the gp_exp_quad_cov function, which requires array inputs for those arguments.
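A minimal sketch of why the two types end up side by side, assuming a plain GP regression rather than the case study's full model: the normalized inputs go into gp_exp_quad_cov as a real array, while the normalized targets stay a vector for the multivariate normal. The parameter names lengthscale, sigma_f and sigma_n and the priors are placeholders chosen for this example, not taken from the case study.

```stan
data {
  int<lower=2> N;
  vector[N] x;
  vector[N] y;
}
transformed data {
  // Array, because it is passed to gp_exp_quad_cov below.
  // (Recent Stan releases write this declaration as: array[N] real xn)
  real xn[N] = to_array_1d((x - mean(x)) / sd(x));
  // Vector, because it is used with vector arguments in the likelihood.
  vector[N] yn = (y - mean(y)) / sd(y);
  vector[N] zeros = rep_vector(0, N);
}
parameters {
  real<lower=0> lengthscale;  // hypothetical name
  real<lower=0> sigma_f;      // hypothetical name: GP marginal standard deviation
  real<lower=0> sigma_n;      // hypothetical name: observation noise standard deviation
}
model {
  // Covariance matrix from the array input: (x, sigma, length_scale).
  matrix[N, N] K = gp_exp_quad_cov(xn, sigma_f, lengthscale);
  // Add the noise variance on the diagonal, then factorize.
  matrix[N, N] L = cholesky_decompose(add_diag(K, square(sigma_n)));

  // Placeholder priors, assumed for illustration only.
  lengthscale ~ inv_gamma(5, 5);
  sigma_f ~ normal(0, 1);
  sigma_n ~ normal(0, 1);

  yn ~ multi_normal_cholesky(zeros, L);
}
```

Converting once in transformed data means the to_array_1d call runs a single time rather than on every evaluation of the model block.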