I’m new to Stan and not an expert on statistics, so I hope my question doesn’t seem too trivial.
I’m mostly interested in time series analysis, so I’m starting with a simple AR(K) model to learn the basics of Stan. The user’s guide shows this example of a model:
data {
  int<lower=0> K;
  int<lower=0> N;
  real y[N];
}
parameters {
  real alpha;
  real beta[K];
  real sigma;
}
model {
  for (n in (K+1):N) {
    real mu = alpha;
    for (k in 1:K)
      mu += beta[k] * y[n-k];
    y[n] ~ normal(mu, sigma);
  }
}
which is slightly different than the model I originally wrote:
data {
  int<lower=0> N;
  int<lower=0> K;
  vector[N] y;
}
parameters {
  real alpha;
  row_vector[K] beta;
  real<lower=0> sigma;
}
model {
  for (n in (K+1):N)
    y[n] ~ normal(alpha + beta * y[(n-K):(n-1)], sigma);
}
The results are the same on my test data, but I want to make sure that I’m not developing bad habits:
Is there a fundamental difference between real y[N] and vector[N] y? Which one should be preferred for time series data?
Same question for real beta[K] and row_vector[K] beta: the latter lets me avoid having two nested for loops when defining the model. Maybe this is not relevant?
As far as I know, the main difference between an array of reals (real y[N]) and a vector is that a vector supports matrix-vector arithmetic while an array of reals doesn’t. That answers your second question as well: a row_vector * vector multiplication is possible (as long as both have the same length) and gives a single real, but the same product is not defined for arrays of reals, so the array version needs the additional loop.
By the way, I assume P in your model is a typo and should be K, right?
I’ll just add that there are functions like to_vector, to_row_vector and to_array_1d that let you convert between the types.
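To make the conversions concrete, here is a minimal sketch. The variable names (y_arr, y_vec, y_row, y_back) are made up for illustration, and it uses the same array syntax as the models in this thread; more recent Stan releases declare arrays as array[N] real y_arr instead.

```stan
data {
  int<lower=0> N;
  real y_arr[N];                                // array of reals, as in the user's guide example
}
transformed data {
  vector[N] y_vec = to_vector(y_arr);           // array -> column vector
  row_vector[N] y_row = to_row_vector(y_arr);   // array -> row vector
  real y_back[N] = to_array_1d(y_vec);          // vector -> array of reals
}
```

So if the data arrives as an array you can still convert it once in transformed data and use the vector form in the model block.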
I’d just say that in most cases, it is customary to prefer vector and row_vector over real[], but it is mostly an aesthetic choice.
The dot product of vector and row_vector will likely be slightly faster to compute than the for loop, but for most use cases it should not matter much.
Best of luck with your model!
(EDIT: Removed reaction to Maurits’ statement that he removed :-D )
I also happen to be experimenting with the same model – I found it quite appealing to learn to write custom functions for the model so that you can just write something like y ~ ar_model(alpha, beta, sigma);. The relevant docs are here.
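In case it helps, this is roughly what such a function could look like. It is only a sketch built on the vector / row_vector model from the first post, not code from the docs: the name ar_model comes from the statement above, the argument list is my assumption, and Stan needs the function to be defined with the _lpdf suffix before it can be used with ~.

```stan
functions {
  // Hypothetical AR(K) log density; signature assumed for illustration.
  real ar_model_lpdf(vector y, real alpha, row_vector beta, real sigma) {
    int N = rows(y);
    int K = cols(beta);
    real lp = 0;
    for (n in (K + 1):N) {
      // beta[1] multiplies the oldest lag y[n - K], beta[K] the newest y[n - 1]
      lp += normal_lpdf(y[n] | alpha + beta * y[(n - K):(n - 1)], sigma);
    }
    return lp;
  }
}
data {
  int<lower=0> N;
  int<lower=0> K;
  vector[N] y;
}
parameters {
  real alpha;
  row_vector[K] beta;
  real<lower=0> sigma;
}
model {
  y ~ ar_model(alpha, beta, sigma);
}
```

One thing to keep in mind with this slicing: the coefficients come out in the reverse order compared to the user's guide loop, which pairs beta[k] with y[n-k].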