Simulate the data of ARMA

I’m new to Stan and met a problem of my first try. I’m using the ARMA from the book to simulate a few data of the future. but it showed me errors in the block. I think the generated quantities block is correct but it has problems.
the code is:

ARMA_code ="""
data {
  int<lower=1> T;
  real y[T];
  int<lower=0> n_new;
}
parameters {
  real mu;
  real phi;
  real theta;
  real<lower=0> sigma;
} 

model {
   // real err;
    vector[T] nu;
    vector[T] err;
    nu[1] <- mu + phi * mu;
    err[1] <- y[1] - nu[1];
    for (t in 2:T) {
        nu[t] <- mu + phi * y[t-1] + theta * err[t-1];
        err[t] <- y[t] - nu[t];
        }
    err ~ normal(0,sigma);
    mu ~ normal(0,10);
    phi ~ normal(0,2);
    theta ~ normal(0,2);
    sigma ~ cauchy(0,5);
 }

generated quantities{
   vector[n_new+T] y_new;
  // y_new[1:T] <- mu;
   vector[n_new] x_pred;   
   y_new[T+1] = mu+phi*y[T]+theta*err+sigma;
   for (i in 2:n_new){
       y_new[T+i] = normal_rng(mu+phi*y_new[T+i-1]+theta*err, sigma);
    }
}
"""

Here is the error in detail:

ValueError: Failed to parse Stan model 'anon_model_983af5eff8a1020ec870838c279bf070'. Error message:
SYNTAX ERROR, MESSAGE(S) FROM PARSER:
Info: assignment operator <- deprecated in the Stan language; use = instead.
Info: assignment operator <- deprecated in the Stan language; use = instead.
Info: assignment operator <- deprecated in the Stan language; use = instead.
Info: assignment operator <- deprecated in the Stan language; use = instead.
Variable "err" does not exist.
 error in 'unknown file name' at line 35, column 37
  -------------------------------------------------
    33:   // y_new[1:T] <- mu;
    34:    vector[n_new] x_pred;   
    35:    y_new[T+1] = mu+phi*y[T]+theta*err+sigma;
                                            ^
    36:    for (i in 2:n_new){
  ------------------------------------------------

Additive question:
Can anyone explain what is the process inner of stan when fitting? I already read papers but still confuse.

1 Like

Morning and welcome! First off all your <- need to be changed to = . And it’s a bit hard to read you model code so let me see if I can reformat that for you.

1 Like

So the code should look more like this

data {
  int<lower=1> T;            // num observations
  real y[T];                 // observed outputs
}
parameters {
  real mu;                   // mean coeff
  real phi;                  // autoregression coeff
  real theta;                // moving avg coeff
  real<lower=0> sigma;       // noise scale
}
model {
  vector[T] nu;              // prediction for time t
  vector[T] err;             // error for time t
  nu[1] = mu + phi * mu;    // assume err[0] == 0
  err[1] = y[1] - nu[1];
  for (t in 2:T) {
    nu[t] = mu + phi * y[t-1] + theta * err[t-1];
    err[t] = y[t] - nu[t];
  }
  mu ~ normal(0, 10);         // priors
  phi ~ normal(0, 2);
  theta ~ normal(0, 2);
  sigma ~ cauchy(0, 5);
  err ~ normal(0, sigma);    // likelihood
}

but add in the generate block. This is from the manual here: https://mc-stan.org/docs/2_19/stan-users-guide/autoregressive-moving-average-models.html

thanks for the suggestion.
I want to predict the future data in generated quantities block. confuse how to fix it?

[quote=“S_mengyuan, post:1, topic:13108”]

generated quantities{
   vector[n_new+T] y_new;
   vector[n_new] x_pred;   
   y_new[1:T] = mu;
   y_new[T+1] = mu+phi*y[T]+theta*err+sigma;
   for (i in 2:n_new){
       y_new[T+i] = normal_rng(mu+phi*y_new[T+i-1]+theta*err, sigma);
    }
}

not sure what you’re trying to do, but your generated quantities block should look something like the above.
your original program had some lines commented out - I uncommented them, and put the data declarations before assignments.
try using CmdStanR or CmdStanPy to get the new Stan3 parser which has better error messages.

3 Likes

I want to predict the future n_new data.
I tried your block.

 error in 'unknown file name' at line 32, column 3
  -------------------------------------------------
    30:    vector[n_new+T] y_new;
    31:    vector[n_new] x_pred;   
    32:    y_new[1:T] = mu;
          ^
    33:    y_new[T+1] = mu+phi*y[T]+theta*err+sigma;
  -------------------------------------------------

PARSER EXPECTED: <one of the following:
  a variable declaration, beginning with type
      (int, real, vector, row_vector, matrix, unit_vector,
       simplex, ordered, positive_ordered,
       corr_matrix, cov_matrix,
       cholesky_corr, cholesky_cov
  or a <statement>
  or '}' to close variable declarations and definitions>

hi S, in the Forums posts, use 3 backticks for blockquotes.

I would look in the language ref manual or the users manual for more information on the generated quanitites block. the error message indicates that the kind of vectorized assigment you’re trying to do isn’t allowed.

do you think the logic of genertated quantities block is corrext?

ARMA_code ="""

data {
  int<lower=1> T;            // num observations
  real y[T];                 // observed outputs
  int<lower=0> n_new;
}
parameters {
  real mu;                   // mean coeff
  real phi;                  // autoregression coeff
  real theta;                // moving avg coeff
  real<lower=0> sigma;       // noise scale
  vector[T] output;
}
model {

  for (x in 1:T) {
      output[x] = y[x];
      }
  
  vector[T] nu;              // prediction for time t
  vector[T] err;             // error for time t
  nu[1] = mu + phi * mu;    // assume err[0] == 0
  err[1] = y[1] - nu[1];
  for (t in 2:T) {
    nu[t] = mu + phi * y[t-1] + theta * err[t-1];
    err[t] = y[t] - nu[t];
  }
  mu ~ normal(0, 10);         // priors
  phi ~ normal(0, 2);
  theta ~ normal(0, 2);
  sigma ~ cauchy(0, 5);
  err ~ normal(0, sigma);    // likelihood
}

generated quantities{
   vector[n_new+T] y_new;
   vector[n_new] x_pred;   
   y_new[1:T] = output;
   y_new[T+1] = mu+phi*y[T]+theta*err+sigma;
   for (i in 2:n_new){
       y_new[T+i] = normal_rng(mu+phi*y_new[T+i-1]+theta*err, sigma);
    }
}

here is the full code? can you correct where is wrong?

Did you manage to resolve this? Also note that if your goal is to just run the model (as opposed to learning Stan to be able to do advanced modelling), you should be IMHO able to setup an ARMA model using brms, which has much friendlier learning curve and has facilities to predict and other common operations out of the box.

Best of luck with your model!

thanks, I just want to learn it and the problem was solved.

If you could share your final code here, for possible future inquirers, it would be great! Thanks!

data {
  int<lower=1> T;            // num observations
  real y[T];                 // observed outputs
  int<lower=0> n_new;
}
transformed data{
    int<lower=0> x=T+n_new;
}
parameters {
  real mu;                   // mean coeff
  real phi;                  // autoregression coeff
  real theta;                // moving avg coeff
  real<lower=0> sigma;       // noise scale
}
model { 
  vector[x] nu;              // prediction for time t
  vector[T] err;             // error for time t
  nu[1] = mu + phi * mu;    // assume err[0] == 0
  err[1] = y[1] - nu[1];
  for (t in 2:T) {
    nu[t] = mu + phi * y[t-1] + theta * err[t-1];
    err[t] = y[t] - nu[t];
    }
  for(t in (T+1):x) {
     nu[t] = mu + phi * nu[t-1] + theta*(nu[t-1]-nu[t-2]);
      }  
  mu ~ normal(0, 10);         // priors
  phi ~ normal(0, 2);
  theta ~ normal(0, 2);
  sigma ~ cauchy(0, 5);
  err ~ normal(0, sigma);    // likelihood
}
generated quantities{
 vector[x] prediction;
 for(t in 1:T){
    prediction[t] = y[t];
 }
 for(t in T:x){ 
    prediction[t] = mu+phi*prediction[t-1]+theta*(prediction[t-1]-prediction[t-2]); 
 } 
}

here is the code, but I’m not sure how to do MA part after existing series. alternatively, i just used random walk as error term.
hope it is helpful!

According to:

http://www-stat.wharton.upenn.edu/~stine/insr260_2009/lectures/arma_forc.pdf

for unknown error, the error is 0 . Thus err[T] is your last error, err[T+1:x] is 0.

for(t in 1:T){
prediction[t] = y[t];
}

shouldn’t that be nu[t] instead of y[t]?

1 Like

Thanks, I’m exploring the unknown error right now and try to use a acceptable to represent it. As you said err[T+1:x]=0 is one way, but I used the random walk to represent the error after T, maybe it’s not good. but I will try to using statistical model to simulate the errors after T.
it should be nu[t] in stead of y[t], I just want to plot the original value so that I used y[t].

1 Like

Hi if you still have some struggles you can see my varstan package for time series models

I have implemented, SARIMA, GARCH VARMA, BEKK, and Dynamic Regression

Is still on developing so installing it could be a little bit hard but you can try it or I can tell you where you can find what you are looking at

2 Likes

thanks, that’s good.

If I want to use arma to predict the future value. do you think my model works or I should do something elso? I’m new to stan.

Instead of this

for(t in T:x){
prediction[t] = mu+phiprediction[t-1]+theta(prediction[t-1]-prediction[t-2]);
}

Put:

for(t in T:x){
prediction[t] = normal_rng(mu+phiprediction[t-1]+theta(prediction[t-1]-prediction[t-2]) ,sigma) ;
}

Because you wanna have a sample of the 1-step ahead predictivie distribution.
Actually a forecast is a probability distribution.

1 Like

do you think I should drop theta(prediction[t-1]-prediction[t-2]) out instead, because I’m doing the test, I should make the model as simple as possible first.