Different intercept terms in Bayesian and Frequentist ARMA

Hello everyone, I need to estimate a ARMA(1,1) model with Stan. My Stan model is as follows:

data {
  int<lower=1> N;            
  real y[N];                 
}

parameters {
  real mu;                   
  real<lower = -1, upper = 1> phi;
  real<lower = -1, upper = 1> theta;
  real<lower=0> sigma;       
}

model {
  real err;             
  mu ~ normal(0, 10);        
  phi ~ normal(0, 2);
  theta ~ normal(0, 2);
  sigma ~ cauchy(0, 5);
  err = y[1] - mu + phi * mu;
  for (n in 2:N) {
    err = y[n] - (mu + phi * y[n-1] + theta * err);
    err ~ normal(0, sigma);
  }
}

This model is basically same as the model in Stan documentation (https://mc-stan.org/docs/2_22/stan-users-guide/autoregressive-moving-average-models.html).

arima function from forecast package estimates this model:

X[t] - m = a[1]X[t-1] + … + a[p]X[t-p] + e[t] + b[1]e[t-1] + … + b[q]e[t-q]

Since, in my situation, p and q equal to 1, above equation induces to

X[t] - m = a[1]X[t-1] + b[1]e[t-1]

which is the same as my Stan model.

I don’t have a problem with AR(1) and MA(1) coefficients, they are similar in Bayesian and Frequentist estimations (aroma from forecast package in R). But the intercept term is differing between models.

This is ARMA(1,1) frequentist estimate:

Coefficients:
        ar1         ma1    intercept
        0.5447  -0.1733    17.3737
s.e.    0.0285   0.0335     0.0416

and this is the Bayesian results:

             mean	    se_mean	    sd	
mu	         8.2029658	0.0162509	0.4749644
phi	         0.5278543	0.0009340	0.0273189
theta	    -0.1524502	0.0010340	0.0312320

There is an interesting fact: the frequentist intercept term nearly equals two times Bayesian intercept.
I couldn’t figure it this difference. Any ideas?

I tried to change prior mean for mu but it doesn’t make any difference in result.

I think if you want to see results closer to the Frequentist ARMA you will be some uniform priors. I can’t remember exactly what the constraints are on the frequentist ARMA models? Hopefully someone else knows.

I tried uniform priors but it didn’t help either.

I just quickly skimmed the code. Aren’t you missing a pair of parentheses?

From

  err = y[1] - mu + phi * mu;

to

  err = y[1] - (mu + phi * mu);

?