Single source of error seasonal local level model: is this correct?

I am trying to gradually build up some single source of error state space models for revenue forecasting. I’m a relative newbie to STAN and want to ensure that my code is correct. I think it is, but I need feedback.

I’m building upon the multiple source of error (MSOE) state space code here:

data {
  int<lower=1> n;
  vector[n] y;
}
parameters {
  vector[n] mu;
  vector[n] seasonal;
  real<lower=0> sigma_level;
  real<lower=0> sigma_seas;
  real<lower=0> sigma_irreg;
}
transformed parameters {
  vector[n] yhat;
  yhat = mu + seasonal;
}
model {
  for(t in 4:n)
seasonal[t] ~ normal(- sum(seasonal[t-3:t-1]), sigma_seas);

  for(t in 2:n)
mu[t] ~ normal(mu[t-1], sigma_level);

  y ~ normal(yhat, sigma_irreg);
}

However, that does not use a single source of error, as I want. As a first pass, I want to use Hyndman’s exponential smoothing models as templates (e.g. here). The key difference between the SSOE and MSOE models is that in the code above, sigma_level and sigma_seas are multiples of the residual sigma_irreg. To use just a level example (omitting the seasons for space):

y_t = l_{t-1} + e_t
l_t = l_{t-1} + \alpha e_t

So if we know what l_{t-1} is, the expectation and variance of l_t are l_{t-1} and \alpha^2 \sigma^2, if e_t \sim N(0, \sigma^2). I think. I hope.

Because it lets me update the above code to:

data {
  int<lower=1> n;
  vector[n] y;
  int<lower=1> S;              // Seasons
}
parameters {
  vector[n] mu;
  vector[n] seasonal;
  real<lower=0, upper=1> alpha;                  // Level smoother
  real<lower=0, upper=1> gamma;                  // Seasonal smoother
  real<lower=0> sigma_irreg;
}
transformed parameters {
  real<lower=0> sigma_level;
  sigma_level = alpha * sigma_irreg;
  real<lower=0> sigma_seas;
  sigma_seas  = gamma * sigma_irreg;
  vector[n] yhat;
  yhat = mu + seasonal;
}
model {
  // Priors
  alpha ~ beta(2, 1);
  gamma ~ beta(2, 2);
  // Model
  for(t in S:n)
    seasonal[t] ~ normal(- sum(seasonal[t-S+1:t-1]), sigma_seas);

  for(t in 2:n)
    mu[t] ~ normal(mu[t-1], sigma_level);

  y ~ normal(yhat, sigma_irreg);
}

Anyway, I would greatly appreciate feedback. I deeply hope this is correct, because there is no way in hell I’m ever going to program a Kalman filter and I would love to actually use Stan and Bayes in my work.

1 Like

My first question would be does the local level and seasonal base model run with simulated data that looks like your real data and recovers the known parameters. Second would be does it run with your real data?

Good question. I will create fake data with parameters and try to recover it today.

The model does run with real data and it does seem to fit well. I feel optimistic but want to check it to be sure. Worst case scenario, I might try to rewrite it in error-correction form.

I’ll update soon.

1 Like

Long story short, it does run and I think it is correct. I recently found a resource that uses a very similar approach (here. This seems kosher but does not rely on innovations so much as errors (innovations, as I understand it, being the error from the previous period’s forecast).