Time series with multiplicative noise

Hi there!

I am new to Stan (and to time series) and I like the reasoning behind probabilistic programming :-) However, I am having trouble estimating uncertainty for a time series with multiplicative noise. Out of laziness I asked this question earlier this week over at Cross Validated without a satisfying answer yet, so I am trying it here too.

Say we have a monthly time series y_t \geq 0 dominated by seasonality, where the absolute differences from year to year are much smaller during the low season. To avoid negative values and to capture the higher uncertainty during the high season, I would suggest fitting such a time series as

y_t = \alpha \cdot y_{t-12} \cdot \varepsilon_t

with e.g.

\varepsilon_t \sim \mathcal{N}(1,\, \sigma).

I would then argue that y_t \sim \mathcal{N}(\alpha \cdot y_{t-12}, \,\sigma \cdot \alpha \cdot y_{t-12}), leading to the following PyStan example with simulated data:

import numpy as np
import pystan

code = """
data {
    int<lower=0> N;
    vector[N] y;
}
parameters {
    real alpha;
    real<lower=0> sigma;
}
model {
    y[13:N] ~ normal(alpha * y[1:N-12], sigma * alpha * y[1:N-12]);
}
"""

n_years = 4
y = (np.sin(np.linspace(0, n_years*np.pi, n_years*12))**2  + 0.05) * (1 + np.random.randn(n_years*12)/10)

season = pystan.StanModel(model_code=code)
fit = season.sampling(data={'N': len(y), 'y': y}, iter=1000, chains=4)

The resulting posterior for \alpha seems reasonable to me, but \sigma ends up around 0.2 while I used 0.1 to generate the data:

        mean se_mean     sd   2.5%    25%    50%    75%  97.5%  n_eff   Rhat
alpha    1.0  1.0e-3   0.04   0.93   0.98    1.0   1.02   1.07   1188    1.0
sigma   0.21  8.3e-4   0.03   0.17   0.19   0.21   0.23   0.28   1091    1.0
lp__   73.14    0.04   1.05  70.32  72.74  73.47  73.88  74.14    891    1.0

Furthermore, if I now draw \alpha and \sigma and plot the fit, I get the following quartile bands:
[plot: data with the posterior quartile bands]
Pretty much half of the data points lie between the first and the third quartile, but the minimum and maximum during the high season seem fairly extreme to me …
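
(Roughly how I draw the bands; a simplified sketch, not my exact plotting code:)

import matplotlib.pyplot as plt

samples = fit.extract()                       # posterior draws for alpha and sigma
alpha_s, sigma_s = samples['alpha'], samples['sigma']

# one posterior-predictive replication of y[13:N] per posterior draw
mu = alpha_s[:, None] * y[None, :-12]         # shape: (n_draws, N - 12)
y_rep = np.random.normal(mu, sigma_s[:, None] * mu)

q25, q50, q75 = np.percentile(y_rep, [25, 50, 75], axis=0)
t = np.arange(12, len(y))
plt.plot(y, label='data')
plt.fill_between(t, q25, q75, alpha=0.3, label='interquartile band')
plt.plot(t, q50, label='median')
plt.legend()
plt.show()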

What is the reason for this large uncertainty during high season? What would be a good model for forecasting that kind of data?

Thank you!

One might argue that multiplicative errors should follow a log-normal distribution, and I finally got something with log-normally distributed errors going: as the median of \textrm{Lognormal}(\mu, \sigma) is e^\mu, the model

y_t \sim \textrm{Lognormal}(\log(\alpha \cdot y_{t-12}), \,\sigma)

might be more reasonable. It results in fairly similar posteriors for \alpha and \sigma:

        mean se_mean     sd   2.5%    25%    50%    75%  97.5%  n_eff   Rhat
alpha   1.01  9.0e-4   0.03   0.95   0.99   1.01   1.03   1.07   1255    1.0
sigma   0.18  6.9e-4   0.02   0.14   0.17   0.18    0.2   0.24   1209    1.0
lp__   41.44    0.05   1.12  38.68  41.03   41.8  42.23  42.52    601    1.0
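
For reference, the change to the Stan program is essentially just the likelihood (a sketch, not my verbatim code; alpha gets a lower bound so the log is defined):

code_lognormal = """
data {
    int<lower=0> N;
    vector[N] y;
}
parameters {
    real<lower=0> alpha;     // positive, so log(alpha * y) is defined
    real<lower=0> sigma;
}
model {
    y[13:N] ~ lognormal(log(alpha * y[1:N-12]), sigma);
}
"""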

And the generated fits look somewhat more reasonable, at least around the minimum:
[plot: fit with log-normally distributed errors]

So should I just assume an error distribution with lighter tails? What would be a justification for this? I feel like I am missing quite some things here :-)

Cheers!

You just have a lot of uncertainty with limited data, but you don't see it when the process is near zero because you're multiplying that large uncertainty by something near zero. That is just how multiplicative uncertainty works.
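
For instance, plugging in roughly your posterior means, the predictive sd of your model scales directly with the level of the series:

alpha, sigma = 1.0, 0.2            # roughly the posterior means from the fit above
for level in [0.05, 0.5, 1.05]:    # low season, shoulder, high season
    print(level, sigma * alpha * level)   # predictive sd at that level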


What you’re seeing is pretty standard for time series, and for processes with sequential errors in general, when considering full posterior uncertainty.

If I use 40 years instead of 4, \sigma indeed decreases a little (which is, to me, rather alarming; shouldn't only the spread of the posterior change when adding more data to a good model?):

        mean se_mean     sd   2.5%    25%    50%    75%  97.5%  n_eff   Rhat
alpha    1.0  1.4e-4 6.3e-3   0.99   0.99    1.0    1.0   1.01   2165    1.0
sigma   0.14  1.4e-4 4.6e-3   0.13   0.14   0.14   0.14   0.15   1156    1.0
lp__  689.34    0.03   1.01 686.47 688.93 689.67 690.06 690.32    992    1.0

But this still seems inconsistent with the way I generated the data, i.e. (np.sin(np.linspace(0, n_years*np.pi, n_years*12))**2 + 0.05) * np.random.lognormal(0, 0.1, n_years*12), and the 90% bands are still rather pessimistic:
[plot: 90% bands with 40 years of data]

Mean \sigma should definitely decrease as you add data, because the posterior is positively skewed, so as it narrows the mean becomes smaller.
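
Toy illustration (using a log-normal just as an example of a positively skewed distribution):

import numpy as np

# same median, smaller spread => smaller mean, because the distribution is right-skewed
for s in [0.5, 0.1]:
    draws = np.random.lognormal(mean=np.log(0.1), sigma=s, size=100_000)
    print(s, np.median(draws), draws.mean())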

I’m not sure this model is a good model for this process. Your model is not stationary in a short time window, so your \alpha estimate comes out to be about 1 (as expected). The generating model actually alternates between being slightly auto-regressive (\alpha a little below 1) and slightly explosive (\alpha above 1). A good model might be a hierarchical model with an \alpha_m for each part of the season and a mean \alpha for each year that evolves seasonally (that would be stationary).
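
Something like this, as a rough, untested sketch (only the per-month part, not the yearly evolution; the month indexing assumes the series starts in month 1):

code_seasonal = """
data {
    int<lower=0> N;
    vector[N] y;
}
parameters {
    vector<lower=0>[12] alpha_m;   // one multiplier per calendar month
    real mu_alpha;                 // common log-scale mean of the monthly multipliers
    real<lower=0> tau;             // how much the monthly multipliers vary
    real<lower=0> sigma;
}
model {
    alpha_m ~ lognormal(mu_alpha, tau);   // partial pooling of the monthly alphas
    for (t in 13:N)
        y[t] ~ lognormal(log(alpha_m[1 + ((t - 1) % 12)] * y[t - 12]), sigma);
}
"""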

Thanks for your comments regarding the shifting mean. It feels rather extreme, but I guess I can live with this :-)

But I cannot follow your argument on the second part; would you mind elaborating a little? How does data like y_t = (\sin^2(t) + \varepsilon) \cdot \nu_t seasonally switch from regressive to explosive?

I didn’t see \nu_t in the previous models (?), but I’ll assume I’m not missing anything, because what I’m saying is true of fitting a mean plus AR process to finely sampled seasonal data. Think of the up-swing and decay phases of your process as separate data. To simulate only the up-swing with an AR process you need an \alpha_t between 1 and 2, depending on the part of the season (judging from your first plot with just four years). To simulate only the decay to zero (as seen in your plot) you need an \alpha_t near 0.8. In both cases you need a very small noise term. If you fit the same \alpha to both parts you get something with an \alpha around 1 and not a very good fit. You could make a model with a seasonal \alpha that would fit and predict better, if you had enough data to estimate it.

I ignored the effect of the intercept, but adding it only changes the details (it turns into decay -> explode -> decay -> explode per season rather than just two phases). If you want to fit an AR process to something like this, I suggest simulating from one and seeing if you can produce data remotely like what you want to fit; stationary AR processes are really limited.
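
Something along these lines, just to get a feel for what an AR(1) can do (the numbers are arbitrary):

import numpy as np
import matplotlib.pyplot as plt

# simulate AR(1) paths x[t] = a * x[t-1] + noise for a few values of a
T = 48
for a, label in [(0.8, 'decaying, a = 0.8'), (1.0, 'random walk, a = 1.0'), (1.1, 'explosive, a = 1.1')]:
    x = np.empty(T)
    x[0] = 1.0
    for t in range(1, T):
        x[t] = a * x[t - 1] + 0.02 * np.random.randn()
    plt.plot(x, label=label)
plt.legend()
plt.show()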

Thanks for the clarification. \nu_t is just the noise term, because I use \varepsilon as a small constant to avoid zeros.

I am not using a standard AR process; I use the value from last year, y_{t-12}, not the value from last month. Once I can recover those parameters (or understand why I can't), I plan to take maybe the (geometric) mean of the last two years' values and add a model for the trend.

I missed the seasonal bit, but then why no mean parameter? With a mean, \alpha would be near zero.

Well, seasonal intercept

In this super simple case I just use \alpha to make sure the thing fits something reasonable; no deep meaning in that.


The problem above was that the simulated data is shifted in phase. One reasonable way to generate the data would be y = (np.sin(np.linspace(0, n_years*np.pi, n_years*12+1))**2 + 0.05) * np.random.lognormal(0, 0.1, n_years*12+1), where the +1 makes the difference: with n_years*12+1 points the grid spacing is exactly \pi/12, so the seasonal pattern repeats exactly every 12 samples and y_{t-12} sits at the same phase as y_t.

With the data in phase the resulting fit gets better:

        mean se_mean     sd   2.5%    25%    50%    75%  97.5%  n_eff   Rhat
alpha   0.99  6.0e-4   0.02   0.95   0.98   0.99    1.0   1.04   1237    1.0
sigma   0.13  4.2e-4   0.02    0.1   0.12   0.13   0.14   0.16   1362    1.0
lp__   55.94    0.04   1.04   53.1  55.48  56.28  56.72  56.97    857    1.0

And the fit can be further improved by taking the (geometric) average of the last two years:

        mean se_mean     sd   2.5%    25%    50%    75%  97.5%  n_eff   Rhat
sigma   0.11  6.1e-4   0.02   0.08    0.1   0.11   0.12   0.15    704    1.0
lp__   41.34    0.02   0.69  39.33  41.18   41.6  41.77  41.82    846    1.0

Resulting in fairly acceptable fits (to be honest, for most random data it looks slightly worse; this run was particularly lucky ;-)):
[plot: fit with in-phase data and the geometric-mean model]
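
For reference, the geometric-mean likelihood is essentially this (a sketch, not my verbatim code; alpha is omitted, as in the summary above):

code_geomean = """
data {
    int<lower=0> N;
    vector[N] y;
}
parameters {
    real<lower=0> sigma;
}
model {
    // geometric mean of the same month in the two previous years
    for (t in 25:N)
        y[t] ~ lognormal(0.5 * (log(y[t - 12]) + log(y[t - 24])), sigma);
}
"""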
