Time series modelling: intermediate values

huffyhenry · August 28, 2017, 1:37pm

Hi,

First time poster, so apologies in advance if this kind of question is unsuitable for this forum, or I’m ignoring local etiquette.

I am doing predictive sports modelling in Stan. If you think of a sequence of N team results (scores) as data, and the team’s strength y as a parameter varying in time, then Chapter 10.1 of the manual shows how to do this – have y as a vector, and condition y[k] on y[k-1] as well as on the result of the k-th game. That is, the core part of the model block would look something like this:

for (k in 1:N){
    if (k > 1){
        y[k] ~ normal(y[k-1], sigma);
    }
    score[k] ~ some_distro(some_transform(y[k]));
}

So far so good, but even in this formulation, the final estimate of y[k] is going to take into account the score in all N games played, instead of only games 1 till k-1 (this is a crucial point, so please correct me if I’m wrong). But to analyse the predictive power of the model, I need precisely these intermediate estimates. Do these values even “exist” during a Stan run, and if so, is there a way to smuggle them out? Or perhaps I am missing a clever re-parametrization that would make y[k] mean what I need it to? The only way I found so far is re-running the model on incremental datasets, and that’s very expensive computationally.

Thanks for any and all pointers.

sakrejda · August 28, 2017, 1:53pm

Pretty sure that formulation of AR fails miserably in terms of posterior geometry and the standard solution is to re-parameterize in terms of the incremental difference which can be made N(0,1).
Yes you’re right about this evaluation problem. Without approximations there’s no easy way to get around it. You can make the model simple enough so you can produce estimates via something like a Kalman filter (something that essentially doesn’t look forward when it makes estimates) but that would be a niche solution. Just make the model fast and you can run the different estimates in parallel.

huffyhenry · August 28, 2017, 2:05pm

Thanks Krzysztof. Re: 1, this is a good tip and I’m going to re-implement along these lines. Re 2, can you point me to a canonical implementation of a Kalman filter in Stan that doesn’t do “look-ahead”? Just seeing how it can be done would be very useful to me, even if it’s not the best solution.

sakrejda · August 28, 2017, 2:13pm

I don’t have a model with a Kalman filter in it on hand—just ask a separate question on the forum… I think @jrnold did some models like that a while back. For the AR piece here’s an implementation I do have on hand:

data {
  int T;  // Number of time points.
  int K;  // Number of coefficients.
  int M;  
  int y[T];  // Measurement at each T
  vector[M] kappa; // scaling.
  int T_star;
} 

parameters {
  vector[K] beta;
  vector<lower=0>[1] sigma;
  vector[T] zeta;
}

transformed parameters {
  vector[T] mu;
  mu[1] = zeta[1];
  for (t in 2:T) {
    mu[t] = zeta[t]*sigma[1] + inv_logit(beta[1]) * mu[t-1];
//    print("t: ", t, ", mu[t]: ", mu[t], ", zeta[t]: ", zeta[t], 
//          ", sigma[1]: ", sigma[1], ", beta[1]: ", beta[1], 
//          ", mu[t-1]: ", mu[t-1]
//    );
  }
}

model {
  target += normal_lpdf(beta | 0.0, 1.0);
  target += gamma_lpdf(sigma | 4.0, 10);
  target += normal_lpdf(zeta | 0.0, 1);
  for (t in 1:T) {
    target += poisson_lpmf(y[t] | exp(mu[t]));
  }
}

huffyhenry · August 28, 2017, 2:25pm

Right, since this is essentially the same issue, I won’t open a new thread, but will repeat a general call for a Kalman filter / time series model example w/o look-ahead. Thanks in advance.

sakrejda · August 28, 2017, 2:28pm

¯\_(ツ)_/¯ , this makes it harder for people to find it in the future since the title doesn’t really describe the issue.

huffyhenry · August 28, 2017, 2:31pm

You’re right, I am going to collect my thoughts and post another question.

Bob_Carpenter · August 29, 2017, 11:28pm

See: Kalman filter in Stan · GitHub

Then someone wrote a followup, which I found, but haven’t read:

Bob_Carpenter · August 29, 2017, 11:30pm

That one from Juho Kokkala’s case study was way out of date. You want this sweet bookdown project:

huffyhenry · August 30, 2017, 1:00pm

Thanks. This looks like essential reading for me. Thanks to @jrnold too!

Topic		Replies	Views
AR/Kalman-like models where the k-th estimate of a variable depends only on obs 1..k-1 Modeling	11	1402	August 29, 2017
State space / kalman filtering advice for inverse model Modeling	13	3875	May 1, 2019
Predictions Garch model - one step ahead Modeling	1	946	December 5, 2018
Kalman filter runs very slow for over 400 time-series Modeling rstan , techniques , fitting-issues , specification , performance	16	1137	September 2, 2021
Kalman-type model Modeling techniques , cmdstanpy	2	408	January 9, 2024

Time series modelling: intermediate values

Related topics