Implementing an autoregressive distributed lag (ADL) model

I am trying to implement an autoregressive distributed lag (ADL) model, something to the effect of this:

y_{t} = \alpha + \sum\limits_{j=1}^{p}\phi_{j} y_{t-j} + \sum\limits_{j=0}^{q_1}\beta_{1,j} x_{1,t-j} + \dots + \sum\limits_{j=0}^{q_k}\beta_{k,j} x_{k,t-j} + \epsilon_{t}

It’s commonly used in economics and ecology, but I haven’t found a Stan example of it yet.
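
For concreteness, a direct implementation of that equation might look roughly like the sketch below, assuming a common lag order q for every regressor and conditioning on the first max(p, q) observations; all names here are illustrative, not something I have tested:

data {
  int<lower=1> N;           // number of observations
  vector[N] y;              // response
  int<lower=1> K;           // number of exogenous regressors
  matrix[N, K] x;           // exogenous regressors
  int<lower=1> p;           // AR order
  int<lower=0> q;           // distributed-lag order (same for all regressors)
}
transformed data {
  int L = max(p, q);        // initial observations to condition on
}
parameters {
  real alpha;               // intercept
  vector[p] phi;            // AR coefficients
  matrix[K, q + 1] beta;    // lag coefficients, lags 0..q
  real<lower=0> sigma;      // residual SD
}
model {
  vector[N] mu = rep_vector(alpha, N);
  for (t in (L + 1):N) {
    for (j in 1:p) {
      mu[t] += phi[j] * y[t - j];
    }
    for (k in 1:K) {
      for (j in 0:q) {
        mu[t] += beta[k, j + 1] * x[t - j, k];
      }
    }
  }
  // weakly informative priors; adjust to the scale of the data
  alpha ~ student_t(3, 0, 10);
  phi ~ normal(0, 0.5);
  to_vector(beta) ~ normal(0, 1);
  sigma ~ student_t(3, 0, 10);
  // condition on the first L observations
  y[(L + 1):N] ~ normal(mu[(L + 1):N], sigma);
}

Here the first max(p, q) values of y are simply treated as fixed; they could instead be modelled explicitly, or the lags could be pre-computed and passed in as data.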

I’m somewhat stuck as to what would be best to do after I add the AR(p) component; I’ve had a few attempts without success. This is the brms-generated code that I’ve been using as my base model, and I’ll explain what I’ve tried to do to it:

data {
  int<lower=1> N;  // number of observations
  vector[N] Y;  // response variable
  int<lower=1> K;  // number of population-level effects
  matrix[N, K] X;  // population-level design matrix
  // data needed for ARMA correlations
  int<lower=0> Kar;  // AR order
  int<lower=0> Kma;  // MA order
  // number of lags per observation
  int<lower=0> J_lag[N];
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
  int Kc = K - 1;
  matrix[N, Kc] Xc;  // centered version of X without an intercept
  vector[Kc] means_X;  // column means of X before centering
  int max_lag = max(Kar, Kma);
  for (i in 2:K) {
    means_X[i - 1] = mean(X[, i]);
    Xc[, i - 1] = X[, i] - means_X[i - 1];
  }
}
parameters {
  vector[Kc] b;  // population-level effects
  // temporary intercept for centered predictors
  real Intercept;
  real<lower=0> sigma;  // residual SD
  vector[Kar] ar;  // autoregressive effects
}
model {
  // initialize linear predictor term
  vector[N] mu = Intercept + Xc * b;
  // objects storing residuals
  matrix[N, max_lag] Err = rep_matrix(0, N, max_lag);
  vector[N] err;
  // include ARMA terms
  for (n in 1:N) {
    err[n] = Y[n] - mu[n];
    for (i in 1:J_lag[n]) {
      Err[n + 1, i] = err[n + 1 - i];
    }
    mu[n] += head(Err[n], Kar) * ar;
  }
  // priors including all constants
  target += student_t_lpdf(Intercept | 3, 0, 10);
  target += student_t_lpdf(sigma | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10);
  // likelihood including all constants
  if (!prior_only) {
    target += normal_lpdf(Y | mu, sigma);
  }
}
generated quantities {
  // actual population-level intercept
  real b_Intercept = Intercept - dot_product(means_X, b);
}

My strategy was to get the AR(p) component from this:

y_{t} = \alpha + \sum\limits_{j=1}^{p}\phi_{j} y_{t-j} + \epsilon_{t}

then repeat the sum for the lagged exogenous regressors, substituting \phi with \beta and reindexing with a max_lag_beta = max_lag + 1, err_beta, etc. I am a little lost when it comes to how to change this:

for (n in 1:N) {
  err[n] = Y[n] - mu[n];
  for (i in 1:J_lag[n]) {
    Err[n + 1, i] = err[n + 1 - i];
  }
  mu[n] += head(Err[n], Kar) * ar;
}

Would the lagged effects need to be included at the initialisation stage?


  // initialize linear predictor term
  vector[N] mu = Intercept + Xc * b;
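
i.e. something along these lines, assuming the lagged covariates were pre-built into an N × K_lag matrix passed in as data (X_lag, b_lag and K_lag are only illustrative names)?

// sketch only: X_lag holds the lagged covariate columns, b_lag their coefficients
vector[N] mu = Intercept + Xc * b + X_lag * b_lag;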

My next strategy was to nest a modified version of the AR(p)-like section inside a for loop over k of the M × N matrix, i.e. to compute

\sum\limits_{j=0}^{q_1}\beta_{1,j} x_{1,t-j} + \dots + \sum\limits_{j=0}^{q_k}\beta_{k,j} x_{k,t-j}

This would model the errors over k = 1, ..., K as iid, but they clearly follow a warped normal distribution. Again, I’m unsure how to go about doing this, and I suspect it would be a really inefficient/computationally expensive parameterisation.
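
For what it’s worth, the kind of modification I had in mind for that loop would be roughly the following sketch (x, beta and q would be hypothetical additions to the data and parameters blocks; note the brms base model applies the AR terms to the residuals rather than to y itself, so this is not exactly the ADL equation above):

// sketch only: x is an N x K matrix of exogenous regressors and beta a
// K x q matrix of lag coefficients for lags 1..q (the contemporaneous,
// lag-0 effects are already in Xc * b)
for (n in 1:N) {
  // distributed-lag contribution, added before the residual is formed
  for (k in 1:K) {
    for (j in 1:min(q, n - 1)) {
      mu[n] += beta[k, j] * x[n - j, k];
    }
  }
  err[n] = Y[n] - mu[n];
  for (i in 1:J_lag[n]) {
    Err[n + 1, i] = err[n + 1 - i];
  }
  mu[n] += head(Err[n], Kar) * ar;
}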

@mansillo I wonder whether you can generate design matrices that represent the autoregressive distributed lag effects using the dlnm R package, which includes functions to generate design matrices for distributed lag (linear and nonlinear) models.

If so, then your Stan model might be considerably simpler.
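
For example, if the crossbasis matrix from dlnm were exported and passed to Stan as data, the whole lag structure would reduce to an ordinary linear term, roughly like this sketch (X_cb, K_cb and b_cb are illustrative names, and the AR part is omitted):

data {
  int<lower=1> N;
  vector[N] Y;
  int<lower=1> K_cb;       // number of crossbasis columns produced by dlnm
  matrix[N, K_cb] X_cb;    // distributed-lag (crossbasis) design matrix
}
parameters {
  real Intercept;
  vector[K_cb] b_cb;       // one coefficient per crossbasis column
  real<lower=0> sigma;
}
model {
  // the lag structure is absorbed into X_cb, so this is plain regression
  vector[N] mu = Intercept + X_cb * b_cb;
  Intercept ~ student_t(3, 0, 10);
  b_cb ~ normal(0, 1);
  sigma ~ student_t(3, 0, 10);
  Y ~ normal(mu, sigma);
}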

@mbjoseph my (temporary) way around this has been to make a lagged design matrix in R and then add it to the stationary design matrix in the data. That was simple, but I am also concerned that having the lag(s) will induce enough multicollinearity that the estimates could be unreliable.
A generalised model would just have made it easier to do model comparison with p and q of different lengths.
I’ll look into dlnm, thanks.