# Implementing an autoregressive distributed lag (ADL) model

I am trying to implement an autoregressive distributed lag (ADL) model, something to the effect of this:

$$y_{t} = \alpha + \sum\limits_{j=1}^{p}\phi_{j} y_{t-j} + \sum\limits_{j=0}^{q_1}\beta_{1,j} x_{1,t-j} + \dots + \sum\limits_{j=0}^{q_k}\beta_{k,j} x_{k,t-j} + \epsilon_{t}$$

It’s commonly used in economics and ecology, but I haven’t found a Stan example of it yet.

I’m somewhat stuck on what would be best to do after I add the AR(p) component, and I’ve had a few attempts without success. This is the code I’ve been using as my base model, generated by brms; I’ll explain what I’ve tried to do to it:

```stan
data {
  int<lower=1> N;  // number of observations
  vector[N] Y;  // response variable
  int<lower=1> K;  // number of population-level effects
  matrix[N, K] X;  // population-level design matrix
  // data needed for ARMA correlations
  int<lower=0> Kar;  // AR order
  int<lower=0> Kma;  // MA order
  // number of lags per observation
  int<lower=0> J_lag[N];
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
  int Kc = K - 1;
  matrix[N, Kc] Xc;  // centered version of X without an intercept
  vector[Kc] means_X;  // column means of X before centering
  int max_lag = max(Kar, Kma);
  for (i in 2:K) {
    means_X[i - 1] = mean(X[, i]);
    Xc[, i - 1] = X[, i] - means_X[i - 1];
  }
}
parameters {
  vector[Kc] b;  // population-level effects
  // temporary intercept for centered predictors
  real Intercept;
  real<lower=0> sigma;  // residual SD
  vector[Kar] ar;  // autoregressive effects
}
model {
  // initialize linear predictor term
  vector[N] mu = Intercept + Xc * b;
  // objects storing residuals
  matrix[N, max_lag] Err = rep_matrix(0, N, max_lag);
  vector[N] err;
  // include ARMA terms
  for (n in 1:N) {
    err[n] = Y[n] - mu[n];
    for (i in 1:J_lag[n]) {
      Err[n + 1, i] = err[n + 1 - i];
    }
    mu[n] += head(Err[n], Kar) * ar;
  }
  // priors including all constants
  target += student_t_lpdf(Intercept | 3, 0, 10);
  target += student_t_lpdf(sigma | 3, 0, 10)
    - 1 * student_t_lccdf(0 | 3, 0, 10);
  // likelihood including all constants
  if (!prior_only) {
    target += normal_lpdf(Y | mu, sigma);
  }
}
generated quantities {
  // actual population-level intercept
  real b_Intercept = Intercept - dot_product(means_X, b);
}
```



When including Stan code in your post it really helps if you make it as readable as possible by using Stan code chunks (```stan) with clear spacing and indentation. For example, use

```stan
model {
  vector[N] mu = alpha + beta * x;
  y ~ normal(mu, sigma);
}
```

instead of

```stan
model{
vector[N] mu = alpha+beta*x;
y~normal(mu,sigma);
}
```

To include mathematical notation in your post, put LaTeX syntax between two $ symbols, e.g., $p(\theta | y) \propto p(\theta) p(y | \theta)$.


My strategy was to get the AR(p) component from this:

$$y_{t} = \alpha + \sum\limits_{j=1}^{p}\phi_{j} y_{t-j} + \epsilon_{t}$$

then repeat the sum operation for the lagged exogenous regressors, substituting \phi with \beta and reindexing with a `max_lag_beta = max_lag + 1`, an `err_beta`, etc. I’m a little lost when it comes to how to change this:

```stan
for (n in 1:N) {
  err[n] = Y[n] - mu[n];
  for (i in 1:J_lag[n]) {
    Err[n + 1, i] = err[n + 1 - i];
  }
  mu[n] += head(Err[n], Kar) * ar;
}
```
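For the direct formulation of the ADL equation (AR terms on the lagged y itself, unlike the brms code above, which puts the AR structure on the residuals), one sketch is to condition on the first max(p, q) observations and loop over the rest. All names here (p, q, phi, beta) come from the equation rather than from brms output, and this is an untested sketch, not a drop-in replacement:

```stan
data {
  int<lower=1> N;
  int<lower=1> K;         // number of exogenous series
  int<lower=1> p;         // AR order
  int<lower=0> q;         // common distributed-lag order for all k
  vector[N] Y;
  matrix[N, K] X;
}
parameters {
  real alpha;
  vector[p] phi;
  matrix[K, q + 1] beta;  // beta[k, j + 1] multiplies x_{k, t - j}
  real<lower=0> sigma;
}
model {
  // condition on the first max(p, q) observations
  for (t in (max(p, q) + 1):N) {
    real mu = alpha;
    for (j in 1:p) mu += phi[j] * Y[t - j];
    for (k in 1:K)
      for (j in 0:q) mu += beta[k, j + 1] * X[t - j, k];
    Y[t] ~ normal(mu, sigma);
  }
}
```

A single common lag order q is assumed here for readability; per-regressor orders q_k would need a ragged indexing scheme instead of the rectangular `beta` matrix.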


Would the lagged effects need to be included at the initialisation stage?


```stan
// initialize linear predictor term
vector[N] mu = Intercept + Xc * b;
```
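If the lagged exogenous terms are carried in through a precomputed design matrix, they can indeed enter at the initialisation line, leaving the AR residual loop untouched. A sketch, where `X_lag` and `b_lag` are hypothetical names for the lagged columns and their coefficients (not brms output):

```stan
// hypothetical: X_lag stacks x_{k,t}, x_{k,t-1}, ..., x_{k,t-q} for each
// regressor k, and b_lag holds the corresponding beta_{k,j} coefficients
vector[N] mu = Intercept + Xc * b + X_lag * b_lag;
```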


My next strategy was to nest a modified version of the AR(p)-like section within a for loop over k of the M * N matrix, to compute

$$\sum\limits_{j=0}^{q_1}\beta_{1,j} x_{1,t-j} + \dots + \sum\limits_{j=0}^{q_k}\beta_{k,j} x_{k,t-j}$$

This would model the errors for k = 1, ..., K as being iid, but they clearly follow a warped normal distribution. Again, I’m unsure how to go about doing this, and I suspect it would be a really inefficient/computationally expensive parameterisation.
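If the nested-sum version does prove expensive, one alternative sketch is to flatten all K * (q + 1) lagged columns into a single matrix in transformed data, so the predictor becomes one matrix-vector product. `X_flat` and `b_flat` are made-up names, and the early observations are zero-padded here purely for illustration (in practice they would be dropped or conditioned on):

```stan
transformed data {
  // flatten the K * (q + 1) lagged columns into one design matrix
  matrix[N, K * (q + 1)] X_flat;
  for (k in 1:K)
    for (j in 0:q)
      for (t in 1:N)
        X_flat[t, (k - 1) * (q + 1) + j + 1] = (t > j) ? X[t - j, k] : 0;
}
parameters {
  vector[K * (q + 1)] b_flat;  // stacked beta_{k,j}
  // ...
}
model {
  vector[N] mu = alpha + X_flat * b_flat;
  // ... AR part and likelihood as before
}
```

The one big matrix-vector product lets Stan share work across coefficients, which is usually much cheaper than scalar accumulation in a triple loop.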

@mansillo I wonder whether you can generate design matrices that represent the autoregressive distributed lag effects using the dlnm R package, which includes functions to generate design matrices for distributed lag (linear and nonlinear) models.

If so, then your Stan model might be considerably simpler.

@mbjoseph my (temporary) way around this has been to make a lagged design matrix in R and then append it to the stationary design matrix. That was simple, but I am also concerned that having the lag(s) will induce enough multicollinearity that the estimates could be unreliable.
Having a generalised model would just have made it easier to do model comparison with p and q of different lengths.
I’ll look into dlnm, thanks.