Hi everyone,

we fit the following linear model:

y_t = \alpha + Stock_t \beta + CV_t \gamma + \epsilon_t

with the stock recursion

Stock_{s,t} = \lambda_s \, Stock_{s,t-1} + Flow_{s,t}, \qquad 0 < \lambda_s < 1,

and AR(1) errors

\epsilon_t = \theta \epsilon_{t-1} + \eta_t

We observe S flow variables (Flow) and D control variables (CV); the stock variables themselves are latent.
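To make the data-generating process concrete, here is a small numpy simulation of the model above. All parameter values are arbitrary placeholders, and the stocks are initialized at the first flow for simplicity (the Stan code below instead uses a steady-state initialization):

```python
import numpy as np

rng = np.random.default_rng(1)
N, S, D = 200, 2, 3

# Arbitrary illustrative parameters (not estimates)
alpha, theta, sigma = 1.0, 0.5, 0.3
beta = rng.normal(size=S)
gamma = rng.normal(size=D)
lam = rng.uniform(0.3, 0.9, size=S)  # carryover per flow variable

flow = rng.normal(size=(N, S))
cv = rng.normal(size=(N, D))

# Stock recursion: Stock[s,t] = lambda_s * Stock[s,t-1] + Flow[s,t]
stock = np.zeros((N, S))
stock[0] = flow[0]
for t in range(1, N):
    stock[t] = lam * stock[t - 1] + flow[t]

# AR(1) errors: eps_t = theta * eps_{t-1} + eta_t
eta = rng.normal(scale=sigma, size=N)
eps = np.zeros(N)
eps[0] = eta[0]
for t in range(1, N):
    eps[t] = theta * eps[t - 1] + eta[t]

y = alpha + stock @ beta + cv @ gamma + eps
```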

The Stan code I run is the following:

```
data {
  int<lower=1> N;            // number of obs
  int<lower=1> S;            // number of flow/stock variables
  int<lower=1> D;            // number of control variables
  matrix[N, S] flow;         // flow variables
  matrix[N, D] cv;           // control variables
  vector[N] y;               // outcome
}
transformed data {
  int<lower=0> ar_lag[N];    // number of AR lags available at each obs
  for (n in 1:(N - 1)) {
    ar_lag[n] = 1;
  }
  ar_lag[N] = 0;             // no Err[N + 1] exists, so skip the last obs
}
parameters {
  // coefficients
  real alpha;                          // intercept
  vector[S] beta;                      // coefficients, stock variables
  vector[D] gamma;                     // coefficients, control variables
  real<lower=0, upper=1> ar;           // autocorrelation coefficient
  vector<lower=0, upper=1>[S] lambda;  // carryover coefficients
  // scales
  real<lower=0> sigma_y;               // scale for outcome
}
model {
  matrix[N, S] stock;
  vector[N] mu;
  vector[N] Err;
  vector[N] err;
  // state equation; initialize at the steady state implied by
  // the mean of the first 52 flows
  for (s in 1:S) {
    stock[1, s] = mean(flow[1:52, s]) / (1 - lambda[s]);
    for (t in 2:N) {
      stock[t, s] = lambda[s] * stock[t - 1, s] + flow[t, s];
    }
  }
  // mu for model
  mu = alpha + stock * beta + cv * gamma;
  // include AR terms
  Err = rep_vector(0, N);
  for (n in 1:N) {
    err[n] = y[n] - mu[n];
    for (i in 1:ar_lag[n]) {
      Err[n + 1] = err[n + 1 - i];
    }
    mu[n] += ar * Err[n];
  }
  // priors
  // coefficients
  alpha ~ student_t(4, 0, 10);
  beta ~ student_t(4, 0, 10);
  gamma ~ student_t(4, 0, 10);
  ar ~ beta(4, 4);
  lambda ~ beta(4, 4);
  // scale
  sigma_y ~ student_t(4, 0, 10);
  // likelihood
  y ~ normal(mu, sigma_y);
}
```
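In case the AR loop is hard to follow: with `ar_lag` as constructed, it adds `ar * (y[n-1] - mu[n-1])` to `mu[n]` for `n >= 2`, where `mu` is the regression mean before the AR adjustment. A minimal numpy sketch of that computation (toy numbers, purely for illustration):

```python
import numpy as np

def add_ar_term(y, mu_raw, ar):
    """Replicate the Stan loop: mu[n] += ar * (y[n-1] - mu_raw[n-1])
    for n >= 2, where mu_raw is the mean before the AR adjustment."""
    err = y - mu_raw          # residuals against the raw mean
    mu = mu_raw.copy()
    mu[1:] += ar * err[:-1]   # lag-1 AR correction; first obs unchanged
    return mu

# Toy check with arbitrary numbers
y = np.array([1.0, 2.0, 1.5, 2.5])
mu_raw = np.array([0.8, 1.9, 1.6, 2.2])
mu = add_ar_term(y, mu_raw, ar=0.5)
```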

Unfortunately, we cannot get this model to sample properly, even with higher settings for max_treedepth (e.g. max_treedepth = 17). At max_treedepth = 17 the model already took several hours to run and still produced max_treedepth warnings.

I already simplified the model as far as possible by ignoring the latent stock variables (i.e. I computed the stock variables before fitting, with lambda set to a fixed value) and by dropping the autocorrelated error. Even then, the model always ends with max_treedepth warnings, with most transitions reaching max_treedepth (something like 3,800 out of 4,000) and extremely low n_eff (e.g. n_eff = 2).

After that, I tried the QR decomposition, which immediately solved the problem. For the model with manifest (precomputed) stock variables, the code was the following:

```
data {
  int<lower=1> N;            // number of obs
  int<lower=1> S;            // number of flow/stock variables
  int<lower=1> D;            // number of control variables
  matrix[N, S] flow;         // flow variables
  matrix[N, D] cv;           // control variables
  vector[N] y;               // outcome
}
transformed data {
  matrix[N, S] stock;
  matrix[N, S + D] X;        // final design matrix
  matrix[N, S + D] Q;
  matrix[S + D, S + D] R;
  matrix[S + D, S + D] R_inv;
  int<lower=0> ar_lag[N];
  // stock variables with lambda fixed to 0.52
  for (s in 1:S) {
    stock[1, s] = mean(flow[1:52, s]) / (1 - 0.52);
    for (t in 2:N) {
      stock[t, s] = 0.52 * stock[t - 1, s] + flow[t, s];
    }
  }
  X = append_col(stock, cv);
  // thin, scaled QR decomposition
  Q = qr_Q(X)[, 1:(S + D)] * sqrt(N - 1);
  R = qr_R(X)[1:(S + D), ] / sqrt(N - 1);
  R_inv = inverse(R);
  for (n in 1:(N - 1)) {
    ar_lag[n] = 1;
  }
  ar_lag[N] = 0;
}
parameters {
  // coefficients
  real alpha;                // intercept
  vector[S + D] beta_tilde;  // coefficients on Q
  real<lower=0, upper=1> ar;
  // scales
  real<lower=0> sigma_y;     // scale for outcome
}
transformed parameters {
  vector[S + D] beta;        // coefficients on the original scale
  beta = R_inv * beta_tilde;
}
model {
  vector[N] mu;
  vector[N] Err;
  vector[N] err;
  // mu for model
  mu = alpha + Q * beta_tilde;
  // include AR terms
  Err = rep_vector(0, N);
  for (n in 1:N) {
    err[n] = y[n] - mu[n];
    for (i in 1:ar_lag[n]) {
      Err[n + 1] = err[n + 1 - i];
    }
    mu[n] += ar * Err[n];
  }
  // priors
  // coefficients
  alpha ~ student_t(4, 0, 10);
  beta ~ student_t(4, 0, 10);
  ar ~ beta(4, 4);
  // scale
  sigma_y ~ student_t(4, 0, 10);
  // likelihood
  y ~ normal(mu, sigma_y);
}
```
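The scaled QR reparameterization used above can be checked outside Stan. A small numpy sketch on toy data, showing that `Q * beta_tilde` with `beta = R_inv * beta_tilde` reproduces `X * beta` (the sqrt(N-1) scaling cancels in the product):

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 100, 5
X = rng.normal(size=(N, K))  # toy design matrix

# Thin QR with the same scaling as in the Stan code
Q, R = np.linalg.qr(X, mode="reduced")
Q = Q * np.sqrt(N - 1)
R = R / np.sqrt(N - 1)
R_inv = np.linalg.inv(R)

beta_tilde = rng.normal(size=K)
beta = R_inv @ beta_tilde

# Both parameterizations give the same linear predictor
assert np.allclose(Q @ beta_tilde, X @ beta)
```

This is why the sampler likes it: the columns of the scaled `Q` are orthogonal with unit sample scale, so `beta_tilde` is much less correlated in the posterior than `beta`.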

The problem now, of course, is that we still want to estimate the stock variables as latent quantities, and I have no idea how to implement the QR decomposition correctly in that case, since it can only be applied to the observed flow variables. One attempt was the following:

```
data {
  int<lower=1> N;            // number of obs
  int<lower=1> S;            // number of flow/stock variables
  int<lower=1> D;            // number of control variables
  matrix[N, S] flow;         // flow variables
  matrix[N, D] cv;           // control variables
  vector[N] y;               // outcome
}
transformed data {
  matrix[N, S + D] Q;
  matrix[S + D, S + D] R;
  matrix[S + D, S + D] R_inv;
  int<lower=0> ar_lag[N];
  // thin, scaled QR decomposition of the flows and controls
  Q = qr_Q(append_col(flow, cv))[, 1:(S + D)] * sqrt(N - 1);
  R = qr_R(append_col(flow, cv))[1:(S + D), ] / sqrt(N - 1);
  R_inv = inverse(R);
  for (n in 1:(N - 1)) {
    ar_lag[n] = 1;
  }
  ar_lag[N] = 0;
}
parameters {
  // coefficients
  real alpha;                // intercept
  vector[S + D] beta_tilde;  // coefficients on Q
  real<lower=0, upper=1> ar;
  vector<lower=0, upper=1>[S] lambda;
  // scales
  real<lower=0> sigma_y;     // scale for outcome
}
transformed parameters {
  vector[S + D] beta;        // coefficients on the original scale
  beta = R_inv * beta_tilde;
}
model {
  matrix[N, S] stock;
  matrix[N, S + D] Q_final;  // final Q matrix
  vector[N] mu;
  vector[N] Err;
  vector[N] err;
  // carryover recursion applied to the orthogonalized flow columns
  for (s in 1:S) {
    stock[1, s] = mean(Q[1:52, s]) / (1 - lambda[s]);
    for (t in 2:N) {
      stock[t, s] = lambda[s] * stock[t - 1, s] + Q[t, s];
    }
  }
  Q_final = append_col(stock, Q[, (S + 1):(S + D)]);
  // mu for model
  mu = alpha + Q_final * beta_tilde;
  // include AR terms
  Err = rep_vector(0, N);
  for (n in 1:N) {
    err[n] = y[n] - mu[n];
    for (i in 1:ar_lag[n]) {
      Err[n + 1] = err[n + 1 - i];
    }
    mu[n] += ar * Err[n];
  }
  // priors
  // coefficients
  alpha ~ student_t(4, 0, 10);
  beta ~ student_t(4, 0, 10);
  ar ~ beta(4, 4);
  lambda ~ beta(4, 4);
  // scale
  sigma_y ~ student_t(4, 0, 10);
  // likelihood
  y ~ normal(mu, sigma_y);
}
```

Even though this model runs without any warnings and the results look face-valid at first glance, the approach applies the carryover recursion to the orthogonalized Q columns rather than to the flows themselves, so it ignores the latent structure of the stock variables and therefore doesn't look right.
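A quick numpy check on toy data makes the concern concrete. Because QR mixes the flow columns before the carryover recursion is applied, running the recursion on the `Q` columns (as in the attempt above) does not match the intended model on the original flows once the coefficients are reconverted via `R_inv`, unless all the `lambda_s` happen to be equal. (For simplicity this sketch initializes the stock at the first row and drops the sqrt(N-1) scaling, which cancels in the products; the lambdas are arbitrary.)

```python
import numpy as np

def stock_from(x, lam):
    """Apply the carryover recursion column-wise: s[t] = lam * s[t-1] + x[t]."""
    s = np.zeros_like(x)
    s[0] = x[0]
    for t in range(1, len(x)):
        s[t] = lam * s[t - 1] + x[t]
    return s

rng = np.random.default_rng(2)
N, S = 60, 2
flow = rng.normal(size=(N, S))
lam = np.array([0.4, 0.7])           # two distinct carryover rates
beta_tilde = rng.normal(size=S)

Q, R = np.linalg.qr(flow, mode="reduced")

# What the attempt above computes vs. what the intended model implies
lhs = stock_from(Q, lam) @ beta_tilde                        # recursion on Q columns
rhs = stock_from(flow, lam) @ (np.linalg.inv(R) @ beta_tilde)  # intended model
print(np.allclose(lhs, rhs))  # generally False for distinct lambdas
```

With a shared lambda the recursion is the same linear operator on every column and commutes with the column mixing, so the two sides would agree; with distinct lambdas they do not.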

Does anyone have an idea how to implement the QR decomposition correctly in this setting, and how to reconvert the coefficient estimates?

Thanks!