Hi all,
I have two observed time series (Y1 and Y2) that I am trying to fit an ODE model to. The time series are at an hourly time step over multiple days. I use the Euler method to get the estimated values (Y1_hat and Y2_hat), and each time series has its own observation error; I can get that part to work fine. However, I also want to add autocorrelation of the errors to the model, and I am having a lot of trouble figuring that out. A bare-bones version of the model is shown below. (FYI: I also use hierarchical hyperpriors for one of the parameters, but that is not relevant to the question.)
functions {
  vector Y1Y2_ODE(vector y, real A, real B, real C) {
    // Unpack states
    real Y1 = y[1];
    real Y2 = y[2];
    // Difference equations
    vector[2] dydt;
    dydt[1] = (A - B*C + C^2); // Derivative of Y1
    dydt[2] = (B*A + C - A^2); // Derivative of Y2
    return dydt;
  }
}
data {
  int<lower=1> n_days;           // Number of days
  int<lower=0> n_obs;            // Total number of observations (24 * n_days)
  int<lower=0, upper=1> Y1_lik;  // =1 if Y1 enters the log-likelihood; else 0
  int<lower=0, upper=1> Y2_lik;  // =1 if Y2 enters the log-likelihood; else 0
  real<lower=0> dt;              // Euler time step
  vector[2] y0_init;             // Initial states
  vector[n_obs] Y1_obs;          // Observed Y1 data
  vector[n_obs] Y2_obs;          // Observed Y2 data
}
parameters {
  // Observation errors, unscaled
  array[Y1_lik ? 1 : 0] real<lower=0> sigma_Y1; // Y1 obs. error sd
  array[Y2_lik ? 1 : 0] real<lower=0> sigma_Y2; // Y2 obs. error sd
  // Parameters to estimate (one value per day)
  vector<lower=0>[n_days] A;
  vector<lower=0>[n_days] B;
  vector<lower=0>[n_days] C;
}
transformed parameters {
  vector[n_obs] Y1_hat;          // Euler estimates of Y1
  vector[n_obs] Y2_hat;          // Euler estimates of Y2
  vector[2] y_hat = y0_init;     // init. Y1 and Y2
  for (d in 1:n_days) {
    for (i in 1:24) {
      int t = i + 24 * (d - 1);  // time counter
      if (t > 1) {               // for each time after the initial conditions, take an Euler step
        y_hat += Y1Y2_ODE(y_hat, A[d], B[d], C[d]) * dt;
      }
      Y1_hat[t] = y_hat[1];
      Y2_hat[t] = y_hat[2];
    }
  }
}
model {
  // Prior distributions of parameters across days
  A ~ normal(0, 10);
  B ~ normal(0, 10);
  C ~ lognormal(log(10), 1);
  // Likelihood for observed data
  if (Y1_lik) {
    sigma_Y1[1] ~ normal(1, 0.3);
    Y1_obs ~ normal(Y1_hat, sigma_Y1[1]);
  }
  if (Y2_lik) {
    sigma_Y2[1] ~ normal(1, 0.3);
    Y2_obs ~ normal(Y2_hat, sigma_Y2[1]);
  }
}
generated quantities {
  vector[n_obs] F2;
  for (d in 1:n_days) {
    for (i in 1:24) {
      int t = i + 24 * (d - 1); // time counter
      // calculate F2
      F2[t] = Y2_hat[t] * A[d] + B[d];
    }
  }
}
To do this in MATLAB, say, I could define the following errors (just for Y1, for example), where \rho_{Y1} is the autocorrelation coefficient:
Err_{Y1} = \hat{Y1}(1:N) - Y1_{obs}(1:N)
Err_{AR1,Y1} = Err_{Y1}(2:N) - \rho_{Y1} * Err_{Y1}(1:N-1)
And then add the autocorrelation directly to the log-likelihood:
LogY1 = -\frac{N}{2} \log(2\pi) - \frac{N}{2} \log\left(\frac{\sigma_{Y1}^2}{1-\rho_{Y1}^2}\right) - \frac{1}{2}(1-\rho_{Y1}^2)\left(\frac{Err_{Y1}(1)}{\sigma_{Y1}}\right)^2 - \frac{1}{2}\sum_{t=1}^{N-1}\left(\frac{Err_{AR1,Y1}(t)}{\sigma_{Y1}}\right)^2
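My rough attempt at translating that directly into Stan is below; it would sit inside the model block, replacing the plain Y1_obs ~ normal(Y1_hat, sigma_Y1[1]) statement. To be clear, this is only a sketch and I am not sure it is right: it assumes an extra parameter real<lower=-1, upper=1> rho_Y1 in the parameters block, introduces a local residual vector err_Y1, treats sigma_Y1[1] as the innovation standard deviation, and draws the first residual from the stationary distribution.
  // Sketch only: AR(1) observation errors for Y1, mirroring the MATLAB
  // expressions above. Assumes real<lower=-1, upper=1> rho_Y1 is declared
  // in the parameters block; sigma_Y1[1] is the innovation sd.
  vector[n_obs] err_Y1 = Y1_obs - Y1_hat;  // residuals
  // first residual from the stationary distribution of the AR(1) process
  target += normal_lpdf(err_Y1[1] | 0, sigma_Y1[1] / sqrt(1 - rho_Y1^2));
  // remaining residuals conditional on the previous residual
  target += normal_lpdf(err_Y1[2:n_obs] | rho_Y1 * err_Y1[1:(n_obs - 1)], sigma_Y1[1]);
As far as I can tell this matches the MATLAB log-likelihood term by term, except possibly for the normalising constant, but I may have the scaling of sigma_Y1 wrong.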
I have not been able to work out whether that is the right way to do it in Stan, despite searching all over the web and reading the autoregressive section of the manual. Can anyone please show me the best way to incorporate this into the model? I asked ChatGPT how to do it, and it gave me the following option, which I do not think is correct:
target += normal_lpdf(Y1_obs[t] | Y1_hat[t], sigma_Y1 * sqrt(1 - rho_Y1^2))
+ normal_lpdf(Y1_obs[t] - Y1_hat[t] | 0, sigma_Y1 * rho_Y1);
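The closest alternative I could come up with myself folds the lagged residual into the mean, with the observations on the left-hand side (again only a sketch, under the same assumptions about rho_Y1 and sigma_Y1[1]); I do not know whether it is equivalent to the MATLAB likelihood above:
  // Sketch only: same AR(1) error structure written with the observations
  // on the left-hand side (same assumed rho_Y1 parameter as above).
  Y1_obs[1] ~ normal(Y1_hat[1], sigma_Y1[1] / sqrt(1 - rho_Y1^2));
  for (t in 2:n_obs) {
    Y1_obs[t] ~ normal(Y1_hat[t] + rho_Y1 * (Y1_obs[t - 1] - Y1_hat[t - 1]),
                       sigma_Y1[1]);
  }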
What is the correct way to do this? I would also appreciate any thoughts on how to improve code efficiency.
Thank you in advance for your help!