I have a time series with the state-space description:

\frac{dI}{dt} = J + k( (S B) - C)

J, k, S, B, C are all time varying and observed with some known measurement error.

e.g.

\hat{C} \sim N(C, \sigma_C), where \sigma_C is supplied as data and varies with the observations \hat{C}.

k also varies quite a lot between time steps.

This time series is also fairly long, with many observations (~13,000).

So, as expected, this is very difficult to fit with a naive implementation like:

```
transformed parameters {
  ...
  for (i in 2:N) {
    G[i-1] = k[i-1] * ((S[i-1] * B[i-1]) - C[i-1]);
    // note: process_error is indexed by i-1 (was a typo: N-1)
    I[i] = I[i-1] + ((G[i-1] + J[i-1]) * dt[i-1]) + process_error[i-1];
  }
}
model {
  // priors
  J ~ normal(tau, sigma);
  tau ~ normal(0, 100);
  sigma ~ lognormal(0, 0.1);
  process_error ~ normal(0, 0.1);
  // initial condition prior
  init_I ~ normal(0, I_obs_error[1]);
  // measurements
  C_obs ~ normal(C, C_obs_error);
  I_obs ~ normal(I, I_obs_error);
  ...
}
```
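For concreteness, here is a rough sketch of what I imagine the marginalised version might look like: a scalar Kalman filter that integrates I out of the likelihood, treating the drift term as a known input (i.e. plugging in k, S, B, C at their observed values rather than augmenting the state with them). The variable names (`q`, `m`, `P`, `K`) and the choice of scaling the process noise by `dt` are my assumptions, not anything from an existing implementation:

```
model {
  // Sketch only: scalar Kalman filter marginalising I out.
  // Assumption: drift u[t] uses observed k, S, B, C directly; the full
  // treatment would put those in the state vector as well.
  real q = square(0.1);             // process-noise variance (assumption)
  real m = 0;                       // prior mean of I[1]
  real P = square(I_obs_error[1]);  // prior variance of I[1]
  for (t in 1:N) {
    // measurement update with I_obs[t]
    real S_t = P + square(I_obs_error[t]);
    target += normal_lpdf(I_obs[t] | m, sqrt(S_t));
    real K = P / S_t;
    m += K * (I_obs[t] - m);
    P *= (1 - K);
    if (t < N) {
      // time update: Euler step with the drift as a known input
      real u = (J[t] + k[t] * (S[t] * B[t] - C[t])) * dt[t];
      m += u;
      P += q * dt[t];  // process noise scaled by step size (assumption)
    }
  }
}
```

With I marginalised out this way, only J (and the hyperparameters) would remain as sampled quantities, which is where I'd hope the speedup comes from.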

I believe I need a Kalman filter + smoother (or similar) to make this something Stan can fit.

I don’t think the built-in `gaussian_dlm_obs` sampling statement can work here, but I may be mistaken.
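For reference, my reading of the docs is that the signature assumes a time-homogeneous DLM, which seems to rule out my time-varying k:

```
// My reading of the Stan docs: F (design), G (transition), V (observation
// covariance) and W (system covariance) are fixed over time, so a k[t]
// that changes every step doesn't fit this statement directly.
y ~ gaussian_dlm_obs(F, G, V, W, m0, C0);
```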

I’m aware of the work of jarnold, and I’m happy to give implementing the filter a go. Alternatively, I know `ctsem` exists and may work here.

My questions are:

Is a Kalman filter likely to be a good solution?

If so, do the measured variables (k, C, etc.) need to be treated as state variables in the filter, or just I?

As an aside, am I right in thinking that, when implemented in Stan, this is a form of the ensemble Kalman filter?

Many thanks