Advice on fitting state-space with observational error

I have a time-series with a state space description of:

\frac{dI}{dt} = J + k( (S B) - C)

J, k, S, B, C are all time varying and observed with some known measurement error.

\hat{C} \sim N(C, \sigma_C) where \sigma_C is specified as data varying with the observations \hat{C}.

k also varies quite a lot between time steps.

This time series is also fairly long, with many observations (~13,000).
So as expected this is very difficult for a naive implementation like:

transformed parameters {
  vector[N] I;
  vector[N-1] G;
  I[1] = init_I;
  for (i in 2:N) {
    G[i-1] = k[i-1] * ((S[i-1] * B[i-1]) - C[i-1]);
    I[i] = I[i-1] + ((G[i-1] + J[i-1]) * dt[i-1]) + process_error[i-1];
  }
}
model {
  // priors
  J ~ normal(tau, sigma);
  tau ~ normal(0, 100);
  sigma ~ lognormal(0, 0.1);
  process_error ~ normal(0, 0.1);

  // initial condition prior
  init_I ~ normal(0, I_obs_error[1]);

  // measurements
  C_obs ~ normal(C, C_obs_error);
  I_obs ~ normal(I, I_obs_error);
}

I believe I need a Kalman filter + smoother (or similar) to make this something Stan can fit.
I don’t think the built-in gaussian_dlm_obs sampling statement can work here, but I may be mistaken.
I’m aware of the work of jarnold and I’m happy to have a go at implementing the filter myself. Alternatively, I know ctsem exists and may work here.

My questions are:
Is a Kalman filter likely to be a good solution?
If so, do the measured variables (k, C, etc.) need to be treated as state variables in the filter, or just I?

As an aside, am I right in thinking that when implemented in Stan this is a form of the ensemble Kalman filter?

Many Thanks

Whether some form of Kalman filter is a ‘good’ solution will depend a lot on your requirements, your specific implementation, and the extent of non-linearity between measurements. Without knowing more, I’d guess there’s a decent chance it’s a ‘viable’ solution. Certainly, getting Stan to efficiently sample the states directly, with the kind of multivariate, complex dependency structure you’ve shown, has always seemed pretty hard to me. The maths of a filter makes each evaluation computationally slower, but you get much more efficient parameter sampling, at the additional cost of whatever statistical approximations the filter involves.
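To make that concrete: if the transition and observation are linear and Gaussian given the covariates, you can marginalise the latent I out entirely with a univariate Kalman filter, so Stan only samples the hyper-parameters rather than ~13,000 states. A rough sketch (the name sigma_proc and the plug-in use of the observed covariates in G are my assumptions, not your exact model):

```stan
model {
  // filtered mean and variance of the latent I, initialised from
  // the prior used in the original model
  real m = 0;
  real P = square(I_obs_error[1]);
  for (i in 1:N) {
    if (i > 1) {
      // predict step: Euler transition with plug-in drift G + J
      m += (G[i-1] + J[i-1]) * dt[i-1];
      P += square(sigma_proc);  // per-step process noise variance
    }
    // update step: fold in the observation I_obs[i]
    real innov_var = P + square(I_obs_error[i]);
    real gain = P / innov_var;
    target += normal_lpdf(I_obs[i] | m, sqrt(innov_var));  // marginal likelihood
    m += gain * (I_obs[i] - m);
    P *= (1 - gain);
  }
}
```

The states themselves can then be recovered after fitting with a backward (Rauch–Tung–Striebel) smoothing pass in generated quantities.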

If k and C are measured with error, and you want things to depend on ‘true’ k and C, you need to estimate the true states of k and C in some manner.
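The usual Stan pattern for that is a measurement sub-model: declare the ‘true’ series as parameters, give them observation likelihoods, and let everything downstream depend on the latent versions rather than the raw observations. A sketch (names like k_obs and k_obs_error are assumed):

```stan
parameters {
  vector[N] k;  // latent 'true' k
  vector[N] C;  // latent 'true' C
}
model {
  // measurement models linking observations to the latent states
  k_obs ~ normal(k, k_obs_error);
  C_obs ~ normal(C, C_obs_error);
  // ... k and C (not k_obs, C_obs) then enter the transition for I
}
```

This adds another 2N parameters, though, so for a long series you may prefer to fold k and C into the filter’s state vector instead.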

I understand the ensemble Kalman filter as propagating an ensemble of sampled states at every step, rather than an expectation and covariance, so that’s not quite what a Kalman filter implemented in Stan would be doing.