State space model and rejection proposals

Hi, I was trying to implement a simpler version of this state space model

http://www.juhokokkala.fi/blog/posts/kalman-filter-style-recursion-to-marginalize-state-variables-to-speed-up-stan-inference/

the model is just random walk with noise marginalized using the kalman filter:


// _test.stan
data {
  int<lower=0> N;
  vector[N] y;
  real P0;
  real x0;
}
parameters {
  real<lower=0> sigma_y; // process error
  real<lower=0> sigma_x; // observation error
}
transformed parameters {
  vector[N] x; //predicted mean x_t|y_1:t-1
  vector[N] m; //filtered mean x_t|y_1:t
  vector[N] S; //measurement variance, y_t|y_1:t-1
  vector[N] P_pred; //predicted var x_t|y_1:t-1

  {
    vector[N] P; //filtered var x_t|y_1:t
    real R; //measurement variance,  y_t|x_t
    real K;  //Kalman gain, depends on t but not stored

    //Filtering 
    R = sigma_y * sigma_y;
 
    x[1] = x0;
    P_pred[1] = P0;

    for (t in 1:N) {
        //Prediction step
        if (t>1) {
            x[t] = m[t-1];
            P_pred[t] = P[t-1] + sigma_x*sigma_x;
        }
        print(t, " ", x[t]);
        //Update step
        S[t] = P_pred[t] + R;
        K = P_pred[t]/S[t];
        m[t] = x[t] + K*(y[t] - x[t]);   //The measurement is just noise added to signal, so mu==m_pred
        P[t] = P_pred[t] - K*S[t]*K;
    }        
  }
}
model {
  sigma_x ~ cauchy(0,1);
  sigma_y ~ cauchy(0,1);
  y ~ normal(x, sqrt(S));
}


I feed the model the following data


    par(mfrow=c(2,1), mar=c(2,2,1,0.5))
    N <- 100
    x <- rep(0, N)
    y <- rep(0, N)
    y[1] <- x[1]
    sigma_x <- 0.25
    sigma_y <- 0.3
    for(i in 2:N) {
      x[i] <- x[i-1] + rnorm(1, 0, sigma_x)
      y[i] <- x[i] + rnorm(1, 0, sigma_y)
    }
    plot(x, type="l", col="red"); grid()
    points(y, pch=16, col="blue")
 
m <- cmdstanr::cmdstan_model("_test.stan") 
fit <- m$sample(data=list(N=length(y), y=y, x0=y[1], P0=1), chains=1, iter_warmup = 250, iter_sampling = 250)

when I run the sampler everything goes fine and the parameters are correctly estimated. However, at the beginning of the sampling I get the warnings:

Chain 1 Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:
Chain 1 Exception: normal_lpdf: Location parameter[3] is -nan, but must be finite! (in ‘/tmp/RtmpkkXaSX/model-16531619875d.stan’, line 46, column 2 to column 25)
Chain 1 If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
Chain 1 but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

I cannot quite get where the problem is, I guess it is related with the initial values or something…

thank you very much

If it otherwise looks fine and that’s the only issue, it is probably related to the initial conditions or some low probability region where the chain is wondering into a parameter region that has invalid values (e.g. zero or negative variance).
If the actual sampled values are reasonable and you didn’t get a lot of that (as mentioned in the error), which would affect performance and possibly divergences, it should be fine.

1 Like

If this is something you are interested in going in details, there is a coded-up function in Stan for this type of models: 25.8 Gaussian dynamic linear models | Stan Functions Reference

The problem, for me, is that I don’t quite understand how to set up each argument for the functions. Good luck.

1 Like

From my understanding, the error appears to hint that x[3] is -nan, so something is probably going wrong with your m[2]. In my experience nan is often produced by bad division. So I would guess that the culprit is:

K = P_pred[t]/S[t]; // with t==2

which progresses to:

m[t] = x[t] + K*(y[t] - x[t]); // again t==2

adding:

vector<lower=0>[N] S;

might help but I am not very sure if it will fix it.

Unfortunately, S is a transformed parameter so constraining it does no help, I only get: “S… is -0.00797864, but must be greater than or equal to 0”

Ok, but you are on to something. I would add the same constraints to

vector[N] P_pred;
// ...
real R;

Since, indeed S<0 assuming you do not allow imaginary standard deviation. Note that Stan uses sigma in their Normal parameterization and not sigma^2:

So I am not sure if you need the sqrt? (but this is outside the scoop of the error)

You can constrain transformed parameters – you can constrain any variables in a Stan program. Ideally it shouldn’t be necessary because the constrains on the original variable should carry onto it, but it could happen. Also, the constrains may not apply to numerical tolerances and you may get a tiny negative value that would be zero for all practical purposes, but that will still raise an error by a function that doesn’t accept negative values – a kludge is to just eliminate these values within a tolerance to make them actually zero (in the case of a poisson distribution, for instance, even zeros itself would raise an error).

So sometimes things don’t work as neatly as expected and a workaround is needed, but if this is not halting the chain or creating major problems it may be just as well to leave it and have a few more rejections along the way.

Thanks, I did not now about gaussian_dlm_obs, it seems to do the Kalman filter just like in the model I copied above.

Curiously enough I get the same error, but it occurs inside the stan function:

Exception: invese_spd: matrix not positive definite

Again, no problem with the fitting and parameters estimation, it’s only annoying warnings. I guess therefore it is something releated with the initial value and/or some sort of numerical approximation.