I do not succeed to get reasonable sampling when a cyclic component is added to a state-space model. The same model and data using PyMC provides excellent sampling. Maybe that my code for a cyclic component is wrong/sub-optimal. Any suggestion would be appreciated.
The code below implements a combination of integrated random walk with a cycle of known frequency.

data{
int<lower=1> N; // Time series length
vector[N] y; // Observations
real omega; // normalized angular frequency [rad] of the cycle
}
parameters{
real slope_ini;
real level_ini;
real cycle_ini;
real conj_cycle_ini;
vector[N-1] slope;
vector[N] cycle;
vector[N] conj_cycle;
real<lower=0> sigma_slope;
real<lower=0> sigma_irreg;
real<lower=0> sigma_cycle;
}
transformed parameters {
vector[N] level;
level[1] = level_ini;
for (n in 2:N) {
level[n] = level[n-1] + slope[n-1];
}
}
model{
// observation equations
y[1:N] ~ normal(level[1:N] + cycle[1:N], sigma_irreg);
// slope equations
slope[1] ~ normal(slope_ini, sigma_slope);
for (n in 2:N-1) {
slope[n] ~ normal(slope[n-1], sigma_slope);
}
// cycle equations
cycle[1] ~ normal(cycle_ini, sigma_cycle);
conj_cycle[1] ~ normal(conj_cycle_ini, sigma_cycle);
for (n in 2:N) {
cycle[n] ~ normal(cos(omega)*cycle[n-1] + sin(omega)*conj_cycle[n-1], sigma_cycle);
conj_cycle[n] ~ normal(-sin(omega)*cycle[n-1] + cos(omega)*conj_cycle[n-1], sigma_cycle);
}
}
```stan

Can you be more specific? Were there divergence warnings or low effective sample size or multiple chains failing to converge?

It would help if you could share the working PyMC code and the calls used to fit it. Were you using anything in PyMC or Stan other than random initialized NUTS? If you were just using PyMCâ€™s NUTS implementation, my guess would be that itâ€™s not the same model being implemented.

The main issue I see here computationally is that youâ€™re using a centered parameterization of a hierarchical model for cycle and conj_cycle. My guess is that your program will mix much better with a non-centered parameterization, which only requires a change in declaration (and a recent enough version of Stan).

and same for conj_cycle. Itâ€™d be more efficient still to define cycle and conj_cycle to be of length N-1 so this doesnâ€™t require any slicing and then cycle[1] and conj_cycle[1] can be scalar parameters.

Precompute cos(omega) and sin(omega) and store as transformed data variables and reuse.

transformed data {
real cos_omega = cos(omega0;
real sin_omega = sin(omega);
real neg_sin_omega = -sin_omega;
}

All convergence measures failed for the slope disturbance variance: poor sampling, strong autocorrelations over hundreds of lags, R-hat etc. Mixing very poor. I also found that even a simple integrated random walk model shows similar problems with the slope disturbance variance. Only the observation disturbance variance converges. Using the same (synthetic) dataset and model in PyMC gives perfect results. Therefore, I suspect that my model implementation is at least sub-optimal. I will first follow your recommendations regarding the simple integrated random walk model and will report the results. If I get this simple model working I will add a cycle, follow your instructions again, and let you know.

I hope itâ€™s okay to comment to this older post. I got curious about the claim that Stan does poorly at a model that PyMC samples well. I tried the recommended changes and those help but it seems slope also needs a non-centered parametrization (at least under the simulation in the SSM_cmdstanpyâ€¦dataset.py notebook):

With the non-centered parametrization (and max_treedepth = 12), the convergence diagnostics for the slope disturbance variance â€“ thatâ€™s sigma_slope â€“ improve.

I would like to try your example and see whether it works better. I wonder if you could send me the python code you used and/or the STAN file. To be honest, I find it frustrating that I canâ€™t get any reasonable sampling for even simple state space models with Stan, though everything works perfectly when using pymc or pymc-experimental.

I actually find the explicit non-centered parametrization (so slope = slope_ini + sigma_slope * slope_raw) instead of <offset=slope_ini, multiplier=sigma_slope>) easier to understand.

Aside: This way of implementing the non-centered parametrization also seems to work better on this problem. (For example: no need to increase max_treedepth.) I donâ€™t know why this might be the case.

Thanks a lot. I will work on it and let you know about my findings. Maybe a quick remark: cylce_ini must be a vector of size 2 as the recursive equations for the cycle assume that you have decomposed a cycle with non-zero phase into a cosine wave and sine wave each one with phase 0.