Looking for help improving performance of an Ornstein-Uhlenbeck model

Hello,
I’ve been working on this model for a while now and started with a relatively simple model that fitted well as a starting point.
The issues began once I introduced the parameter t_switch as a vector instead of a real, and I would be extremely happy if somebody could share opinions or ideas to help me improve this model.
Besides very long runtimes of up to 60 minutes, my main concern is the high number of divergent transitions, which can be 80-90% of all iterations.
After trying to introduce a transformed data block and a lot of other things, I am at a point where I am running out of ideas, because I am still rather inexperienced with Stan.
At the moment the Stan model is fitted to dummy data generated from the same likelihood that the Stan model uses. If useful, I can gladly post that code in this thread as well. For now, here is my Stan model:

data {
  int<lower=0> t;  // number of time points
  int N;           // number of concentration levels (segments)
  vector[t] x;     // observed series
}

transformed data {
  int t_sw_guess;
  t_sw_guess = t / N;  // rough spacing between switches (integer division)
}

parameters {
  real<lower=0> tau;                // characteristic time
  vector<lower=0>[N] x2;            // concentration levels (N is enough for the Stan model)
  vector<lower=0>[N - 1] t_switch;  // positions of the concentration increments
  real<lower=0> sigma_x;            // fluctuation
}

model {
  int m;
  real t1;
  real t2;
  vector[N] t_sw;

  // Priors
  tau ~ normal(150, 150);
  sigma_x ~ exponential(2.5);
  for(i in 2:N-1) {
    t_switch[i] ~ normal(i*t_sw_guess, t_sw_guess);
  }
  x2 ~ normal(2, 2);
  
  x[1] ~ normal(x2[1], sigma_x);

  m = 1;                           // index of the current segment
  t_sw = append_row(t_switch, t);  // add the final boundary at t

  for (i in 2:t) {
    if (m > N) break;
    if (i - 1 <= t_sw[m]) {
      // still within segment m: one OU step relaxing towards x2[m]
      x[i] ~ normal((x[i - 1] - x2[m]) * exp(-1 / tau) + x2[m], sigma_x * sqrt(1 - exp(-2 / tau)));
    } else {
      // the step from i-1 to i crosses switch m: split it into t1 (towards x2[m]) and t2 (towards x2[m+1])
      t1 = t_sw[m] - (i - 2);
      t2 = (i - 1) - t_sw[m];
      x[i] ~ normal(((((x[i - 1] - x2[m]) * exp(-t1 / tau) + x2[m]) - x2[m + 1]) * exp(-t2 / tau)) + x2[m + 1], sigma_x * sqrt(1 - exp(-2 / tau)));
      m = m + 1;
    }
  }
  
}

The model is based on an Ornstein-Uhlenbeck process and will eventually be fitted to data that looks quite similar to the dummy data shown in the image here, where the vertical red lines indicate the t_switch positions.
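For context, the sampling statement in the first branch of the loop is the discrete-time OU transition with unit time step, where $\mu$ denotes the current level x2[m]:

$$
x_i \mid x_{i-1} \sim \mathcal{N}\!\left((x_{i-1} - \mu)\, e^{-1/\tau} + \mu,\ \sigma_x \sqrt{1 - e^{-2/\tau}}\right),
$$

and the else branch composes two such transitions within a single step, relaxing towards x2[m] for the fraction t1 of the step and towards x2[m+1] for the remaining t2 = 1 - t1.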

Letting it run for 3000 iterations in this example (I also tried 5000 and 7000 iterations), I get this pairs plot, which indicates large uncertainty and problems with the t_switch parameters.

The last diagnostic I can provide is the nuts_params output in .csv format
nuts_params.csv (332.7 KB)

I really hope these diagnostics make it easy for someone here with more experience than me to find a mistake or performance weakness in the code.

I can’t look at this in detail, but I would suggest a more careful parameterization of the states. Typically you want to non-center the unobserved states and center the observed states. See for example Section 2.3 of An Infinitesimal Introduction to Stochastic Differential Equation Modeling.
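For anyone reading along, here is a minimal sketch of what a non-centered OU parameterization can look like; the latent state z, its length T, and the other names are purely illustrative (in the posted model x is observed data rather than a latent state):

data {
  int<lower=1> T;
}
parameters {
  real<lower=0> tau;
  real<lower=0> sigma;
  vector[T] z_std;       // standardized innovations (the non-centered parameters)
}
transformed parameters {
  vector[T] z;           // latent OU states rebuilt deterministically from z_std
  z[1] = sigma * z_std[1];
  for (i in 2:T)
    z[i] = z[i - 1] * exp(-1 / tau)
           + sigma * sqrt(1 - exp(-2 / tau)) * z_std[i];
}
model {
  z_std ~ std_normal();  // implies the marginal OU prior on z
  // an observation model conditional on z would go here
}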

2 Likes

The t_switch vector must be positive_ordered, and ideally even positive_ordered<upper=t>. The latter constraint doesn’t exist, though, so you need a workaround:

parameters {
  simplex[N] t_switch;
  ...
}
model {
  vector[N] t_sw = (t-1)*cumulative_sum(t_switch);
  ...

(I think you want t_sw[N] = t-1 instead of t, as that is the largest value that can reach the else branch inside the loop.)
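As a possible follow-up that is not part of the suggestion above: if you still want to encode the prior expectation of roughly evenly spaced switches, a Dirichlet prior on the simplex can express that; the concentration value of 2 below is only a placeholder:

model {
  vector[N] t_sw = (t - 1) * cumulative_sum(t_switch);
  t_switch ~ dirichlet(rep_vector(2, N));  // alpha > 1 gently favors evenly spaced gaps between switches
  ...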

3 Likes

I am trying this at the moment and the first results look very promising. Thank you very much. This is the first time I have come across simplexes used this way. Is there a specific reason why this helps increase $n_\mathrm{eff}$ so dramatically?