1-step-ahead forecasting model fitting

Hey all,

I’ve been trying to fit this model for months, and can’t figure out what’s going wrong, but suspect a combination of non-identifiability and dynamical complexity.

Now, I’m trying to write a 1-step-ahead forecasting model, and keep getting syntax errors that don’t make sense to me? Anywhere, here’s the model…

functions{
  real[] dZ_dt(
    real t,
    real[] Z,
    real[] theta,
    real[] x_r,
    int[] x_i){
    real P = Z[1];
    real H = Z[2];

    real r = theta[1];
    real O = theta[2];
    real h = theta[3];
    real b = theta[4];
    real c = theta[5];
    real u = theta[6];

    real dP_dt = P*r - H*(exp(O)*P/(1 + exp(O)*P*exp(h)));
    real dH_dt = b + H*(c*(exp(O)*P/(1 + exp(O)*P*exp(h)))-u);
    return({dP_dt,dH_dt});
  }
}

data {
  int<lower=0>N; // Define N as non-negative integer
  real ts[N]; // Assigns time points to N
  real<lower=0>y[N,2]; // Define y as real and non-negative
}

parameters {
  real<lower=0>r;
  real<lower=0,upper=1>O;
  real<lower=0>h;
  real<lower=0>b;
  real<lower=0>c;
  real<lower=0,upper=1>u;
  real<lower=0>sigma[2];
}

transformed parameters{
  for (i in 2:N){ // Start at 2nd time step because giving the first
  real Z[i,2]
  = integrate_ode_rk45(dZ_dt, // Function
  y[i-1], // Initial value (empirical data point at previous time step)
  ts[i-1], // Initial time step
  ts[i], // Next time step (time step to be solved/estimated)
  {r, exp(O), exp(h), b, c, u},
  rep_array(0.0,2),rep_array(0,2),1e-10,1e-10,2e4);
  }
}

model {
  r~normal(2.5,1);
  O~normal(log(0.01),2);
  h~normal(log(0.07),2);
  b~normal(35,1);
  c~normal(0.3,1);
  u~normal(0.4,1);
  sigma~lognormal(-1, 1);
  for (k in 1:2) {
    y[ , k] ~ lognormal(log(Z[ , k]), sigma[k]);
  }
}

generated quantities {
  real y_rep[N, 2];
  for (k in 1:2) {
    for (n in 1:N)
      y_rep[n, k] = lognormal_rng(log(Z[n, k]), sigma[k]);
  }
}

O and h are logged because their true values are close to 0, and when multiplied (as they are in the mechanistic model) are VERY close to zero.

And here’s the syntax error I can’t seem to shake:

SYNTAX ERROR, MESSAGE(S) FROM PARSER:
Fourth argument to integrate_ode_rk45 must have type real[ ];  found type = real. 
 error in 'model2486523de9f_OSA_Lognormal' at line 51, column 49
  -------------------------------------------------
    49:   ts[i], // Next time step (time step to be solved/estimated)
    50:   {r, exp(O), exp(h), b, c, u},
    51:   rep_array(0.0,2),rep_array(0,2),1e-10,1e-10,2e4);
                                                        ^
    52:   }
  -------------------------------------------------

PARSER EXPECTED: ")"
Error in stanc(filename, allow_undefined = TRUE) : 
  failed to parse Stan model 'OSA_Lognormal' due to the above error.

Despite a ton of incredible support from this community, I’ve been stuck on this model for way too long, so any help would be appreciated!

I’m also happy to update this topic with what I was trying before if that would be helpful :)

1 Like

The function has signature real[ , ] integrate_ode_rk45(function ode, real[] initial_state, real initial_time, real[] times, real[] theta, real[] x_r, int[] x_i, real rel_tol, real abs_tol, int max_num_steps)

For the argument real[] times you’re passing in ts[i] which is of type real, which is the meaning of the error message Fourth argument to integrate_ode_rk45 must have type real[ ]; found type = real. The difference is that real[] is an array, and real is just a scalar.

I’ve not used the ODE methods before, but I’d guess that there should just be one call to the ODE solver that looks something like integrate_ode_rk45(dZ_dt, y[1], ts[1], ts, ...) rather than calling it in a loop? The argument times are the points at which the solution is returned.

EDIT: To fix just the error you have, replace ts[i] with ts[i:i], which will make sure it’s the right type.

4 Likes

Okay okay, that makes a lot of sense––I’ve made the change and the error is resolved! Thank you so much!

You’re right about the ODE solver; usually it’s just a single call, not a loop, but I’m trying to get the solver to solve at each time point independently, rather than using memory of previous solutions to inform the current solution.

I’m now (somehow) getting a more confusing error in the model block:

model {
  r~normal(2.5,1);
  O~normal(log(0.01),2);
  h~normal(log(0.07),2);
  b~normal(35,1);
  c~normal(0.3,1);
  u~normal(0.4,1);
  sigma~lognormal(-1, 1);
  for (k in 1:2) {
    y[ , k] ~ lognormal(log(Z[ , k]), sigma[k]);
  }
}

That reads:

SYNTAX ERROR, MESSAGE(S) FROM PARSER:
Variable "Z" does not exist.
 error in 'model2481bab4a7c_OSA_Lognormal' at line 63, column 29
  -------------------------------------------------
    61:   sigma~lognormal(-1, 1);
    62:   for (k in 1:2) {
    63:     y[ , k] ~ lognormal(log(Z[ , k]), sigma[k]);
                                    ^
    64:   }
  -------------------------------------------------

Error in stanc(filename, allow_undefined = TRUE) : 
  failed to parse Stan model 'OSA_Lognormal' due to the above error.

I figure this must be an issue arising from the loop in the transformed parameters block, but have no idea how to address it:

transformed parameters{
  for (i in 2:N){ // Start at 2nd time step because giving the first
  real Z[i,2]
  = integrate_ode_rk45(dZ_dt, // Function
  y[i-1], // Initial value (empirical data point at previous time step)
  ts[i-1], // Initial time step
  ts[i:i], // Next time step (time step to be solved/estimated)
  {r, exp(O), exp(h), b, c, u},
  rep_array(0.0,2),rep_array(0,2),1e-10,1e-10,2e4);
  }
}

If you have any ideas, I’d be super appreciative to hear them :)

1 Like

Right, because you’re using it to go from known point to an estimate at some future time.

In the transformed parameters block you have written

for(i in 2:N){
    real Z[i, 2] = ...;
}

This is declaring an array of reals of size [i, 2] named Z on every iteration of the loop. The type specifiers real, vector etc. are used to declare and create new variables, and also the Z inside the for loop is in the scope of the for loop – see here https://mc-stan.org/docs/2_23/reference-manual/for-loops.html

What I believe you want is something like this:

transformed parameters{
real Z[N, 2];
for(i in 2:N){
    Z[i, :] = ...;
}

Which first declares an array named Z, and then assigns values to it’s elements.

1 Like

Glad the community has been helpful! Hopefully we can help you finish sorting this out.
ODE models are not my specialty, so I’m not 100% sure about this, but I think (like @RJTK suggested) you probably don’t need the loop because the integrator essentially handles that for you. So maybe all you need in transformed parameters is something like this:

transformed parameters{
real Z[N, 2] = integrate_ode_rk45(
  dZ_dt,  // Function
  y[1],   // Initial state
  ts[1],  // Initial time
  ts,     // Array of solution times
  {r, exp(O), exp(h), b, c, u},
  rep_array(0.0,2),rep_array(0,2),1e-10,1e-10,2e4);
}

Hopefully someone with more ODE experience like @bbbales2 can chime in here because he would know a lot better than I would.

1 Like

I think what OP has done is correct, each initial point y[i] is known, and they want to compare the forecast of y[i + 1] given by the ODE model to the actual value. AFAICT this does actually require a loop?

1 Like

Ah, good point!

@RJTK, This makes a lot of sense to me––thank you!

I tried your solution, and unfortunately, Stan seems unhappy with the “:”…

I changed the TP block to:

transformed parameters{
real Z[N, 2];
for(i in 2:N){
  Z[i, :] = integrate_ode_rk45(dZ_dt, // Function
  y[i-1], // Initial value (empirical data point at previous time step)
  ts[i-1], // Initial time step
  ts[i:i], // Next time step (time step to be solved/estimated)
  {r, exp(O), exp(h), b, c, u},
  rep_array(0.0,2),rep_array(0,2),1e-10,1e-10,2e4);
  }
}

Which throws the error code:

SYNTAX ERROR, MESSAGE(S) FROM PARSER:
Dimension mismatch in assignment; variable name = Z, type = real[ ]; right-hand side type = real[ , ].
Illegal statement beginning with non-void expression parsed as
  Z[i, :]
Not a legal assignment, sampling, or function statement.  Note that
  * Assignment statements only allow variables (with optional indexes) on the left;
  * Sampling statements allow arbitrary value-denoting expressions on the left.
  * Functions used as statements must be declared to have void returns

 error in 'model1c63cd99a8e_OSA_Lognormal_2' at line 45, column 2
  -------------------------------------------------
    43: real Z[N, 2];
    44: for(i in 2:N){
    45:   Z[i, :] = integrate_ode_rk45(dZ_dt, // Function
         ^
    46:   y[i-1], // Initial value (empirical data point at previous time step)
  -------------------------------------------------

PARSER EXPECTED: "}"
Error in stanc(filename, allow_undefined = TRUE) : 
  failed to parse Stan model 'OSA_Lognormal_2' due to the above error.

@Jonah: Thanks too for your response! Yes, I should have been clearer: I’m trying to write a one-step-ahead forecasting model, wherein the estimation at the current time step is made using only the previous time step. I’ve tried fitting this model traditionally:

transformed parameters {
  real Z[N, 2]
    = integrate_ode_rk45(dZ_dt, y_init, 0, ts, {r, exp(O), exp(h), b, c, u},
                         rep_array(0.0, 2), rep_array(0, 2),
                         1e-10, 1e-10, 2e4);
}

But think that the dynamical complexity of the data prevents the model from capturing the oscillations in the data, which, I think, could be worked around with a one-step-ahead model.

Because of this, I think I need a loop, but I’m open to trying anything! I tried your suggestion as well, and while syntactically sound, I’m not sure it would give me what I’m looking for?

1 Like

The error

Says that the LHS is a 1d array (it’s Z[i, :]) whereas the RHS is a 2D array real[ , ]. The correct statement is possibly Z[i:i, :] = integrate_ode_rk45(…)`? I’ve not used the ODE methods so I’m not certain.

1 Like

The 1 step ahead thing is essentially then a hack to allow some kind of system noise in – a kind of crude approach to a stochastic differential equation, I would say. You could go a step further and introduce a latent noise term, with variance g * delta t (ie some weighting of the time difference from last step), such that
y ~ lognormal(log(Z[ , k]+noise[k]), sigma[k])
This would still be kind of rough but less so, and at least more along the lines of where you’re trying to go, I think…

4 Likes

We discussed the possible non-identifiabilities of the model at Change from global model fit to one-step-ahead prediction

I would still strongly suspect the non-identifiability is the main problem - especially since you mention you work with O and h on the log scale as they are very small and I would expect some of the non-identifiabilities to arise precisely when O is small.

I don’t want to pressure you into following my advice (it might in the end turn out to be wrong advice :-), but my experience is that attempts to work around modelling problems with some empirical tweaks to how the model is coded rarely help and usually only delay the necessity to delve in the math and understand the root of the problem. But obviously, you do you - it definitely seems you are learning a lot about Stan in the process and that’s also something :-)

EDIT: If you also could share a plot of the actual data, that might also help us see if the observed dynamics is rich enough to inform all the parameters…

Looking at the traceplots in Model fitting and sampling issue: Only 1 chain sampling properly makes me suspect even more that non-identifiability is an issue as there seems to be strong negative correlation between O and h

Best of luck!

1 Like

Yes, this works! Awesome––thank you so much! I’ll update this post after I run everything again to say how it went! Thank you so much for your help and patience! :)

1 Like

This is a neat idea; I’ll give it a try! Thanks!! :)

1 Like

Yes––I still very much agree that this is an identifiability issue. I tried combining O and h, as well as simplifying the mechanistic model, without much luck. I think part of the problem is that when multiplied, the true value of Oh is very small. I tried logging the prior to get around this, but to no avail, so I’m kind of at a loss.

Here’s what I tried when logging the priors for O and h (I also tried combining O and h, and logging that single prior, but I won’t show that here for brevity):

functions{
  real[] dZ_dt(
    real t, 
    real[] Z, 
    real[] theta, 
    real[] x_r, 
    int[] x_i){
    real P = Z[1]; 
    real H = Z[2];

    real r = theta[1]; 
    real O = theta[2];
    real h = theta[3];
    real b = theta[4];
    real c = theta[5];
    real u = theta[6];

    real dP_dt = P*r - H*(exp(O)*P/(1 + exp(O)*exp(h)*P));
    real dH_dt = b + H*(c*(exp(O)*P/(1 + exp(O)*exp(h)*P))-u);
    return({dP_dt,dH_dt});
  }
}
data {
  int<lower=0> N;           
  real ts[N];                 
  real y_init[2]; 
  //real y0[2];
  real<lower=0> y[N, 2];    
}
parameters {
  real<lower=0>r;
  real O;
  real h;
  real<lower=0>b;
  real<lower=0>c;
  real<lower=0,upper=1>u;
  real<lower=0>Z_init[2];  
  real<lower=0>sigma[2];   
}
transformed parameters {
  real Z[N, 2]
    = integrate_ode_rk45(dZ_dt, y_init, 0, ts, {r, exp(O), exp(h), b, c, u},
                         rep_array(0.0, 0), rep_array(0, 0),
                         1e-10, 1e-10, 5e2);
}
model {
  r~normal(2.5,1);
  O~normal(log(0.01),2); 
  h~normal(log(0.07),2); 
  b~normal(35,1); 
  c~normal(0.3,1); 
  u~normal(0.4,1);
  sigma~lognormal(-1, 1);
  Z_init~lognormal(log(3), 1);
  for (k in 1:2) {
    y_init[k] ~ lognormal(log(Z_init[k]), sigma[k]);
    y[ , k] ~ lognormal(log(Z[ , k]), sigma[k]);
  }
}
generated quantities {
  real y_init_rep[2];
  real y_rep[N, 2];
  for (k in 1:2) {
    y_init_rep[k] = lognormal_rng(log(Z_init[k]), sigma[k]);
    for (n in 1:N)
      y_rep[n, k] = lognormal_rng(log(Z[n, k]), sigma[k]);
  }
}
2 Likes

I presume you can’t share the (plot of) your actual data? I totally understand people being reluctant to share unpublished data, but it makes it harder to guess what is happening.

So even the simplest model, e.g.

\frac{dP}{dt} = Pr

has fitting issues?

One more thing to try (and it is just guess) is to estimate with maximum likelihood (you can use the same Stan model - remove/skip all priors and use optimizing) and see how much do the ML estimates change when you perturb the data a bit (e.g. add or remove 1 here and there). If the model is poorly identified, you would expect at least some of the parameters to have wildly varying ML estimates upon perturbing. What subset of parameters fluctuates this way may be helpful to pinpoint the issue.

Also, looking at the pairs plot (excluding Z) could shed some additional light here.

Hi Martin,

Here’s my data:


And if I try to fit this data to the simplified mechanistic model, there are fitting issues, yes.

When I give the model the true values for O and h, it fits very well, which pretty well identifies non-identifiability as the problem, but is nonetheless strange considering that the model has fitting issues when the priors for O and h are severely restricted.

Unfortunately, I don’t have a very informative pairs plot to show as I can’t get multiple chains to converge.

I like your ML idea; I’ll give that a go. Thank you! :)

1 Like

I’ve now been playing with a version of the model including your corrections for a few days, and while the model seems to compile correctly, it runs into problems when sampling, and rejects the initial value:

CHECKING DATA AND PREPROCESSING FOR MODEL 'OSA_Log' NOW.

COMPILING MODEL 'OSA_Log' NOW.

STARTING SAMPLER FOR MODEL 'OSA_Log' NOW.
starting worker pid=1389 on localhost:11778 at 19:42:10.809
starting worker pid=1403 on localhost:11778 at 19:42:10.985

CHECKING DATA AND PREPROCESSING FOR MODEL 'OSA_Log' NOW.

COMPILING MODEL 'OSA_Log' NOW.

STARTING SAMPLER FOR MODEL 'OSA_Log' NOW.

SAMPLING FOR MODEL 'OSA_Log' NOW (CHAIN 1).
Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: lognormal_lpdf: Location parameter[1] is nan, but must be finite!  (in 'model1c54d0e2ad4_OSA_Log' at line 58)
...
Chain 2: 
Chain 2: Initialization between (-2, 2) failed after 100 attempts. 
Chain 2:  Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
[1] "Error in sampler$call_sampler(args_list[[i]]) : Initialization failed."
error occurred during calling the sampler; sampling not done
here are whatever error messages were returned
[[1]]
Stan model 'OSA_Log' does not contain samples.

[[2]]
Stan model 'OSA_Log' does not contain samples.

Warning message:
In .local(object, ...) :
  some chains had errors; consider specifying chains = 1 to debug

So I’m not sure what’s going on, but it seems like Stan might not be reading the loop/reading in the data as intended. Any ideas?

Thank you! :)

The logarithm of non-positive numbers is nan, so if any element of Z is \le 0 then the initialization will fail. Since your model is pretty complicated you might need to find a fixed set of initializing parameters that ensures Z > 0.

Alternatively, you could try some modified models or reparameterizations. Possibly put log_y = log(y) in the transformed data and then use log_y ~ normal(Z), or use log(y) ~ normal(Z) and correct the lpdf gradient.

1 Like

If by “fitting issues” you mean divergences/treedepth warnings, then I would try to resolve those first (as those almost certainly propagate to the more complex model as well).

This is IMHO a bit strange, but not unheard of - how much restricted were the priors? In a toy example with a problematic Gaussian process I once played with, the model had problems even with very restricted priors… (e.g. beta(430, 70))

Also what happens, if you give the model true values only for O or only for h?

Those could still help - if possible post both complete pairs plots as well as pairs plots taking only samples from the chains that didn’t have too many divergences (if there are such)

Oh, those look quite rich - so I now find it unlikely that what you are seeing is a problem primarily with some property of the data.

Hey @martinmodrak––sorry for the late reply!

Sorry, I should have clarified: when I fit the data generated using the complex model to the simpler model, the fit is very poor.

When I give the model the true value for h, it fits very well, which is encouraging! When I give the model the true value for O, it does not fit very well. Here are some graphical outputs from that run (where I estimate r, h, b, c, u, and the population abundances, using 4 chains at 2000 inter):

functions {
  real[] dz_dt(real t,       
               real[] z,     
               real[] theta, 
               real[] x_r,  
               int[] x_i) {
    real P = z[1];
    real H = z[2];

    real r = theta[1];  
    real O = theta[2];
    real b = theta[3];
    real c = theta[4];
    real u = theta[5];

    real dP_dt = P*r - H*(O*P/(1 + O*0.075*P));
    real dH_dt = b + H*(c*(O*P/(1 + O*0.075*P))-u);
    return { dP_dt, dH_dt };
  }
}
data {
  int<lower = 0> N;           
  real ts[N];                 
  real y_init[2];             
  real<lower = 0> y[N, 2];    
}
parameters {
  real<lower = 0> theta[5];   
  real<lower = 0> z_init[2];  
  real<lower = 0> sigma[2];   
}
transformed parameters {
  real z[N, 2]
    = integrate_ode_rk45(dz_dt, z_init, 0, ts, theta,
                         rep_array(0.0, 0), rep_array(0, 0),
                         1e-5, 1e-3, 5e2);
}
model {
  theta[{1}] ~ normal(2.5, 1);
  log(theta[{2}]) ~ normal(0.5,0.5);
  theta[{3}] ~ normal(35,1);
  theta[{4, 5}] ~ normal(0.5,0.5);
  sigma ~ lognormal(-1, 1);
  z_init ~ lognormal(log(140), 1);
  for (k in 1:2) {
    y_init[k] ~ lognormal(log(z_init[k]), sigma[k]);
    y[ , k] ~ lognormal(log(z[, k]), sigma[k]);
  }
}
generated quantities {
  real y_init_rep[2];
  real y_rep[N, 2];
  for (k in 1:2) {
    y_init_rep[k] = lognormal_rng(log(z_init[k]), sigma[k]);
    for (n in 1:N)
      y_rep[n, k] = lognormal_rng(log(z[n, k]), sigma[k]);
  }
}


So now, I’m just having a strange amount of trouble estimating h. I’ve tried estimating O and h together and separately, with and without log-transformed priors, and with and without very strict priors, none of which worked, so I’m a bit stuck? What I included above is the closest I’ve gotten to fitting this model.

1 Like