Simple model that converges in JAGS but not in Stan

#1

Dear All,

I am a relatively new Stan user struggling with the unstable behavior of a model that works without problems in JAGS. I am wondering if the behavior is somehow linked to the way I wrote the Stan code. I would appreciate your help with this problem.

The model contains regularly spaced missing observations, and both the JAGS and Stan versions account for that.

The variance grows between observations and collapses whenever the next observation occurs.
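For concreteness, the SD vector has roughly this shape: it grows with the gap since the last observation and resets when one occurs (this is only an illustration; is_observed and base_sd are placeholders, not my actual data):

    data {
      int<lower=1> N0;
      int<lower=0, upper=1> is_observed[N0];  // placeholder indicator for observed days
    }
    transformed data {
      vector[N0] sd_example;
      real base_sd = 0.01;  // placeholder per-step scale
      int gap = 0;
      for (i in 1:N0) {
        gap = is_observed[i] == 1 ? 0 : gap + 1;  // reset at each observation
        sd_example[i] = base_sd * sqrt(gap + 1);  // variance accumulates between them
      }
    }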

JAGS model:

    Model= "model{
        for(i in 2:N0){
          FACTOR1[i] ~ dnorm(FACTOR1[i-1],TAU[i])
       } }"

(In all plots, the gray lines represent the 80% confidence interval.)

Stan Model:

    data {
      int<lower=1> N0;
      vector[N0] Factor1_0;
      int FCT1_NA_COUNT;
      int FCT1_NA_INDEX[FCT1_NA_COUNT];
      vector[N0] FACTOR1_SD;
    }
    parameters {
      vector<lower=0.3, upper=1.7>[FCT1_NA_COUNT] FCT1_MISSING;
    }
    transformed parameters {
      vector[N0] FACTORS;
      FACTORS = Factor1_0;
      FACTORS[FCT1_NA_INDEX] = FCT1_MISSING;
    }
    model {
      for (i in 2:N0) {
        // FACTORS[i] ~ normal(FACTORS[i-1], 0.04);          // Version 1
        FACTORS[i] ~ normal(FACTORS[i-1], FACTOR1_SD[i]);    // Version 2
      }
    }

Version 1 of the Stan code is relatively stable and fast and yields the expected results (but not the results that I am trying to model). The second version, with the increasing variances (all below 0.04), doesn’t converge, and the simulations don’t produce the expected behavior (I’ll include some plots to show that).

Stan Version 1:

Stan Version 2:


Thank you very much in advance for all your help.

Regards
Peter

[edit: escape model code]

#2

Are all the FACTOR1_SDs > 0?

If so, maybe try adding a bit of extra noise as an experiment, to see if scales close to zero are causing numerical issues.
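Something like this, purely as an experiment (the 1e-3 floor is an arbitrary value):

    transformed data {
      // Keep the random-walk scale away from zero (1e-3 is arbitrary).
      vector[N0] FACTOR1_SD_safe = FACTOR1_SD + 1e-3;
    }
    model {
      for (i in 2:N0)
        FACTORS[i] ~ normal(FACTORS[i-1], FACTOR1_SD_safe[i]);
    }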

Can you post the data for this? I can’t tell from the plots which data points you have and which are missing, and that looks important for interpreting FACTOR1_SD.

1 Like
#3

Wait, shouldn’t that be

    vector<lower=0>[N0] FACTOR1_SD;

?

1 Like
#4

Since the post talks about variance, I just want to make it clear that JAGS uses precision (inverse variance) and Stan uses scale (standard deviation) to parameterize the normal.
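Concretely:

    // JAGS parameterizes the normal by precision:      y ~ dnorm(mu, tau)
    // Stan parameterizes it by standard deviation:     y ~ normal(mu, sigma)
    // These match when sigma = 1 / sqrt(tau), i.e., tau = 1 / sigma^2.
    // For example, a Stan scale of 0.04 corresponds to a JAGS precision of 1 / 0.04^2 = 625.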

Can you share the JAGS version and data that converge? Why are the FCT1_MISSING given those hard interval bounds? Did you first try without them? Hard bounds can be a real problem computationally and statistically if the missing values are consistent with values near the boundaries.
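For example, dropping the hard bounds would just be (the normal(1, 0.5) here is a placeholder soft prior, not a recommendation):

    parameters {
      vector[FCT1_NA_COUNT] FCT1_MISSING;  // no hard interval constraint
    }
    model {
      FCT1_MISSING ~ normal(1, 0.5);  // placeholder soft prior in place of the bounds
    }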

Finally, you can write the random walk as

    FACTORS[2:N0] ~ normal(FACTORS[1:N0 - 1], FACTOR1_SD[2:N0]);

but you’d probably want to define FACTOR1_SD without that first element rather than having to slice it each time, given that it’s data. Or you could do the slicing once in transformed data:

    transformed data {
      vector[N0 - 1] sigma = FACTOR1_SD[2:N0];
    }
    ...
    model {
      FACTORS[2:N0] ~ normal(FACTORS[1:N0 - 1], sigma);
    }
3 Likes
#5

Hello Ben again!

Thanks for your time. I think this could be one of the factors, but what I realized is that there is some problem with the implementation of the model in Stan.

    data {
      int<lower=1> N0;
      vector[N0] Factor1_0;
      int FCT1_NA_COUNT;
      int FCT1_NA_INDEX[FCT1_NA_COUNT];
      vector[N0] FACTOR1_SD;
      int<lower=1> N0_DATA;
      int FACTOR1_DATA_IDX[N0_DATA];
    }
    parameters {
      vector<lower=0.3, upper=1.7>[FCT1_NA_COUNT] FCT1_MISSING;
    }
    transformed parameters {
      vector[N0] FACTORS;
      FACTORS = Factor1_0;
      FACTORS[FCT1_NA_INDEX] = FCT1_MISSING;
    }
    model {
      FACTORS[2:N0] ~ normal(FACTORS[1:N0 - 1], 0.2);
    }

If I increase the standard deviation of the model to 0.2, I get the following:

As you can see, the median of the realizations (the black line) approaches 1.

Nota bene: the upper gray line represents the 90% quantile, the lower the 10% quantile.

I enclose the data and programs in both JAGS and Stan (please note that you will need to update the file paths within the code).

00.SP500OHLCV.csv (1.2 MB)
JAGS_ver.R (1.9 KB)
Stan.R (4.3 KB)
Stan.stan (464 Bytes)
WILSHIRE_GNP.csv (4.4 KB)

Thank you very much,

Kind Regards
Peter

#6

Hello Matthijs,

I updated that in the code, thanks!

Kind Regards
Peter

#7

Hello Bob,

Thank you for replying. I use precision in JAGS, and the code is enclosed in my response to Ben.

Regarding FCT1_MISSING: my guess is that this is potentially the heart of the problem, but honestly I am having trouble implementing it correctly. Here is one of the versions:

    data {
      int<lower=1> N0;
      vector[N0] Factor1_0;
      int FCT1_NA_COUNT;
      int FCT1_NA_INDEX[FCT1_NA_COUNT];
      vector[N0] FACTOR1_SD;
      int<lower=1> N0_DATA;
      int FACTOR1_DATA_IDX[N0_DATA];
    }
    parameters {
      vector<lower=0.3, upper=1.7>[N0] FCT1_MISSING;
    }
    transformed parameters {
      vector[N0] FACTORS;
      FACTORS = Factor1_0;
      FACTORS[FCT1_NA_INDEX] = FCT1_MISSING[FCT1_NA_INDEX];
    }
    model {
      FCT1_MISSING[2:N0] ~ normal(FACTORS[1:N0 - 1], 0.2);
    }

This puzzles me a lot.
For simplicity I replaced FACTOR1_SD_t with 0.2 to have a closer look at the implementation.

Without missing values, I receive no parameters from the simulation.

The advice about the compact version of the random walk was spot on. Thanks!

Best Regards
Peter

#8

Real quick, what is the end goal here?

If you’re fitting a time series and want to get estimates of what is happening between points, usually this looks a lot more like a prediction sort of thing than a missing value thing.

So you fit a curve to all the red points and then predict the things in the middle.

Right now what is happening in the plots is kinda weird. Between the red points, the medians drift back to be near 1 and seem to ignore whatever data is on either side of them. This doesn’t seem like the usual assumption for time series.

1 Like
#9

Hello Ben,

Thank you for your response. I agree this looks very strange, and I would not expect to see that given the specification of the model.

The idea is to estimate the points between the known observations (red dots), which are then used to evaluate an integral in a later part of the code. The unobservable points are known to market participants to a certain extent, as they are a function of the monthly GNP and the daily SP500 (knowing the direction of the SP500 gives some idea of where the unobservable points can be).

I applied two other approaches:

  1. regression using the SP500 as an explanatory variable: the deviations of the red dots from the median are too big to be of practical use.
  2. splines: the model is fast if there is exactly the same number of days between observations, but in practice this is not the case; small differences in the number of points between observations require increasing max_treedepth, and given the complexity of the overall program this slows things down substantially.

The most realistic model that I see here is presented in the 2nd plot from the top. Would you have any other suggestions for how to make that Stan code work?

Thank you again for your help

Kind Regards
Peter

#10

If the splines weren’t flexible enough, have you looked at GPs? Does the second picture in Figure 2.2 of http://www.gaussianprocess.org/gpml/chapters/RW2.pdf seem interesting? That looks similar to the second plot from the top and you can do this in closed form.
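If it helps, here’s a rough sketch of what that closed-form interpolation could look like in Stan; the hyperparameters alpha, rho, and sigma are made-up values you’d want to tune or put priors on, and since there are no parameters you’d run it with the fixed-param sampler:

    data {
      int<lower=1> N_obs;     // observed (red) points
      real t_obs[N_obs];      // their times
      vector[N_obs] y_obs;    // their values
      int<lower=1> N_mis;     // times at which to interpolate
      real t_mis[N_mis];
    }
    transformed data {
      real alpha = 0.3;       // marginal SD (made up)
      real rho = 30;          // length scale in days (made up)
      real sigma = 0.01;      // small observation noise / jitter (made up)
      matrix[N_obs, N_obs] K = cov_exp_quad(t_obs, alpha, rho)
                               + diag_matrix(rep_vector(square(sigma), N_obs));
      matrix[N_obs, N_obs] L = cholesky_decompose(K);
    }
    generated quantities {
      vector[N_mis] f_mis;    // draws from the GP posterior at the missing times
      {
        matrix[N_obs, N_mis] K_x = cov_exp_quad(t_obs, t_mis, alpha, rho);
        vector[N_obs] L_div_y = mdivide_left_tri_low(L, y_obs);
        matrix[N_obs, N_mis] L_div_Kx = mdivide_left_tri_low(L, K_x);
        vector[N_mis] mu = L_div_Kx' * L_div_y;
        matrix[N_mis, N_mis] cov = cov_exp_quad(t_mis, alpha, rho)
                                   - L_div_Kx' * L_div_Kx
                                   + diag_matrix(rep_vector(1e-9, N_mis));  // jitter for stability
        f_mis = multi_normal_rng(mu, cov);
      }
    }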

How many days of data and such? Is it 5 data points over 250 days?

What are your goals beyond just fitting this data?

#11

Also depending on what your data is, there’s a time series package for sales sorta data you might play with: https://facebook.github.io/prophet/

And here are some fun plots with it: https://twitter.com/seanjtaylor/status/1123278380369973248

1 Like