**TL;DR**: I’m modeling missing data points in a time series with something like a simplex parameterization. It’s relatively fast, but it seems to have relatively low `n_eff/sample`, and I’d like to make it better.

I’m modeling a time series:

`Y_t ~ normal(Y_t-1 + mu, sigma)`

with missing observations:

`Y[3] = nan`

`Y[4] = nan`

I’m dealing with the missing observations by treating them as parameters to be estimated; if there’s an observation at time `t=2`, `Y_2`, and another observation at `t=5`, `Y_5`, then `Y_3` and `Y_4` are inferred as parameters. What do we know about those parameters? Since this is a time series, we know that the difference between the two observed time points (`Y_5` and `Y_2`) must be explained by the steps between `Y_2`, `Y_3`, `Y_4`, and `Y_5`.

`Y_5 - Y_2 == (Y_5 - Y_4) + (Y_4 - Y_3) + (Y_3 - Y_2)`
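A quick sanity check of that identity (plain NumPy with hypothetical values; not part of the model):

```python
import numpy as np

# Hypothetical levels; indices 1 and 4 play the roles of Y_2 and Y_5.
Y = np.array([0.0, 2.7, 6.0, 8.5, 11.2])

steps = np.diff(Y[1:5])  # (Y_3 - Y_2), (Y_4 - Y_3), (Y_5 - Y_4)
assert np.isclose(steps.sum(), Y[4] - Y[1])  # the steps telescope to the observed gap
```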

(This is particularly natural if we know that `Y` is monotonically increasing, in which case the time series model is `Y_t ~ normal(Y_t-1 + mu, sigma) T[Y_t-1,]`. The monotonic case isn’t germane to my question today, but it’s the use case I’m eventually building up to.)

The way that I actually model this in Stan is not to treat `Y_4` and `Y_3` as parameters (the levels), but to treat `(Y_5 - Y_4)`, `(Y_4 - Y_3)`, and `(Y_3 - Y_2)` as parameters (the steps). We’re not modeling the levels; we’re modeling the steps. I can then constrain these steps using the equation above, and back out what the levels `Y_3` and `Y_4` should be. At the end of the post is my Stan model for this.
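In Python terms, the transform amounts to the following (a NumPy sketch with made-up draws; `raw` stands in for `restricted_latent_parameters`, and the endpoint values are taken from the example output below):

```python
import numpy as np

Y_2, Y_5 = 5.98, 16.71           # observed endpoints of the gap
raw = np.array([0.2, 0.5, 0.4])  # hypothetical draws in (-1, 1), one per unobserved step

steps = raw / raw.sum() * (Y_5 - Y_2)  # rescale so the steps sum exactly to Y_5 - Y_2
levels = Y_2 + np.cumsum(steps)        # cumulative sum turns steps back into levels

Y_3, Y_4 = levels[0], levels[1]        # the inferred missing levels
assert np.isclose(levels[-1], Y_5)     # the final level lands exactly on the observed Y_5
```

One thing this makes visible: the rescaling divides by `raw.sum()`, which blows up as the draws approach cancelling out.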

**The problem is as follows**: One of the parameters is unnecessary. If you know `Y_5 - Y_2`, `Y_5 - Y_4`, and `Y_4 - Y_3`, then you know `Y_3 - Y_2`. If we could lop off one of these parameters, surely things would be better. Is there a way to do this? Is there a better way to attack the problem?
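To make the redundancy concrete (a NumPy sketch, not a proposed reparameterization): once the total `Y_5 - Y_2` is fixed, any two steps determine the third.

```python
import numpy as np

total = 16.71 - 5.98                  # Y_5 - Y_2, known from the data
free_steps = np.array([2.73, 5.49])   # hypothetical (Y_3 - Y_2) and (Y_4 - Y_3)
last_step = total - free_steps.sum()  # (Y_5 - Y_4) is fully determined

steps = np.append(free_steps, last_step)
assert np.isclose(steps.sum(), total)  # 3 steps, but only 2 degrees of freedom
```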

Example output (with `mu=3`, `sigma=1`, and `Y_3` and `Y_4` missing):

```
Inference for Stan model: anon_model_4888889db6e52ba8f2aaf6c10bceb20d.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
                                   mean se_mean      sd   2.5%    25%     50%    75%  97.5% n_eff  Rhat
mu                                 3.05  9.9e-4    0.05   2.96   3.02    3.05   3.09   3.14  2212   1.0
sigma                              1.03  7.7e-4    0.03   0.97   1.01    1.03   1.06    1.1  1896   1.0
restricted_latent_parameters[0] -3.8e-3    0.46    0.65  -0.95  -0.62   -0.03   0.61   0.95     2  3.07
restricted_latent_parameters[1] -6.6e-3    0.46    0.65  -0.97  -0.63    0.01   0.62   0.96     2   3.1
restricted_latent_parameters[2]  5.6e-4    0.46    0.65  -0.95  -0.62 -4.1e-3   0.62   0.95     2   3.1
Y_latent[0]                        2.73 3.1e-16 4.4e-16   2.73   2.73    2.73   2.73   2.73     2   nan
Y_latent[1]                        5.98 6.3e-16 8.9e-16   5.98   5.98    5.98   5.98   5.98     2   nan
Y_latent[2]                        8.49     0.0     0.0   8.49   8.49    8.49   8.49   8.49     2   nan
Y_latent[3]                       11.22    0.01    0.73   9.69  10.76   11.27  11.69  12.61  4000   1.0
Y_latent[4]                       13.98    0.01    0.72  12.56  13.51   13.96  14.44  15.41  4000   1.0
Y_latent[5]                       16.71     0.0     0.0  16.71  16.71   16.71  16.71  16.71     2   nan
Y_latent[6]                       19.79 5.0e-15 7.1e-15  19.79  19.79   19.79  19.79  19.79     2   nan
...
```

If there are no missing observations then we have markedly higher `n_eff`; sometimes all the way up to 1 `n_eff/sample`.

And here’s the model:

```
data {
  int T;                // number of time steps
  vector[T] Y;          // data to model (missing values encoded as NaN)
  int n_missing_steps;  // total number of unobserved steps across all gaps
}
parameters {
  real mu;
  real<lower=0> sigma;
  vector<lower=-1, upper=1>[n_missing_steps] restricted_latent_parameters;
}
transformed parameters {
  vector[T] Y_latent;
  {
    int restricted_param_counter;
    int gap_width;
    real previous_value;
    int previous_value_index;
    restricted_param_counter = 1;
    Y_latent[1] = Y[1];
    previous_value = Y[1];
    previous_value_index = 1;
    for (t in 2:T) {
      if (!is_nan(Y[t])) {
        Y_latent[t] = Y[t];
        gap_width = t - previous_value_index;
        if (gap_width > 1) {
          // These are the unobserved STEPS between observed time steps.
          // I.e. if Y_3 and Y_1 are observed, but Y_2 is not, these are (Y_3 - Y_2) and (Y_2 - Y_1).
          // We will say that these steps have to sum up to the observed difference between Y_3 and Y_1.
          // The unobserved levels then have values that are the cumulative sum of these steps.
          Y_latent[previous_value_index+1:t-1] =
            cumulative_sum(
              restricted_latent_parameters[restricted_param_counter:restricted_param_counter+gap_width-1]
              / sum(restricted_latent_parameters[restricted_param_counter:restricted_param_counter+gap_width-1])
              * (Y[t] - previous_value)
              // Take the parameters and divide by their sum, then multiply by the total step
              // that needs to be accounted for. You now have a set of steps that lead up to
              // the next observed level.
            )[1:gap_width-1] + previous_value;
          // We don't include the last step in this sum, since we already explicitly grabbed the level
          // we ultimately get to from the data itself.
          // Each gap of width g consumes g step parameters, so advance the counter by gap_width.
          restricted_param_counter = restricted_param_counter + gap_width;
        }
        previous_value = Y[t];
        previous_value_index = t;
      }
    }
  }
}
model {
  mu ~ normal(0, 1);
  sigma ~ cauchy(0, 4);
  for (t in 2:T) {
    Y_latent[t] ~ normal(Y_latent[t-1] + mu, sigma);
  }
}
```
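For completeness, synthetic data like that behind the example output can be generated along these lines (a NumPy sketch; the seed, series length, and starting value are assumptions, not the ones actually used):

```python
import numpy as np

rng = np.random.default_rng(0)  # hypothetical seed

mu, sigma, T = 3.0, 1.0, 50
Y = np.empty(T)
Y[0] = 0.0
for t in range(1, T):
    Y[t] = rng.normal(Y[t - 1] + mu, sigma)  # Y_t ~ normal(Y_{t-1} + mu, sigma)

Y[2] = Y[3] = np.nan  # knock out Y_3 and Y_4 (1-based indexing, as in the post)
n_missing_steps = 3   # a gap of 2 missing levels contributes 3 unobserved steps
```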