Cross-conditional dependence in random walk models

Hi,

I have programmed two 1st-order random walk models that, in principle, should be equivalent but I obtain different results. The first model corresponds to a full-conditional formulation of a Poisson time series where each element of the series is conditioned to the rest of the series (both “past” and “future”), the second model corresponds to the traditional formulation of random walk time series where each observation is conditioned just on its past.

Specifically, the first of the models is:

data {
  int<lower=0> N;
  array[N] int<lower=0> y; 
}
parameters {
  real beta; 
  real<lower=0> sigma_phi; 
  vector[N] phi; 
}
model {
  y ~ poisson_log(beta + phi * sigma_phi);
  
  phi[1] ~ normal(phi[2],1);
  phi[2:(N-1)] ~ normal((phi[1:(N-2)]+phi[3:N])/2, 2^(-0.5));
  phi[N] ~ normal(phi[N-1],1);

  mean(phi) ~ normal(0, 0.001); 
}
generated quantities {
  vector[N] mu = exp(beta + phi * sigma_phi);
}

and the second

data {
  int<lower=0> N;
  array[N] int<lower=0> y; 
}
parameters {
  real beta; 
  real<lower=0> sigma_phi; 
  vector[N] phi; 
}
model {
  y ~ poisson_log(beta + phi * sigma_phi);
  
  for(i in 2:N){
    phi[i] ~ normal(phi[i-1], 1);
  }
  mean(phi) ~ normal(0, 0.001); 
}
generated quantities {
  vector[N] mu = exp(beta + phi * sigma_phi);
}

I have used improper flat prior distributions for beta, phi[1] (in the second model) and sigma_phi (on the positive real line). I have called these two models (“RW_fullcond.stan” and “RW_past.stan”) from R by doing

# Call to the full conditional model
data = list(N=length(y), y=y)
set.seed(123)
fit.RW.fullcond <- stan(file = "RW_fullcond.stan", data = data, chains = 3, iter = 5000)

# Call to the model conditioning on the past
set.seed(1)
fit.RW.past <- stan(file = "RW_past.stan", data = data, chains = 3, iter = 5000)

where I obtain the following summary statistics for sigma_phi:

summary(fit.RW.fullcond)$summary["sigma_phi",]
#    mean  se_mean       sd     ...
# 0.04345  0.00324  0.01735     ...
summary(fit.RW.past)$summary["sigma_phi",]
#    mean  se_mean       sd   ...
# 3.70e-01 3.15e-03 9.16e-02  ...

corresponding differences were also observed for phi, even for mu.
What are these differences due to? Does Stan maybe missunderstand the cyclic conditional relationships of the full conditional model? Should I avoid model formulations of that kind in Stan (JAGS and Nimble do not allow them)? Is this documented anywhere?

Thanks in advance.

I don’t think those models are equivalent. They condition on very different things. To be the same in Stan, the density p(\beta, \sigma_\phi, \phi | y, N) needs to be the same in both models (up to a constant factor that doesn’t depend on the parameters).

Maybe it’ll help to understand what Stan does. It’s not trying to understand anything. It literally treats every distribution statement (with ~) and target += statement as incrementing the underlying log density. Then it uses automatic differentiation on that target to sample. So if two Stan programs produce different answers, it’s because they define different log density functions. In order for two Stan programs to produce the same answer, the log density function needs to be the same up to a constant so that the derivatives are the same.

Stan’s using a scale-based normal distribution, so 2^-0.5 induces a sale of 1/sqrt(2).

Also, you can write

for(i in 2:N) {
  phi[i] ~ normal(phi[i-1], 1);
}

as

phi[2:] ~ normal(phi[:N], 1);

I also think it’d be clearer if you just wrote

phi[2:] ~ normal(phi[:N], sigma_phi);

rather than multiplying through by sigma_phi in the linear predictor for y. You can then declare phi as

vector<multiplier=sigma_phi> phi;

to produce the non-centered parameterization.