Why does dropping constants give different parameter estimates if seed and inits are the same?

In this thread https://discourse.mc-stan.org/t/what-is-the-difference-between-specifying-x-n-mu-simga2-and-target-normal-lpdf-mu-sigma2 @bbbales2 and I were both surprised that using the same inits and same seed weren’t enough to give the same results when changing from sampling notation to target += notation. I expected differences in lp__ obviously, but not in parameter values.

For example, why aren’t the inferences for x identical in the example below?

# Fit two simple models, only differing on whether 
# sampling notation or target+= is used

library("cmdstanr") 

file1 <- write_stan_tempfile(
  "
  parameters {
  real x;
  }
  model {
  x ~ normal(1, 1);
  }
  "
)

file2 <- write_stan_tempfile(
  "
  parameters {
  real x;
  }
  model {
  target += normal_lpdf(x | 1, 1);
  }
  "
)

mod1 <- cmdstan_model(file1)
mod2 <- cmdstan_model(file2)

# initialize to 0 and use same seed
fit1 <- mod1$sample(init = 0, seed = 123, save_warmup = TRUE)
fit2 <- mod2$sample(init = 0, seed = 123, save_warmup = TRUE)

I expect different values for lp__ but I also get slightly different results for x, certainly within monte carlo error, but it still surprised me because the initial values and seed are the same:

> fit1$print("x", "mean", "sd")
 variable mean   sd
        x 1.01 0.99
> fit2$print("x", "mean", "sd")
 variable mean   sd
        x 1.00 1.01

Presumably this is the right behavior but I’m struggling to understand why. Can anyone explain this?

Look to see if the first couple iterations are the same. After that, whether there are or are not “constants” can cause it to round differently on the last decimal place and eventually cause the chains to differ.

1 Like

Thanks, that makes sense. The earliest iteration at which they differ is iteration 72 in chain 3:

x1 <- fit1$draws("x", inc_warmup = TRUE)
x2 <- fit2$draws("x", inc_warmup = TRUE)

for (j in 1:4) {
  cat("Chain", j, "first iteration different: ", which(x1[,j,] != x2[,j,])[1], "\n")
}

Chain 1 first iteration different:  501 
Chain 2 first iteration different:  233 
Chain 3 first iteration different:  72 
Chain 4 first iteration different:  192 
1 Like

Yeah, that is totally possible. Nothing is actually constant while a computer is on.

1 Like