Why does dropping constants give different parameter estimates if seed and inits are the same?

jonah · July 24, 2020, 10:05pm

In this thread https://discourse.mc-stan.org/t/what-is-the-difference-between-specifying-x-n-mu-simga2-and-target-normal-lpdf-mu-sigma2 @bbbales2 and I were both surprised that using the same inits and same seed weren’t enough to give the same results when changing from sampling notation to target += notation. I expected differences in lp__ obviously, but not in parameter values.

For example, why aren’t the inferences for x identical in the example below?

# Fit two simple models, only differing on whether 
# sampling notation or target+= is used

library("cmdstanr") 

file1 <- write_stan_tempfile(
  "
  parameters {
  real x;
  }
  model {
  x ~ normal(1, 1);
  }
  "
)

file2 <- write_stan_tempfile(
  "
  parameters {
  real x;
  }
  model {
  target += normal_lpdf(x | 1, 1);
  }
  "
)

mod1 <- cmdstan_model(file1)
mod2 <- cmdstan_model(file2)

# initialize to 0 and use same seed
fit1 <- mod1$sample(init = 0, seed = 123, save_warmup = TRUE)
fit2 <- mod2$sample(init = 0, seed = 123, save_warmup = TRUE)

I expect different values for lp__ but I also get slightly different results for x, certainly within monte carlo error, but it still surprised me because the initial values and seed are the same:

> fit1$print("x", "mean", "sd")
 variable mean   sd
        x 1.01 0.99
> fit2$print("x", "mean", "sd")
 variable mean   sd
        x 1.00 1.01

Presumably this is the right behavior but I’m struggling to understand why. Can anyone explain this?

bgoodri · July 24, 2020, 10:07pm

Look to see if the first couple iterations are the same. After that, whether there are or are not “constants” can cause it to round differently on the last decimal place and eventually cause the chains to differ.

jonah · July 24, 2020, 10:27pm

Thanks, that makes sense. The earliest iteration at which they differ is iteration 72 in chain 3:

x1 <- fit1$draws("x", inc_warmup = TRUE)
x2 <- fit2$draws("x", inc_warmup = TRUE)

for (j in 1:4) {
  cat("Chain", j, "first iteration different: ", which(x1[,j,] != x2[,j,])[1], "\n")
}

Chain 1 first iteration different:  501 
Chain 2 first iteration different:  233 
Chain 3 first iteration different:  72 
Chain 4 first iteration different:  192

bgoodri · July 24, 2020, 10:29pm

Yeah, that is totally possible. Nothing is actually constant while a computer is on.

Topic		Replies	Views
Target += vs. sampling: does dropping constants preserve sampling behavior? General	5	497	October 13, 2022
What is the difference between specifying X ~ N(mu, simga2) and target += normal_lpdf(mu, sigma2) General	17	948	July 26, 2020
Different results on the same data with the same seed Modeling fitting-issues	1	313	May 18, 2023
Target() == -inf at the beginning of model block Modeling	2	213	December 4, 2023
Different WAIC for the exact same model Modeling rstan , loo	10	153	October 3, 2024

Why does dropping constants give different parameter estimates if seed and inits are the same?

Related topics