Unexplained change in convergence on hierarchical model

Hello fellow STAN-ers,

We’ve been working on a hierarchical model (menstrual cycle lengths woman-wise and the woman’s characteristics population-wise) for more than six months. We built the model incrementally and tested a lot of different datasets with different characteristics (to learn what the model could fit best and how). This summer, we were happy with our model’s convergence: the Rhat&ESS looked good and resulting interpretable parameters made sense.

Recently, I re-ran RStan only to find the same MC did not converge anymore. Since then, I’ve gone back to playing with the model, implementing it in cmdStanR, changing the sampler parameters, etc., without any success. It fails to converge at all (the parameters have Rhat between 1.2 and 4) and seems to be hitting the maximum tree depth in most iterations (75%).

  1. Has this “unexplainable” switch of convergence ever been seen before ? The dataset is exactly the same, the code versioning (git) shows no differences in the scripts and there have been no automatic or manual system updates. . The seed hasn’t been changed either.

  2. The two R wrappers, cmdStanR and Rstan give slightly different results (on smaller, easier, converging datasets with which I’m testing my implementation). I imagine this is to be expected?

  3. Do you have any idea of what I could further try? As I’m trying to replicate an existing model (described in a published article), I cannot re-parametrize it further unless there is no other option at all.

Thanks so much for any help, all input welcome!

data {
 int < lower = 1 > N; // Number of cycles
 int < lower = 1 > W; // Number of women
 int < lower = 1 > iwoman[N]; // Table: women to whom the cycle belongs
 vector[N] lnC; // Cycle lengths
parameters {
 real < lower = 0 > half_nu; // Degrees of freedom
 vector < lower = 0 >[N] phi; // Precision

 real mub;     // Population mean of women means
 real < lower = 0 > psi;   // Population variance of women means
 real < lower = 0 > a;   // Population parameters for women variances
 real < lower = 0 > b;

 vector[W] mu; // Women means
 vector < lower = 0 >[W] sigmasq; // Women variances
transformed parameters{
 real invpsi;
 invpsi = 1/psi;
model {
  half_nu ~ gamma(1.0e-3, 1.0e-3);
  phi ~ gamma(half_nu, half_nu);

  mub ~ normal(0,sqrt(1000));
  psi ~ inv_gamma(1.0e-3, 1.0e-3);
  a ~ gamma(1.0e-3, 1.0e-3);
  b ~ gamma(1.0e-3, 1.0e-3);

  mu ~ normal(mub, sqrt(psi));
  sigmasq ~ gamma(a, b);

  lnC ~ normal( mu[iwoman], sqrt(sigmasq[iwoman] ./ phi) );

fit_CL <- CL_fullhierarchy$sample(data = CL_train_data,
                                  iter_warmup = 3000, iter_sampling = 3000, thin = 10,
                                  chains = 4, threads_per_chain = 2, seed = 803214053,
                                  adapt_delta = 0.99, step_size = 0.01,
                                  refresh = 300, output_dir = paste0(diagnosticspath, "pi_cycle1/"))

Without a deep dive into your code, it is entirely (unfortunately) possible that a model that fits fine on a downsampled dataset will fail when applied to the larger version. This could be because there is some structure in the whole data which is eliminated in the smaller trial fits that causes prior/data or likelihood/data conflict and tough posterior geometry. Similarly, if you ran the model on previous datasets and recently added more data the new data could be causing similar problems. This could indicate some distribution shift in the sampled population or perhaps some issues with the most recent round of data collection.

The previous working fit stores Stan version and wrapper version, are those the same with the failed runs?

Yes. They are likely using different versions of Stan.

It’d be unlikely the reproducibility is broken (see 20 Reproducibility | Stan Reference Manual) so I’d go through the list to see if anything changed.

I only took a glance of the model, gamma(1.e-3, 1.e-3) seems unusual. It implies a large curvature in the likelihood function which may stress the sampler.

1 Like

Thank you for your reply! However, the model stopped fitting exactly the same dataset, which is puzzling.

Ah, my bad – didn’t notice that in the earlier post. Does the problem still occur if someone else in your group tries to run the model on their machine?

This is something of a long shot, but the default chain ID did change from 0 to 1 during Stan 2.28. This means the same pRNG seed would have different results before and after that version if you did not manually specify a chain ID. If your model is extremely sensitive to that kind of thing and if your version went from <2.28 to >2.28, it could be it?

1 Like

To diagnose whether it’s a version issue, you could use cmdstanr to install an older version and compare the results:

install_cmdstan(version = "2.27.0")

Thanks for the replies! A little additional info :

  • I realised my initial question is confusing, because I quote the cmdStanR command CL_fullhierarchy$sample(…). But the initial script that managed to converge all summer was in Rstan : so the actual command was with “fit_CL ← stan(file = CL_fullhierarchy, data = CL_train_data, warmup = 3000, iter = 6000, chains = 4, thin = 1, seed = 803214053, control = list(adapt_delta = 0.99))”. This doesn’t run anymore, which is why I tried out cmdStanR two weeks ago (thought maybe another wrapper will give more insights). (RStan and cmdStanR give compatible results on much smaller development data where they both converge.) I hope I’m explaining it clearly.

  • Checking in zsh, no R packages were changed since setting it up in January, except cmdStanR. So Rstan (rstan_2.28.2.9000) did not move version.

  • Someone else tried running on their machine - no success there either.