Loo.brmsfit inexplicably warns about mismatch of yhash attributes

I’m using loo to compare categorical models fit using brms with cmdstan on remote Linux instances.

Today I ran into a problem where the loo objects of models fit to the same dataset using different versions of brms and cmdstanr cannot be compared without the warning:
Not all models have the same y variable. ('yhash' attributes do not match)
And indeed, when I call attributes(loo_object)$yhash, the strings do differ, like so:

attributes(Mod04_Person3pl_StemX3pl_loo)$yhash
[1] "cbdddd13d96776ed82ff6339b88005b4fc406aa8"

attributes(Mod04_Person3pl_StemX3pl_Corr_loo)$yhash
[1] "29c95c57a1eda060bc20b2281b44333b6edeeddc"

But the datasets are the same. The two models above differ only in whether there’s a correlation parameter between varying slopes and intercepts.
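
For anyone wanting to check several loo objects at once, the comparison above can be sketched in base R. This is only an illustration: the helper name `collect_yhash` is made up here, and the two stand-in objects are plain lists carrying the hashes quoted above, not real loo objects.

```r
# Hypothetical helper: gather the 'yhash' attribute from a named list of
# loo objects and report whether all of them agree.
collect_yhash <- function(loo_list) {
  hashes <- vapply(
    loo_list,
    function(x) {
      h <- attr(x, "yhash", exact = TRUE)
      # Objects without a stored hash are reported as NA.
      if (is.null(h)) NA_character_ else as.character(h)
    },
    character(1)
  )
  list(
    hashes = hashes,
    all_match = !anyNA(hashes) && length(unique(hashes)) == 1L
  )
}

# Stand-ins carrying the two hashes shown above (assumption: real loo
# objects would be passed here instead).
loo_a <- structure(list(), yhash = "cbdddd13d96776ed82ff6339b88005b4fc406aa8")
loo_b <- structure(list(), yhash = "29c95c57a1eda060bc20b2281b44333b6edeeddc")

res <- collect_yhash(list(no_corr = loo_a, corr = loo_b))
print(res$hashes)
print(res$all_match)
```

With real objects, the call would look like `collect_yhash(list(m1 = Mod04_Person3pl_StemX3pl_loo, m2 = Mod04_Person3pl_StemX3pl_Corr_loo))`.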

The first model was fit using brms 2.17, cmdstan 2.30.1, and cmdstanr 0.2.0.

The second model was fit using brms 2.18.7, cmdstan 2.30.1, and cmdstanr 0.5.3.

Is it safe to ignore the warning? For what it’s worth, the reported elpd_loo statistics do look plausible.

You can ignore the warning if you know that the data is the same. Without this warning, there would be a higher chance of people accidentally comparing models that were fitted using different transformations of the y variable. As y can be big, we store only the hash, but it seems the function computing the hash is not stable across different environments. Did you also use the same R version in both cases?
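
If you would rather verify than assume that the data is the same, one option is to compare the response values directly instead of their hashes. The following is a rough sketch in base R, not how brms computes the hash: `strip_response` and `same_response` are hypothetical helpers, and the `y_*` vectors are toy stand-ins for the responses of the two models.

```r
# Hypothetical helper: reduce a response to its bare values so that names,
# dims, or factor encoding cannot cause a spurious mismatch.
strip_response <- function(y) {
  # Compare factors by label rather than by integer code.
  if (is.factor(y)) y <- as.character(y)
  attributes(y) <- NULL
  y
}

# Hypothetical helper: TRUE only if both responses hold the same values
# in the same order.
same_response <- function(y1, y2) {
  a <- strip_response(y1)
  b <- strip_response(y2)
  length(a) == length(b) && isTRUE(all.equal(a, b))
}

# Toy stand-ins for the responses extracted from two fitted models.
y_a <- factor(c("a", "b", "c", "a"))
y_b <- structure(c("a", "b", "c", "a"), names = paste0("row", 1:4))
y_c <- c("a", "b", "b", "a")

print(same_response(y_a, y_b))
print(same_response(y_a, y_c))
```

With fitted models, the responses could presumably be pulled out with something like `brms::get_y(fit)` on each system and then passed to `same_response()`; treat that as an assumption about the workflow rather than a tested recipe.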

No, one remote system is running R version 4.2.1, the other 4.2.2.

I refreshed my memory on this, and I think only the brms version should matter. brms has tried to be careful about how the hash is computed, so that it takes into account only the values (and not, e.g., attribute order). Pinging @paul.buerkner in case he has ideas about what has changed from 2.17 to 2.18.7.

I don’t know for sure, but I don’t think I have changed the hashing between 2.17 and 2.18.7.