I’m trying to fit a model with a lot of parameters to datasets of about 60,000 observations each. To save time initially, I sampled 10,000 observations from each full dataset to make sure the model worked correctly. I fit the model to these 10,000-observation subsets with 8 chains, and all chains converged for each dataset.
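For reference, the pilot-subset step above can be sketched as follows; this is a minimal stand-in (the `subsample` helper and the integer "observations" are hypothetical, not the actual data pipeline):

```python
import random

def subsample(observations, n=10_000, seed=42):
    """Draw a reproducible random subset of the data for a quick pilot fit."""
    rng = random.Random(seed)        # fixed seed so the pilot set is repeatable
    return rng.sample(observations, n)

full_data = list(range(60_000))      # stand-in for the ~60,000 real observations
pilot = subsample(full_data)
print(len(pilot))                    # 10000
```

Fixing the seed makes it easy to rerun the same pilot subset while iterating on the model.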
However, when I run the model on the complete datasets (~60,000 observations each), a few chains get “stuck” in regions of the parameter space and never converge to where the other chains settle. Is there any reason this might occur with larger datasets? Should I use a longer warmup? A longer first phase of warmup, to reach the typical set (some hyperparameters don’t appear to be in a reasonable place in the stuck chains)? The model takes 2–4 days to fit on the full data, so it’s not easy to test things quickly. Any ideas would be welcome!
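One way to quantify the “stuck chain” symptom from saved draws, rather than eyeballing traceplots, is the split-R-hat statistic. Below is a minimal pure-Python sketch for a single parameter (the simulated Gaussian draws are stand-ins, not real MCMC output, and this omits the rank-normalization used in modern implementations):

```python
import math
import random

def split_rhat(chains):
    """Split-R-hat (Gelman-Rubin) for one parameter.

    chains: list of equal-length lists of post-warmup draws, one per chain.
    Values well above ~1.01 suggest the chains are not mixing.
    """
    # Split each chain in half so within-chain drift also inflates R-hat.
    halves = []
    for c in chains:
        m = len(c) // 2
        halves.append(c[:m])
        halves.append(c[m:2 * m])
    n = len(halves[0])
    means = [sum(h) / n for h in halves]
    grand = sum(means) / len(means)
    # Between-chain variance B and mean within-chain variance W.
    B = n * sum((mu - grand) ** 2 for mu in means) / (len(halves) - 1)
    W = sum(sum((x - mu) ** 2 for x in h) / (n - 1)
            for h, mu in zip(halves, means)) / len(halves)
    var_plus = (n - 1) / n * W + B / n   # pooled variance estimate
    return math.sqrt(var_plus / W)

rng = random.Random(0)
# Four well-mixed chains targeting the same distribution.
mixed = [[rng.gauss(0, 1) for _ in range(1000)] for _ in range(4)]
# Same, but with one chain stuck in a different region.
stuck = mixed[:3] + [[rng.gauss(5, 1) for _ in range(1000)]]
print(split_rhat(mixed))   # close to 1
print(split_rhat(stuck))   # well above 1.1
```

Running this per parameter (especially on the suspect hyperparameters) makes it easy to confirm which quantities the stuck chains disagree on before committing to another multi-day run.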