I am running 8 parallel chains for a large Gaussian process model (which will probably be scaled up further), and from some preliminary results I anticipate issues with the mixing of the chains. I have two main questions.
-
The first issue is runtime: different parallel chains can take anywhere from under 1 hour to more than 10. I know some variation is expected, but I didn't expect differences of that order, so I'm assuming there could be a computational explanation. The machines running this have plenty of CPUs and more than enough memory for the job. Is this expected in general, or is something else going on here?
-
The second issue is convergence and mixing itself. The likelihood of each chain appears stationary, but not all chains are mixing properly. I am already juggling the amount of data / model size (ideally I want to include as much as possible), the length of the chains, and the total runtime (it could easily run for weeks). I am using PyStan, and one of the first suggestions I found is to add
control={'adapt_delta': 0.9}
(or 0.99) and to increase the warmup (burn-in) length in the sampling call, but I'm guessing this may affect performance. I read the Stan and PyStan references and it was not clear what the trade-offs are here. Is there a limit to how much this can help? (As I said, I'll probably be operating at the limits of runtime and data set size.)
-
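For concreteness, this is roughly the call I have in mind (the model and data names are placeholders, and the sampling line is commented out since it needs a compiled model; the option names are the PyStan 2 `control` keys as I understand them):

```python
# Tuning options I'm considering passing to PyStan 2's sampling().
control = {
    'adapt_delta': 0.99,   # target acceptance rate, up from the 0.8 default
    'max_treedepth': 12,   # often suggested alongside adapt_delta (default 10)
}

# fit = model.sampling(data=data, chains=8, iter=4000, warmup=2000,
#                      control=control, n_jobs=8)
print(control)
```

My understanding is that a higher `adapt_delta` forces smaller step sizes, so each iteration gets more leapfrog steps and is slower; I'd like to know how far that trade-off is worth pushing.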
Finally, suppose the likelihoods of half the chains converge and mix, but the rest appear stationary around lower values. Of course, if they all mix, the expectations can be computed from any of the samples; but what would be the recommendation in this other case?
I could imagine using only the chains converging around the higher likelihood (assuming there are enough of them to assess between- and within-chain convergence), but I'm not sure that's correct.
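To make "between and within convergence" concrete, this is the kind of check I mean: a hand-rolled sketch of the basic Gelman-Rubin R-hat in NumPy (my own simplified version, not Stan's split-R-hat implementation), applied to a toy case where half the chains are stuck elsewhere:

```python
import numpy as np

def rhat(samples):
    """Basic Gelman-Rubin R-hat for draws of shape (n_iter, n_chains)."""
    n, m = samples.shape
    chain_means = samples.mean(axis=0)
    W = samples.var(axis=0, ddof=1).mean()   # within-chain variance
    B = n * chain_means.var(ddof=1)          # between-chain variance
    var_hat = (n - 1) / n * W + B / n        # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(0)
mixed = rng.normal(size=(1000, 8))   # 8 well-mixed chains
stuck = mixed.copy()
stuck[:, :4] += 5.0                  # half the chains sit around another value

print(rhat(mixed))   # close to 1
print(rhat(stuck))   # far above 1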
Bonus question: how exactly does the permuted option of the extract function work? It seemed like it randomly permutes the samples, but from a shorter run I got a figure that makes it look like it's just stacking the traces chain by chain.
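To illustrate the two behaviors I'm trying to distinguish, here is a toy NumPy sketch (this is my reading of the docs, not PyStan's actual code): keeping the (iteration, chain) layout, naively stacking the chains one after another, and pooling then randomly permuting the draws:

```python
import numpy as np

rng = np.random.default_rng(1)
n_iter, n_chains = 5, 3
draws = np.arange(n_iter * n_chains).reshape(n_iter, n_chains)  # toy draws

# permuted=False style: draws keep their (iteration, chain) layout
unpermuted = draws

# naive stacking: chain 0's full trace, then chain 1's, then chain 2's
stacked = draws.T.reshape(-1)

# permuted=True style (as I understand it): pool all chains, then shuffle
pooled = draws.reshape(-1)
permuted = rng.permutation(pooled)

print(stacked[:n_iter])    # the first chain's full trace, in order
print(permuted[:n_iter])   # a random mix of draws from all chains
```

My figure looks like the `stacked` case, which is what confused me.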
Thanks.