I am running CmdStan, and I was told (in this forum) that a slight change in the model can produce drastic changes in convergence time, even with the same dataset, same number of samples, etc. The reason is that a slight change in the model produces a different chain evolution, and some chain may get badly delayed somewhere.
So, is there a way to tell the sampler that if it spends too much time sampling in one place, it should move the chain somewhere else?
Thank you, Ezequiel.
I think you can control this partially by choosing appropriate priors and initial conditions for your chains.
Hi! I was referring to something more dynamic. For instance: if a chain gets stalled and produces no new sample for 10 minutes, then change its location by some algorithm and forget about the previous location.
I say so because I’ve seen incredible differences in sampling time from very slight changes, which suggests there might be some way to stop insisting once a chain has been stalled for some given time.
Does what I’m saying make sense? If yes, perhaps there is already a setting for this; I just didn’t find anything.
The computational cost per gradient evaluation is pretty much constant no matter where in parameter space a chain sits. If a chain takes a long time to return a single sample, that is because sampling in that region of parameter space, with the step size the sampler is using, requires a large number of gradient evaluations to hit the dynamic stopping criterion. If you REALLY want to force the sampler to avoid these long integrations, you can specify a max_treedepth of less than the default 10, or you can turn off warmup and supply a large step size. It is useful to tinker with this stuff to get a better feel for how Stan works, but if your main goal is to get results quickly, then I strongly recommend against doing either of these things, because you need to make sure that your sampler adapts adequately to your posterior. Some posteriors are hard.
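In CmdStan these knobs are passed on the command line. A rough sketch, assuming a hypothetical compiled model binary `./model` and data file `data.json` (your names will differ):

```shell
# Cap the NUTS tree depth at 8 instead of the default 10,
# which bounds the number of gradient evaluations per iteration
./model sample algorithm=hmc engine=nuts max_depth=8 \
        data file=data.json output file=samples.csv

# Or skip warmup adaptation entirely and fix a step size by hand
./model sample num_warmup=0 adapt engaged=0 \
        algorithm=hmc stepsize=0.5 \
        data file=data.json output file=samples.csv
```

Again, both of these trade correctness for speed; iterations that hit the tree-depth cap are truncated rather than completed.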
This will not lead to valid inference. The whole point of MCMC methods is to do a (hopefully) unbiased exploration of the posterior. If there’s some region of the posterior that requires long integration times, then you will poison your inference by injecting a rule that avoids returning samples from this region.
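As a toy illustration of that point (plain Python, not Stan itself): if you draw from a distribution but systematically discard draws from one region, which is hypothetically what a "give up on hard regions" rule would do, the resulting estimates are biased:

```python
import random

random.seed(0)

# Draws from a standard normal "posterior" (true mean = 0)
samples = [random.gauss(0.0, 1.0) for _ in range(100_000)]

# A rule that avoids one region: drop everything above 1.5,
# mimicking a sampler that never returns samples from there
kept = [x for x in samples if x < 1.5]

full_mean = sum(samples) / len(samples)  # close to the true mean, 0
biased_mean = sum(kept) / len(kept)      # pulled downward, roughly -0.14

print(full_mean, biased_mean)
```

The "fast" estimate is systematically wrong, and no amount of extra sampling under the same rule fixes it.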
Thanks Jacob, I think I understand the gist of what you mean.
(Nevertheless, it still puzzles me that the same model goes from taking 3 days to 8 hours simply because I randomly made a small change… But I guess this is just luck)
I think it’s called “experience”