Pairs() plot

rstan::stan() gives the following warning, so I ran pairs() on the stanfit S4 object, but the panels are drawn very small. I cannot make them out, and I do not understand what the plot shows.
Could you tell me what the warning means?

Examine the pairs() plot to diagnose sampling problems

Thank you for the reply and for telling me about pairs().

Below is the pairs() output for my model, i.e., the result of `pairs(fit, pars = c("A", "lp__", "z[4]"))`, where fit is an object of the S4 class stanfit. Fitting this object produced the following warnings:

Warning messages:
1: There were 8086 divergent transitions after warmup. Increasing adapt_delta above 0.9999 may help. See #divergent-transitions-after-warmup
2: Examine the pairs() plot to diagnose sampling problems

z[4] is ideally infinity, but I imposed an upper bound of 100000.

Is there a problem?
I am not sure what this image means or how it is drawn.
Does each panel plot the MCMC samples for the pair of parameters given by its row and column?

The plot shows a large number of divergences. It's a clear failure. That doesn't mean it's difficult to fix,
but you need to tell us more about the model; otherwise nobody will be able to help.
Please post the model.
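
For reading the plot, a minimal R sketch may help. It assumes `fit` is a stanfit object; the helper names `count_divergent` and `inspect_divergences` are made up for illustration and are not part of rstan. As far as I know, the pairs() method for stanfit objects marks divergent draws in red and, by default, splits the draws between the lower and upper triangles by whether their `accept_stat__` is below or above the median.

```r
# Count divergent post-warmup transitions from per-chain sampler diagnostics.
# `sampler_params` is a list with one matrix per chain, in the shape returned
# by rstan::get_sampler_params(fit, inc_warmup = FALSE).
count_divergent <- function(sampler_params) {
  per_chain <- vapply(
    sampler_params,
    function(chain) sum(chain[, "divergent__"]),
    numeric(1)
  )
  sum(per_chain)
}

# Hypothetical helper: print diagnostics and draw a readable pairs plot.
inspect_divergences <- function(fit, pars) {
  sp <- rstan::get_sampler_params(fit, inc_warmup = FALSE)
  message(count_divergent(sp), " divergent transitions after warmup")
  rstan::check_hmc_diagnostics(fit)
  # Plotting only a few parameters per call keeps each panel large enough to read.
  graphics::pairs(fit, pars = pars)
}
```

Calling `inspect_divergences(fit, pars = c("A", "lp__", "z[4]"))` would then reproduce the plot above along with the counts. If the red points concentrate in one region of a panel, that region is usually where the sampler is struggling.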

Thank you very much for the reply. I'm relieved to hear that my model can possibly be fixed.

Here is pairs() for the other parameters of the stanfit S4 object.

Is there also a problem in this picture?

Note that the R-hat statistic is less than 1.01 for all parameters in my model.

See for more discussion.


Thank you for the reply. I have read it.

The page above describes two ways to remove divergences: decreasing the step size, and rewriting the model as an exactly equivalent one. I tried the first method, but it did not work well.

I set `control = list(adapt_delta = 0.9999)`, but this had no effect for my model. So I think my model description should be changed, but I am not sure how to change it.
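
For reference, this is roughly how the setting is passed. It is only a sketch: `make_control` is a made-up helper, and `"model.stan"` and `stan_data` are placeholders rather than my actual file and data. The defaults shown (0.8 and 10) are rstan's documented defaults for `adapt_delta` and `max_treedepth`.

```r
# Hypothetical helper: build the `control` list passed to rstan::stan().
# adapt_delta must lie strictly between 0 and 1; values closer to 1 force
# a smaller step size during adaptation.
make_control <- function(adapt_delta = 0.8, max_treedepth = 10) {
  stopifnot(adapt_delta > 0, adapt_delta < 1, max_treedepth >= 1)
  list(adapt_delta = adapt_delta, max_treedepth = max_treedepth)
}

# Usage sketch; the file name and data object are placeholders.
# fit <- rstan::stan(file = "model.stan", data = stan_data,
#                    control = make_control(adapt_delta = 0.9999))
```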

To tell the truth, I also tried the second method, i.e. changing the description of my model.

Some parameter in my model has a very strong bias.

The page above shows only one example of reducing divergences, namely the Gaussian case with a linear parameterization. I want to know about the binomial case, because my model uses a binomial distribution. There, the probability parameter is defined through some function of the model parameters, so how should the parameters be centered?

X \sim \text{Binomial}(prob = foo(\theta) ,N)

where foo is some nonlinear function of \theta, which is a model parameter, and N is the number of Bernoulli trials. I wonder whether it is possible to center foo() when it is a nonlinear function, or whether we need to approximate foo() by a linear function in order to center. Such an approximation would introduce a new bias.

Centering is difficult for non linear functions…
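
One way to write this down, purely as a sketch: suppose (this is an assumption, not necessarily my actual model) that each \theta_i has a hierarchical normal prior with location \mu and scale \tau. Then the non-centered reparameterization acts on the prior for \theta_i, and foo() is applied unchanged afterwards, so no linear approximation of foo() would be needed:

$$
\begin{aligned}
\tilde{\theta}_i &\sim \text{Normal}(0, 1) \\
\theta_i &= \mu + \tau \, \tilde{\theta}_i \\
X_i &\sim \text{Binomial}(prob = foo(\theta_i), N)
\end{aligned}
$$

Under that assumption this is exactly equivalent to \theta_i \sim \text{Normal}(\mu, \tau) followed by the same binomial likelihood. Whether it removes the divergences depends on the actual prior structure.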

Is there any paper describing the divergent transition issue other than the following two? I want to describe briefly in my paper how to avoid divergent transitions in my model, but I have not read any book or paper that directly describes the issue.

The first reference below, a book chapter, is difficult for me to get.

Betancourt, Michael, and Mark Girolami. 2015. “Hamiltonian Monte Carlo for Hierarchical Models.” In *Current Trends in Bayesian Methodology with Applications*, edited by Umesh Singh, Dipak K. Dey, and A. Loganathan. Chapman & Hall/CRC Press.

Rubin, Donald B. 1981. “Estimation in Parallel Randomized Experiments.” *Journal of Educational and Behavioral Statistics* 6 (4): 377–401.

That paper is also freely available on the arXiv, although it doesn’t discuss divergences. You can cite the case study directly as

Betancourt, Michael (2017). Diagnosing Biased Inference with Divergences. Retrieved from .

For divergences you can also reference (see Section 5.1 and Section 6.2 for more discussion) or (see Section 7.4).

Thank you!!
I will read it!!

After reading your page above, I was able to remove the warnings, as shown below. Thank you again for the plain explanation of the divergent transition issue!!

0 of 8000 iterations ended with a divergence.

Tree depth:
0 of 8000 iterations saturated the maximum tree depth of 15.

E-BFMI indicated no pathological behavior.