I wondered if it is possible to get the number of rejected draws in the chains. Is this number counted somewhere? I get 4000 posterior samples but how many attempts were not selected?
It would be interesting to see what percentage of samples is accepted, because I’m comparing 1-, 2- and 3-parameter IRT models and I want to know to what extent the longer run time of the 3PL models is due to rejecting more samples (because of bad likelihood) or simply to more work per sample draw (more parameters, more calculations).
I don’t think that’s currently available anywhere. When I wanted to use the information in my code, I scanned the model output and counted the number of lines that contained the word “reject”. Not super elegant, but it worked.
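A minimal sketch of that line-counting approach, assuming the sampler output has already been captured as a string. The function name, the case-insensitive match, and the log excerpt are illustrative assumptions, not part of any Stan interface.

```python
def count_reject_lines(log_text: str) -> int:
    """Count lines of captured sampler output that mention 'reject'.

    The match is case-insensitive, which is an assumption made here so that
    both 'rejected' and 'Rejecting' are counted.
    """
    return sum(1 for line in log_text.splitlines() if "reject" in line.lower())


# Hypothetical log excerpt, for illustration only.
sample_log = "\n".join([
    "Chain 1: Iteration:   1 / 2000 [  0%]  (Warmup)",
    "Chain 1: Informational Message: The current Metropolis proposal is about to be rejected",
    "Chain 1: Iteration: 200 / 2000 [ 10%]  (Warmup)",
    "Chain 2: Rejecting initial value:",
])

n_reject_lines = count_reject_lines(sample_log)
print(n_reject_lines)
```

Note that this counts warning lines, not transitions, so it says nothing about how many draws the sampler kept.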
Note that often the biggest contributor to changes in runtime is not the increased calculation per single evaluation of the density + gradient, but the number of gradient evaluations per iteration, as expressed by the achieved treedepth. To be specific: in each iteration, Stan repeatedly doubles the number of steps the integrator takes until it reaches a U-turn (or max_treedepth). The number of those doublings is stored in the fit as the treedepth diagnostic, so the number of density + gradient evaluations per iteration is roughly 2^treedepth. If the posterior geometry becomes more difficult to explore, the sampler will take shorter steps and require a larger treedepth to reach a U-turn. E.g. if the average treedepth increases by 3, the model is doing about 8 times as much work per iteration, even if a single evaluation of the density costs exactly the same. Since increasing the number of parameters usually changes the cost of a single evaluation only proportionally to the number of parameters, while the number of evaluations grows exponentially with treedepth, poor geometry can easily be the biggest contributor to runtime.
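The arithmetic can be made concrete with a short sketch. It assumes the rough rule of about 2^treedepth density + gradient evaluations per iteration; the function names and the treedepth values are made up for illustration.

```python
def relative_work(treedepth_a: float, treedepth_b: float) -> float:
    """Approximate ratio of gradient evaluations per iteration (b relative to a),
    assuming roughly 2**treedepth evaluations per iteration."""
    return 2.0 ** (treedepth_b - treedepth_a)


def approx_gradient_evals(treedepths) -> int:
    """Approximate total gradient evaluations over a run, given the
    per-iteration treedepth values (integers)."""
    return sum(2 ** depth for depth in treedepths)


# Treedepth going from 4 to 7 is three extra doublings.
print(relative_work(4, 7))

# Hypothetical per-iteration treedepths for two fits with the same number of draws.
simple_model = [3, 3, 4]
harder_model = [6, 6, 7]
print(approx_gradient_evals(simple_model), approx_gradient_evals(harder_model))
```

Summing 2^treedepth over iterations is a slightly better guide than exponentiating the average treedepth, because a few deep trees can dominate the total.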
Stan’s dynamic Hamiltonian Monte Carlo sampler is not a Metropolis-Hastings method and hence does not have a sense of “rejections”. Instead, in each iteration the sampler generates a numerical trajectory and then samples a new state from all of the points in that trajectory, possibly including the initial point.
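To illustrate only the "sample a state from the trajectory" idea, here is a toy sketch for a one-dimensional standard normal target. It is not Stan's algorithm: Stan grows the trajectory dynamically in both time directions and uses a more refined selection scheme, whereas this sketch uses a fixed number of forward steps and picks a point with probability proportional to exp(-H). All names and settings are assumptions made for the example.

```python
import math
import random


def leapfrog_trajectory(q, p, step_size, n_steps):
    """Leapfrog trajectory for a 1-D standard normal target.

    Potential energy is U(q) = q**2 / 2, so its gradient is simply q.
    Returns the initial point followed by n_steps integrated points.
    """
    points = [(q, p)]
    for _ in range(n_steps):
        p -= 0.5 * step_size * q   # half step for the momentum
        q += step_size * p         # full step for the position
        p -= 0.5 * step_size * q   # second half step for the momentum
        points.append((q, p))
    return points


def hamiltonian(q, p):
    """Total energy: potential q**2 / 2 plus kinetic p**2 / 2."""
    return 0.5 * q * q + 0.5 * p * p


def select_state(points, rng):
    """Pick one trajectory point with probability proportional to exp(-H)."""
    energies = [hamiltonian(q, p) for q, p in points]
    e_min = min(energies)  # subtract the minimum for numerical stability
    weights = [math.exp(-(e - e_min)) for e in energies]
    index = rng.choices(range(len(points)), weights=weights, k=1)[0]
    return index, points[index]


rng = random.Random(0)
points = leapfrog_trajectory(q=1.0, p=rng.gauss(0.0, 1.0), step_size=0.1, n_steps=7)
index, (q_new, p_new) = select_state(points, rng)
print(index, q_new)
```

If the selected index happens to be 0, the chain stays where it was for that iteration, which is the closest analogue to a "rejection" in this scheme.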
The cost of each transition then scales with the cost of extending the numerical trajectory by one point, multiplied by the number of points needed for that trajectory (which varies dynamically from iteration to iteration).
The cost of extending the numerical trajectory scales with the cost of evaluating the gradient of your target density function, and that will scale with the number of parameters and the amount of data. This cost is estimated at the beginning of each Stan run and reported as Gradient evaluation took X.XXXXX seconds.
The number of steps needed in each numerical trajectory, however, depends on the geometry of your posterior density function, which itself depends on the geometry of the realized likelihood function and the prior density function. The number of steps in each trajectory is saved as n_leapfrog__, but how you access those values depends on the interface. For example, in RStan they can be accessed through get_sampler_params.
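As an interface-independent sketch, the leapfrog counts can also be read from a CmdStan-style CSV output file, assuming the column is named n_leapfrog__ and that comment lines start with `#`. The helper name, the CSV excerpt, and the per-gradient timing below are made up for illustration.

```python
import csv
import io


def read_sampler_column(csv_text: str, column: str = "n_leapfrog__"):
    """Return one column of a Stan-style CSV as floats, skipping '#' comment lines."""
    data_lines = [
        line for line in csv_text.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    reader = csv.DictReader(io.StringIO("\n".join(data_lines)))
    return [float(row[column]) for row in reader]


# Tiny made-up excerpt in the shape of a CmdStan output file.
example_csv = """# model = hypothetical_irt_model
lp__,accept_stat__,stepsize__,treedepth__,n_leapfrog__,divergent__,energy__,theta
-7.1,0.98,0.25,3,7,0,8.2,0.4
# Adaptation terminated
-6.9,0.91,0.25,4,15,0,7.5,0.5
-7.4,0.87,0.25,3,7,0,9.0,0.3
"""

steps = read_sampler_column(example_csv)
total_gradient_evals = sum(steps)

# Hypothetical value, standing in for the "Gradient evaluation took ..." message.
seconds_per_gradient = 0.0001
estimated_seconds = total_gradient_evals * seconds_per_gradient
print(total_gradient_evals, estimated_seconds)
```

Comparing total leapfrog steps and per-gradient cost across the 1PL, 2PL and 3PL fits would separate "more work per gradient" from "more gradients per iteration".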
Oh, it appears I misunderstood the question - I thought you were looking into counting the number of “proposal rejected” warning messages. But it appears you wanted the number of transitions that were not accepted - in that regard @betanalpha’s answer is definitely the better one.