I read that chapter. Take the beta distribution example: what it did was convert
theta[n] ~ beta(alpha, beta);
into
theta ~ beta(lambda * phi, lambda * (1 - phi));
whereas alpha and beta only have lower bounds, phi has an added upper bound. That is the only place where I can see the reparameterization improving on the original. I am not sure how much difference that really makes, but if it did help, it seems it would be due to the added upper bound. However, you are saying it does not have to do with bounding, so I don't know why the reparameterized version should perform better, or how to apply the principle to other examples.
The model also declared that phi and lambda came from beta and Pareto distributions, whereas alpha and beta came from undeclared distributions. If it is the distributions that matter, then the example is not making that point, since it hides the distributions of alpha and beta, and I have no intuition for why the choice of prior distribution can make a huge difference in computational speed.
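For reference, here is my reading of the reparameterized model, written out as a minimal Stan sketch. The array size, the prior constants (beta(1, 1) on phi, pareto(0.1, 1.5) on lambda), and the 0.1 lower bound on lambda are my guesses at what the chapter intended, not quotes from it:

```stan
parameters {
  vector<lower=0, upper=1>[10] theta;  // hypothetical size; theta is the hierarchical quantity
  real<lower=0, upper=1> phi;          // phi = alpha / (alpha + beta), the mean
  real<lower=0.1> lambda;              // lambda = alpha + beta, the concentration
}
transformed parameters {
  real<lower=0> alpha = lambda * phi;       // recover the original parameters
  real<lower=0> beta = lambda * (1 - phi);
}
model {
  phi ~ beta(1, 1);             // guessed constants: uniform prior on the mean
  lambda ~ pareto(0.1, 1.5);    // guessed constants for the Pareto prior
  theta ~ beta(alpha, beta);    // same sampling statement as the original version
}
```

Written this way, both the upper bound on phi and the declared priors live in the reparameterized version at once, which is exactly why I can't tell which of the two is doing the work.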