Just for clarity - parameters can be “uncoupled” in the prior distribution (e.g. the non-centered parametrization decouples the prior for the parameters) but still be “coupled”/“correlated”/… in the posterior for a specific dataset (and vice versa). For HMC, the shape of the posterior is what matters (the closer to independent normals, the better). Often, uncoupling in the prior leads to nicer posteriors (especially in cases where you don’t have a ton of data), but that’s just a useful heuristic, not a fundamental property of reparametrization.
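To make the "uncoupled in the prior" part concrete, here is a minimal sketch in plain Python (not Stan). It assumes a toy hierarchical prior that I made up for illustration: `tau` is half-normal and `theta ~ normal(0, tau)`. It only looks at prior draws, so it says nothing about what the posterior will look like once data enter.

```python
import math
import random


def corr(xs, ys):
    """Pearson correlation of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


rng = random.Random(1)
n = 20000

# Hypothetical toy prior: tau ~ half-normal(0, 1), z ~ normal(0, 1)
tau = [abs(rng.gauss(0.0, 1.0)) for _ in range(n)]
z = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Centered parameter implied by the non-centered draws: theta = tau * z
theta = [t * zi for t, zi in zip(tau, z)]

# In the centered parametrization the spread of theta depends on tau ...
centered = corr(tau, [abs(t) for t in theta])
# ... while the raw parameter z is independent of tau in the prior.
noncentered = corr(tau, [abs(zi) for zi in z])

print("corr(tau, |theta|):", round(centered, 3))
print("corr(tau, |z|):    ", round(noncentered, 3))
```

The first correlation comes out clearly positive and the second close to zero. Whether the posterior for `z` and `tau` stays that well behaved depends on the data, which is exactly the caveat above.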
Unless you are willing to do the hardcore math thing, I think reparametrization is best explored through examples - we list some at How to Diagnose and Resolve Convergence Problems – Stan, specifically (beyond non-centered parametrization):
- Non-centered parametrization for the exponential distribution
- Stan users guide chapter on QR reparametrization for linear models
- Identifying non-identifiability - a sigmoid model shows an example of where the parameters are not well informed by data, while Difficulties with logistic population growth model - #3 by martinmodrak shows a potential reparametrization.
- Reparametrizing the Sigmoid Model of Gene Regulation shows problems and solutions in an ODE model.
- Multiple parametrizations of a sum-to-zero constraint.
Hope some of those are helpful.
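As a small illustration of the first item in the list, here is my own sketch (plain Python, not taken from the linked post) of the scaling identity behind a non-centered exponential: if `raw ~ exponential(1)`, then `raw / rate` has an `exponential(rate)` distribution. The names `rate`, `raw` and `scaled` are made up for this example.

```python
import random

rng = random.Random(7)
n = 50000
rate = 4.0  # hypothetical rate parameter, chosen only for the demo

# "Centered": draw directly from exponential(rate)
direct = [rng.expovariate(rate) for _ in range(n)]

# "Non-centered": draw a unit-rate variable and rescale it.
# In a model, the sampler would work with `raw`, whose prior
# no longer depends on `rate`.
raw = [rng.expovariate(1.0) for _ in range(n)]
scaled = [r / rate for r in raw]

mean_direct = sum(direct) / n
mean_scaled = sum(scaled) / n

# Both sample means should be close to the theoretical mean 1 / rate
print("direct:", round(mean_direct, 4), "scaled:", round(mean_scaled, 4))
```

The two sets of draws have the same distribution. The only thing that changes is which quantity is treated as the parameter.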
Unfortunately, my experience is that developing useful reparametrizations is quite hard and typically requires substantial mathematical insight into the model-data combination at hand. Reparametrizing more “empirically” (e.g. looking at a pairs plot and trying to guess changes that would decorrelate the pairs) has almost never worked for me.
UPDATE:
Just remembered another nice example of reparametrization: previously, Stan parametrized the simplex with a stick-breaking transform (see Constraint Transforms in version 2.36), but now it uses ILR (inverse softmax; see Constraint Transforms in the current version), which should work better. I don’t understand the problem very deeply, but my guess is that the advantage of the ILR is that it is both computationally more efficient and symmetric (insensitive to the order of the unconstrained parameters).
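To show what "sensitive to the order" can mean, here is a rough sketch in plain Python. Both functions are my own stand-ins: `stick_breaking` is a textbook-style version with an offset so that zeros map to the uniform simplex (meant to resemble, not reproduce, the older Stan transform), and `softmax_pinned` is an ordinary softmax with the last logit fixed at zero (it is not the ILR itself). The point is only that swapping two unconstrained inputs swaps the matching simplex components for the softmax, but not for stick-breaking.

```python
import math


def inv_logit(v):
    return 1.0 / (1.0 + math.exp(-v))


def stick_breaking(y):
    """Map K-1 unconstrained values to a point on the K-simplex."""
    K = len(y) + 1
    x = []
    remaining = 1.0
    for k, v in enumerate(y):
        # Fraction of the remaining stick to break off; the offset
        # makes y = 0 correspond to equal shares.
        frac = inv_logit(v - math.log(K - k - 1))
        x.append(remaining * frac)
        remaining -= x[-1]
    x.append(remaining)
    return x


def softmax_pinned(y):
    """Softmax over the inputs plus one extra logit fixed at 0."""
    logits = list(y) + [0.0]
    m = max(logits)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in logits]
    s = sum(e)
    return [v / s for v in e]


y = [1.0, -0.5]
y_swapped = [-0.5, 1.0]

sb, sb_swapped = stick_breaking(y), stick_breaking(y_swapped)
sm, sm_swapped = softmax_pinned(y), softmax_pinned(y_swapped)

print("stick-breaking:", sb, sb_swapped)
print("softmax:       ", sm, sm_swapped)
```

With the softmax, the first two components simply trade places. With stick-breaking, all three components change, because each input acts on whatever is left of the stick after the earlier ones.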