Do we want to do the specializations for the distributions like in the LambertW package, which uses MLE? Or can we do moment matching?
The issue with the former is that one must specify the original distribution. That's not that big of a deal, we do it all the time, but then we need all the specializations in the functor format you suggest. The second way of estimating the distribution is by moment matching. In fact, TF does this with their `gaussianize` function. TF bases their code on the paper *Characterizing Tukey h and hh-Distributions through L-Moments and the L-Correlation* by Headrick and Pant. They numerically solve for the left and right tail parameters when the distribution is asymmetric. The TF code uses binary search, see https://github.com/tensorflow/transform/blob/879f2345dcd6096104ae66027feacb099e228e66/tensorflow_transform/gaussianization.py.
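To make the binary-search idea concrete, here is a minimal sketch of estimating the symmetric `h` by bisection. Caveat: TF matches L-moments from the Headrick and Pant paper; this stand-in matches a scale-free tail-quantile ratio instead, since the Tukey h quantile function has a simple closed form and is monotone in `h`. All function names here are hypothetical.

```python
import math
from statistics import NormalDist

def tukey_h_quantile(p: float, h: float) -> float:
    """Quantile of the standard (location 0, scale 1) Tukey h distribution:
    q(p) = z_p * exp(h * z_p**2 / 2), with z_p the standard normal quantile."""
    z = NormalDist().inv_cdf(p)
    return z * math.exp(h * z * z / 2.0)

def estimate_h(sample_ratio: float, tol: float = 1e-10) -> float:
    """Bisect on h in [0, 1] so that the 0.99/0.75 quantile ratio of the
    Tukey h distribution matches the sample's ratio. The ratio is
    scale-free and strictly increasing in h, so bisection is safe."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        ratio = tukey_h_quantile(0.99, mid) / tukey_h_quantile(0.75, mid)
        if ratio < sample_ratio:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The asymmetric case works the same way, solving for `h_l` and `h_r` separately from left- and right-tail statistics.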
What’s cool about that L-moments paper is we could write a `tukey_symmetric_lpdf` and a `tukey_skew_lpdf` that Stan samples from on the normal(0, 1) scale. Along with estimating the location and scale, the symmetric version would take in `h`, and the skew version an `h_l` and `h_r`, as parameters. What is really cool though, and something TF does not do, is that we could do a `multi_tukey`
specification, where each marginal density has its own skew and/or kurtosis and the margins are connected via a correlation matrix. See equation 4.1 in the paper, which uses the Cholesky factor of the correlation matrix.
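To sketch what the symmetric lpdf would compute (in Python rather than Stan, with hypothetical names): for `h > 0` the transform is `x = z * exp(h * z^2 / 2)` with `z ~ normal(0, 1)`, so by change of variables `lpdf(x) = log phi(z) - log(dx/dz)` with `dx/dz = exp(h z^2 / 2) * (1 + h z^2)`, and `z` is recovered through the Lambert W function via `z^2 = W(h x^2) / h`.

```python
import math

def lambert_w(a: float, iters: int = 50) -> float:
    """Principal-branch Lambert W for a >= 0 via Newton's method."""
    w = math.log1p(a)  # reasonable starting guess for a >= 0
    for _ in range(iters):
        ew = math.exp(w)
        w -= (w * ew - a) / (ew * (w + 1.0))
    return w

def tukey_symmetric_lpdf(x: float, h: float) -> float:
    """Log density of the standard symmetric Tukey h distribution, h > 0."""
    z2 = lambert_w(h * x * x) / h                  # z^2 = W(h x^2) / h
    log_phi = -0.5 * z2 - 0.5 * math.log(2.0 * math.pi)
    log_jac = h * z2 / 2.0 + math.log1p(h * z2)    # log(dx/dz)
    return log_phi - log_jac
```

A Stan version would look much the same, with location and scale handled by standardizing `x` first and subtracting `log(sigma)`.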
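And a toy 2-d sketch of the `multi_tukey` construction (a sampler rather than an lpdf, hypothetical names): draw correlated standard normals via the Cholesky factor of the correlation matrix, then push each margin through its own h transform.

```python
import math
import random

def sample_multi_tukey(rho: float, h: list, n: int, seed: int = 1):
    """Draw n bivariate Tukey h pairs; margin i has kurtosis parameter h[i].
    The dependence is induced on the latent normal(0, 1) scale."""
    rng = random.Random(seed)
    # Cholesky factor of the 2x2 correlation matrix [[1, rho], [rho, 1]]
    l21, l22 = rho, math.sqrt(1.0 - rho * rho)
    draws = []
    for _ in range(n):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z = (e1, l21 * e1 + l22 * e2)  # correlated standard normals
        draws.append(tuple(zi * math.exp(h[i] * zi * zi / 2.0)
                           for i, zi in enumerate(z)))
    return draws
```

The lpdf counterpart would invert each margin back to `z` (as in the symmetric lpdf), evaluate a `multi_normal_cholesky` on the latent scale, and add the per-margin Jacobian terms.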
We wouldn’t need all the specializations, but we would need to implement code similar to what TF does, plus all the correlation machinery from the paper.