Adding Lambert W Transforms to Stan?

stevebronder · March 28, 2020, 5:34pm

A guy I know wrote a paper titled The Lambert Way to Gaussianize Heavy-Tailed Data with the Inverse of Tukey’s h Transformation as a Special Case and has an associated R package LambertW. The TL;DR is that the paper has a parameterization to remove skewness and left/right kurtosis from random variables that reminds me a lot of non-centered parametrizations.

The package has a good example of this

# Univariate example
library(LambertW)
set.seed(20)
y1 <- rcauchy(n = 100)
plot(density(y1))
out <- Gaussianize(y1, return.tau.mat = TRUE)
plot(density(out$input) # huh!
x1 <- get_input(y1, c(out$tau.mat[, 1]))  # same as out$input
test_normality(out$input) # Gaussianized a Cauchy!

Wanted to throw it out there in case anyone though it would be a useful tool. On the backend we would need to add boosts LambertW implementation and just coding over the stuff in the R package to stan math

bgoodri · March 28, 2020, 6:25pm

I think it is a good idea. Adding something from boost::math is pretty easy, although reimplementing the Gaussianize stuff is more ambitious.

stevebronder · March 28, 2020, 7:37pm

Yeah the boost part is easy peasy. tbh I’m not really sure how we would want to expose something like Gaussianize. We could just add a tukey_hh_lpdf distribution etc. On the other hand the framework in the paper supports a lot of distributions and we could do something with a functor + specializations

// Could call it noncenter(...)?
target += lambertW_transform(normal_lpdf, y, X, mu, sigma, alpha, rho_l, rho_r);

spinkney · January 3, 2021, 1:04pm

Do we want to do the specializations for the distributions like in the LambertW package that uses MLE? Or can we do the moment matching?

The issue with the former is that one must specify the original distribution. That’s not that big of a deal, we do it all the time, but then we need all the specializations in the functor format you suggest. The second way of estimating the distribution is by moment matching. In fact, TF does this with their gaussianize function. TF bases their code on the Characterizing Tukey h and hh-Distributions through L-Moments and the L-Correlation by Headrick and Pant. They numerically solve for the left and right parameters when the distribution is asymmetric. The TF code uses binary search, see -https://github.com/tensorflow/transform/blob/879f2345dcd6096104ae66027feacb099e228e66/tensorflow_transform/gaussianization.py.

What’s cool about that L-moments paper is we could write a tukey_symmetric_lpdf and tukey_skew_lpdf that Stan samples from on the normal 0,1 scale. Along with estimating the location and scale, the symmetric versions would take in h and the skew versions an h_l and h_r as parameters. What is really cool though - and something TF does not do - is that we could do a multi_tukey specification. Where each marginal density has their own skew and/or kurtosis and connected via a correlation matrix. See equation 4.1 in the paper that uses the choleksy factor of the correlation matrix values.

We wouldn’t need all the specializations but we would need to implement code similar to what TF does and then all the correlation stuff in the paper.

stevebronder · January 4, 2021, 5:37pm

I like the moment matching idea, maybe we could setup a prototype branch in stan math to sort out what this would look like? The big thing is being able to go back and forth between the corrected and uncorrected data

spinkney · January 11, 2021, 10:07pm

Realistically, it’s going to be some time before I can try this as there’s a ton at work for the new year. I’d probably write it in Stan before trying it on stan math but that’s because my comfort with Stan >> C++.

arya · January 24, 2021, 5:32pm

Didn’t know about Lambert transforms. That’s cool! Reminds me of rank inverse normal transforms which are use a lot in genetics.

Is this typically used to transform data beforehand or were you thinking of using this to transform parameters to make them easier to sample?

I checked out the volatility modeling example in the paper where they transform the returns. What are the pros and cons of transforming the data then fitting with a Gaussian observational model versus just fitting raw returns with, for example, an SGT distribution?

stevebronder · February 1, 2021, 7:55pm

Section 3.1 of the lambertw transform paper goes over this a bit. For going to a normal there’s not much of a difference really but with the general lambert framework you can use different underlying distributions so it opens it up to different modeling approaches

Topic		Replies	Views
Generalized Lambda distribution Modeling specification	4	426	April 14, 2023
Partial non-centered parametrizations in Stan Modeling techniques	7	2697	December 30, 2018
Statisticians: _propto vs _unnormalized (vs ?) General	22	1459	June 4, 2019
Possibly improving the performance of Mulivariantnormal Distribution Modeling	2	473	February 28, 2020
Non centered parameterization on variance parameter Modeling	31	8421	October 22, 2018

Adding Lambert W Transforms to Stan?

Related topics