Sum of log-normal distributions and Quadrature Integration

I am trying to model some data as:

Y = log(sum_{i=1}^k exp(X_i))
X_i ~ N(0, sigma)

I had been putting all of the X_i into the model as parameters, which works reasonably well, but it was rather slow. To speed things up, I tried the Schwartz-Yeh approximation, which approximates a sum of lognormals with another lognormal. However, finding the parameters of the approximation requires numerical integration, and people normally use quadrature, e.g.:

http://ieeexplore.ieee.org/document/467959/
and MATLAB code
http://www.snowelm.com/~t/doc/tips/20110902.en.html#sy

However, it seems strange to be doing the quadrature inside the HMC, and I haven’t been getting the speed-up I was expecting. The number of numerical integrations scales as O(log2(k)), but with 8 quadrature points per integration it’s not clear that much computation is saved.

Reading the manuals and forum, I’ve seen HMC described as an integration algorithm, so it seems like it might be possible to somehow replace the quadrature with new latent variables, but I’m not sure how to proceed.


Try defining the X_i values as an ordered parameter rather than real. This will stop the swapping of parameters between modes, which is probably what’s killing your performance. The issue is that your original model is not identified under exchange of the parameters, i.e., swapping x_1 and x_2 gives you the same likelihood.
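
Something like this minimal sketch (K and sigma are assumed to come in as data; the likelihood is omitted):

data {
  int<lower=1> K;          // number of summands
  real<lower=0> sigma;
}
parameters {
  ordered[K] X;            // X[1] <= ... <= X[K], so components can’t swap
}
model {
  X ~ normal(0, sigma);    // same iid normal prior, now on the sorted values
  // ... likelihood built from Y = log_sum_exp(X) goes here ...
}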

That makes sense about the parameters swapping around - but with the ordered constraint is it still the same model? This application seems somewhat different from the cutpoint regression introduced in the manual.

It’s still the same problem formulation. It just means that your variables are ordered, so the model is no longer unidentified: you can’t swap X_2 and X_1 in this case.

I’ve done something similar before in a number of cases and it has worked really well.

Give it a try and let me know how it goes…

Well, it’s not technically exactly the same model, but you can think of the ordering as drawing from the normal distribution and then sorting. So yes, it’ll be the same for all intents and purposes. We use this technique to identify mixtures, and now that you’re going down this route, you might want to read @betanalpha’s case study on mixtures (under doc on our web pages).

If you want to code this

Y = log(sum_{i=1}^k exp(X_i))
X_i ~ N(0, sigma)

you want to code this in Stan as

X ~ normal(0, sigma);    // vectorized prior over the whole vector X
...
Y = log_sum_exp(X);      // stable evaluation of log(sum(exp(X)))

That’ll keep the arithmetic stable and won’t lose precision.

I couldn’t follow what the integral was you were trying to do, but I’d stay away from approximations if at all possible.

Thanks! I started to look into it, and it seemed to take a while for the parameters to converge. All of the X_i are pretty far into the right tail of the distribution, and they all seemed to get smashed together, but not at a low enough value.

The Schwartz-Yeh approximation models
Z = log(e^X + e^Y)
where Z is treated as approximately normal with mean E[Z] and variance Var[Z] = E[Z^2] - E[Z]^2.

Although it is an approximation, I’m looking at sums of 100-500 variables, and by applying the integral recursively I only add O(log2(k)) computations to the model rather than O(k) latent variables. (Since the X_i are iid, every pair being merged at a given level has the same approximating distribution, so only one integration is needed per level, giving about ceil(log2(k)) integrations in total.)
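
Concretely, for one merge of independent X ~ N(mu_X, sigma_X^2) and Y ~ N(mu_Y, sigma_Y^2), here is a sketch of just the mean term (the variance needs similar integrals plus a cross term, since X and W are correlated):

$$
Z = \log(e^X + e^Y) = X + \log(1 + e^W), \qquad W = Y - X \sim N(\mu_Y - \mu_X,\ \sigma_X^2 + \sigma_Y^2)
$$

$$
E[Z] = \mu_X + \int_{-\infty}^{\infty} \log(1 + e^w)\,\phi(w; \mu_W, \sigma_W^2)\,dw \approx \mu_X + \frac{1}{\sqrt{\pi}} \sum_{i=1}^{8} \omega_i \log\!\left(1 + e^{\mu_W + \sqrt{2}\,\sigma_W t_i}\right)
$$

where t_i and omega_i are the 8-point Gauss-Hermite nodes and weights.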

Just a quick update: I was able to sidestep the problem by using a gamma rather than a log-normal distribution, and then handling the sum of log-transformed gamma random variables with the method of Marques, Coelho & Carvalho (2014).
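
In case it helps anyone else, here’s a minimal Stan sketch of the simplest special case only (not the full construction from the paper): if each summand is modeled as gamma(alpha, beta) with a shared rate beta, the sum is exactly gamma(K * alpha, beta), so Y is the log of a single gamma variate and no per-term latent variables are needed.

data {
  int<lower=1> K;             // number of summands
  real Y;                     // observed log of the sum
}
transformed data {
  real<lower=0> S = exp(Y);   // the sum itself; the Jacobian of this
                              // transform is constant in the parameters
}
parameters {
  real<lower=0> alpha;        // shape of each gamma summand (assumed shared)
  real<lower=0> beta;         // rate shared by all summands
}
model {
  alpha ~ lognormal(0, 1);    // placeholder priors
  beta ~ lognormal(0, 1);
  S ~ gamma(K * alpha, beta); // sum of K iid gamma(alpha, beta) variates
}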

Thanks for writing back with the solution (which I marked—at least I hope it was the solution to this problem).