Issues with inference in infinite factor model

I implemented a minimal version of ADVI in TensorFlow (I’m aware of Edward but I needed something a bit more low-level):

http://www.jmlr.org/papers/volume18/16-107/16-107.pdf

The implementation passes basic tests, but I wanted to check it on a more involved model by comparing my inference results with those from Stan's ADVI and NUTS.

I considered the infinite factor model (IFM) of Bhattacharya and Dunson:

http://people.ee.duke.edu/~lcarin/BhattacharyaDunson.pdf
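
For context, here is a minimal NumPy sketch of the generative process (my own illustration, not the original simulation script; the dimensions, seed, and fixed hyperparameter values are arbitrary choices matching the priors used below). Note that NumPy's gamma is parametrized by shape and scale, whereas Stan's gamma is shape and rate:

```python
import numpy as np

rng = np.random.default_rng(0)
N, P, H = 500, 10, 5
a1, a2 = 2.0, 3.0  # fixed here for simulation; the Stan model puts priors on them

# Multiplicative gamma process: theta_h is a cumulative product of gamma
# draws, so the precisions theta_h stochastically increase with h,
# shrinking higher-numbered factor loadings toward zero.
delta = np.concatenate([rng.gamma(a1, 1.0, 1), rng.gamma(a2, 1.0, H - 1)])
theta = np.cumprod(delta)

# Local precisions phi and loadings lam[p, h] ~ N(0, 1 / (phi[p, h] * theta[h]))
phi = rng.gamma(1.5, 1.0 / 1.5, size=(P, H))
lam = rng.normal(0.0, 1.0 / np.sqrt(phi * theta), size=(P, H))

# Idiosyncratic variances (gamma prior on the precisions) and implied
# marginal covariance omega
s2 = 1.0 / rng.gamma(1.0, 1.0 / 0.3, size=P)
omega = lam @ lam.T + np.diag(s2)

# Data: y_n ~ N(0, omega)
y = rng.multivariate_normal(np.zeros(P), omega, size=N)
```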

The results from Stan are off: for instance, the covariance matrix omega is not recovered in a 10-dimensional case (I generate the data myself). As I am a complete beginner in Stan programming, I blame this entirely on an implementation mistake on my part. Further evidence for this is that Stan's ADVI and NUTS results show some level of agreement with each other.

Here is my implementation:

data {
    int<lower=0> N;
    int<lower=0> P;
    int<lower=0> H;
    vector[P]    y[N];
}
parameters {
    real<lower=0>         a1;
    real<lower=0>         a2;
    vector<lower=0>[H]    delta;
    matrix<lower=0>[P, H] phi;
    matrix[P, H]          lam;
    vector<lower=0>[P]    s2i;
}
transformed parameters {
    vector[H]    theta;
    vector[P]    s2;
    matrix[P, P] omega;

    theta[1] = delta[1];
    for (h in 2:H)
        theta[h] = theta[h-1] * delta[h];

    s2 = 1.0 ./ s2i;

    omega = diag_matrix(s2) + lam * lam';
}
model {
    vector[P] zero;
    for (p in 1:P)
        zero[p] = 0.0;

    a1 ~ gamma(2.0, 1.0);

    a2 ~ gamma(2.0, 1.0);

    delta[1] ~ gamma(a1, 1.0);
    for (h in 2:H)
        delta[h] ~ gamma(a2, 1.0);

    for (p in 1:P)
        for (h in 1:H)
            phi[p, h] ~ gamma(3.0 / 2.0, 3.0 / 2.0);

    for (p in 1:P)
        for (h in 1:H)
            // note: normal() takes a scale (standard deviation), not a variance
            lam[p, h] ~ normal(0.0, 1.0 / sqrt(phi[p, h] * theta[h]));

    for (p in 1:P)
        s2i[p] ~ gamma(1.0, 0.3);

    for (n in 1:N)
        y[n] ~ multi_normal(zero, omega);
}

Are there any obvious issues, aside from missed vectorization opportunities in the distribution statements?

I’m also aware that the two-level hierarchical structure might not help inference, but I consistently recover omega with my own ADVI implementation, and the algorithms should be very close, so I must be doing something silly here…

As an aside, is it expected for zero to be initialized to NaN?

It turns out there was no mistake in the scripts above (well, aside from my overlooking that normal() is parametrized by the scale, not the variance, in Stan).
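
Since this parametrization is easy to get wrong, here is a quick numerical illustration (mine, not from the thread) of the scale-versus-variance point:

```python
import numpy as np

rng = np.random.default_rng(0)
prec = 4.0  # stands in for phi[p, h] * theta[h], the prior precision of lam[p, h]

# Passing 1/prec as the second argument treats it as a standard
# deviation, so the realized variance is (1/prec)^2, not 1/prec.
wrong = rng.normal(0.0, 1.0 / prec, size=200_000)

# The intended variance 1/prec requires scale 1/sqrt(prec).
right = rng.normal(0.0, 1.0 / np.sqrt(prec), size=200_000)

print(wrong.var())   # ~ (1/4)^2 = 0.0625
print(right.var())   # ~ 1/4 = 0.25
```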
The disagreement was caused by a bug in the post-processing of the CSV files generated by Stan for plotting purposes.
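
For anyone debugging something similar, a minimal sketch of the kind of post-processing involved. The toy CSV below and the `omega.1.1`-style column naming are assumptions on my part (roughly CmdStan-style output, with matrix entries flattened as name.row.col); check the actual header of your files:

```python
import csv, io
import numpy as np

# Toy stand-in for a Stan sample file: comment lines start with '#'.
fake_csv = """# stan_version = ...
lp__,omega.1.1,omega.2.1,omega.1.2,omega.2.2
-10.0,1.0,0.1,0.1,2.0
-11.0,1.2,0.3,0.3,1.8
"""

rows = [r for r in csv.reader(io.StringIO(fake_csv)) if not r[0].startswith("#")]
header, draws = rows[0], np.asarray(rows[1:], dtype=float)

# Posterior mean of each omega entry, keyed by its (row, col) indices.
P = 2
omega_mean = np.zeros((P, P))
for j, name in enumerate(header):
    if name.startswith("omega."):
        _, i, k = name.split(".")
        omega_mean[int(i) - 1, int(k) - 1] = draws[:, j].mean()

print(omega_mean)
```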

Case closed.

Thanks for replying with the resolution.

There are other inefficiencies than just missed vectorizations, but it sounds like you don’t care about efficiency.

I am actually going to implement another, not too dissimilar model soon, so it would be helpful if you could point out the other efficiency issues in the code above that I am unaware of.
Thank you in advance!