I implemented a minimal version of ADVI in TensorFlow (I’m aware of Edward but I needed something a bit more low-level):
http://www.jmlr.org/papers/volume18/16-107/16-107.pdf
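To frame the question, the objective I'm maximizing is (I believe, assuming the minimal version is the mean-field one from that paper) the ELBO over a diagonal Gaussian in the unconstrained space,

$$
\mathcal{L}(\mu, \sigma) \;=\; \mathbb{E}_{\zeta \sim \mathcal{N}(\mu,\, \mathrm{diag}(\sigma^2))}\!\left[\log p\big(y, T^{-1}(\zeta)\big) + \log\big|\det J_{T^{-1}}(\zeta)\big|\right] \;+\; \sum_k \log \sigma_k \;+\; \text{const},
$$

where T maps the constrained parameters to the reals, estimated with reparameterized Monte Carlo gradients.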
The implementation passes basic tests, but I wanted to check it on a more involved model by comparing my inference results against Stan's ADVI and NUTS results.
I considered the infinite factor model (IFM) of Bhattacharya and Dunson:
http://people.ee.duke.edu/~lcarin/BhattacharyaDunson.pdf
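For reference, my reading of the generative model in that paper, in the notation of the code below (theta_h plays the role of the paper's tau_h, with nu = 3), is:

$$
\begin{aligned}
y_n &\sim \mathcal{N}_P\big(0,\; \Lambda\Lambda^\top + \Sigma\big), \qquad \Sigma = \mathrm{diag}(\sigma_1^2, \dots, \sigma_P^2),\\
\lambda_{ph} \mid \phi_{ph}, \theta_h &\sim \mathcal{N}\big(0,\; \phi_{ph}^{-1}\theta_h^{-1}\big) \quad \text{(so } \phi_{ph}\theta_h \text{ is a precision)},\\
\phi_{ph} &\sim \mathrm{Gamma}(\nu/2,\; \nu/2), \qquad \theta_h = \prod_{l=1}^{h} \delta_l,\\
\delta_1 &\sim \mathrm{Gamma}(a_1, 1), \qquad \delta_h \sim \mathrm{Gamma}(a_2, 1) \;\; (h \ge 2), \qquad \sigma_p^{-2} \sim \mathrm{Gamma}(a_\sigma, b_\sigma).
\end{aligned}
$$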
The results from Stan are off: for instance, the covariance matrix omega is not recovered in a 10-dimensional case (I generate the data myself). Since I am a complete beginner in Stan programming, I blame this entirely on some implementation mistake on my part. Further evidence for this is that Stan's ADVI and NUTS results show some level of agreement with each other.
Here is my implementation:
data {
  int<lower=0> N;               // number of observations
  int<lower=0> P;               // observed dimension
  int<lower=0> H;               // truncation level (number of factors)
  vector[P] y[N];
}
parameters {
  real<lower=0> a1;             // hyperparameters of the multiplicative gamma process
  real<lower=0> a2;
  vector<lower=0>[H] delta;     // gamma increments
  matrix<lower=0>[P, H] phi;    // local precisions of the loadings
  matrix[P, H] lam;             // factor loadings
  vector<lower=0>[P] s2i;       // idiosyncratic precisions
}
transformed parameters {
  vector[H] theta;              // cumulative products of delta (the paper's tau)
  vector[P] s2;                 // idiosyncratic variances
  matrix[P, P] omega;
  theta[1] = delta[1];
  for (h in 2:H)
    theta[h] = theta[h-1] * delta[h];
  s2 = 1.0 ./ s2i;
  omega = diag_matrix(s2) + lam * lam';   // marginal covariance of y
}
model {
  vector[P] zero;               // local variable, filled by hand below
  for (p in 1:P)
    zero[p] = 0.0;
  a1 ~ gamma(2.0, 1.0);
  a2 ~ gamma(2.0, 1.0);
  delta[1] ~ gamma(a1, 1.0);
  for (h in 2:H)
    delta[h] ~ gamma(a2, 1.0);
  for (p in 1:P)
    for (h in 1:H)
      phi[p, h] ~ gamma(3.0 / 2.0, 3.0 / 2.0);   // nu/2 with nu = 3
  for (p in 1:P)
    for (h in 1:H)
      lam[p, h] ~ normal(0.0, 1.0 / phi[p, h] / theta[h]);   // shrinkage prior on the loadings
  for (p in 1:P)
    s2i[p] ~ gamma(1.0, 0.3);
  for (n in 1:N)
    y[n] ~ multi_normal(zero, omega);
}
Are there any obvious issues, aside from the missed vectorization opportunities in the distribution statements?
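For reference, the vectorized version I have in mind is roughly the following (untested sketch; same distributions and scales, just grouped):

model {
  a1 ~ gamma(2.0, 1.0);
  a2 ~ gamma(2.0, 1.0);
  delta[1] ~ gamma(a1, 1.0);
  delta[2:H] ~ gamma(a2, 1.0);          // vectorized over h = 2..H
  to_vector(phi) ~ gamma(1.5, 1.5);     // vectorized over all P*H entries
  for (h in 1:H)
    col(lam, h) ~ normal(0.0, inv(col(phi, h) * theta[h]));  // one column at a time
  s2i ~ gamma(1.0, 0.3);
  y ~ multi_normal(rep_vector(0.0, P), omega);  // also avoids the hand-built zero vector
}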
I'm also aware that the two-level hierarchical structure might not help inference, but I consistently recover omega with my own ADVI implementation, and the two algorithms should be very close, so I must be doing something silly here…
As an aside, is it expected for zero to be initialized to nan?