Potential bug with (cmd)stan(py)'s optimisation process running a probabilistic PCA

adi_r · March 15, 2023, 12:45am

I would like to understand the results I get when running a probabilistic PCA with Stan on some unusual simulated data.

The data: a handful of MNIST digit images (ten in the code below), each rotated 25 times.

The model:

\mathbf{Y}|\mathbf{X}, \sigma^2 \sim \mathcal{MN}(\mathbf{0}, \mathbf{X}\mathbf{X}^T + \sigma^2 \mathbf{I}, \mathbf{I})

(This is matrix normal notation: every column of \mathbf{Y} is drawn independently from a zero-mean multivariate normal whose covariance is a dot product covariance function applied to the latents.)

The model famously corresponds to probabilistic PCA (the maximum likelihood solution for \mathbf{X} | \mathbf{Y} is attained at the scaled eigenvectors of the sample covariance).
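For reference, here is a minimal NumPy sketch of that closed-form solution, assuming the parameterisation above (each column of \mathbf{Y} is an n-vector, so the relevant sample covariance is n by n). The function name `ppca_ml_latents` is hypothetical, and the sketch ignores the bounds on `sigma_sq` used in the Stan model further down.

```python
import numpy as np

def ppca_ml_latents(Y, q):
    """Closed-form ML latents for columns of Y ~ N(0, X X^T + sigma^2 I).

    Y has shape (n, d); returns X of shape (n, q) and the ML noise variance.
    The solution is only defined up to a rotation of the columns of X.
    """
    n, d = Y.shape
    S = Y @ Y.T / d                          # (n, n) sample covariance over the d columns
    evals, evecs = np.linalg.eigh(S)         # eigh returns eigenvalues in ascending order
    evals, evecs = evals[::-1], evecs[:, ::-1]
    sigma_sq = evals[q:].mean()              # mean of the n - q discarded eigenvalues
    scale = np.sqrt(np.maximum(evals[:q] - sigma_sq, 0.0))
    X = evecs[:, :q] * scale                 # scaled leading eigenvectors
    return X, sigma_sq
```

Any solution returned by an optimiser should match this `X` up to an orthogonal transformation of its columns.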

The problem: If I use unconstrained latents, I get a different solution from the one I get with constrained latents. This is very surprising. Is it caused by numerical instabilities? If so, why?

If I set the line (in my parameters block below) as follows:

    matrix<lower=-100, upper=100>[n, q] X;  // latents

I obtain the latents:

These latents look incorrect.

If instead the line is changed to:

    matrix[n, q] X;  // latents

the result becomes:

which is absolutely fine: it’s a rotation of the eigenvectors, and since the posterior is invariant to rotation, this makes sense.
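To check "equal up to a rotation" numerically rather than by eye, one option is an orthogonal Procrustes alignment. This is a sketch, not something from my original script: `align_by_rotation` is a hypothetical helper, and `X_ref` stands for whatever reference latents you compare against (for example the scaled eigenvectors).

```python
import numpy as np

def align_by_rotation(X, X_ref):
    """Find the orthogonal R minimising ||X R - X_ref||_F and return (X R, R)."""
    U, _, Vt = np.linalg.svd(X.T @ X_ref)    # SVD of the (q, q) cross-product
    R = U @ Vt                               # orthogonal Procrustes solution
    return X @ R, R

# Small demonstration: rotating the latents leaves X X^T unchanged.
rng = np.random.default_rng(0)
X_ref = rng.normal(size=(5, 2))
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
X_rot = X_ref @ Q
same_covariance = np.allclose(X_rot @ X_rot.T, X_ref @ X_ref.T)
X_aligned, R = align_by_rotation(X_rot, X_ref)
```

If the aligned latents are still far from the reference, the two solutions differ by more than a rotation.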

Stan places an improper uniform prior on the constrained parameters, right? So I don’t understand this behaviour.

What’s more, optimizing the same constrained-parameter model through BridgeStan with scipy’s optimize works and gives the correct solution (not the weird axis-aligned one). Could someone shed some light on what’s going on?

I’m using cmdstan 2.31.0.

My code is as follows:


import torch
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from cmdstanpy import CmdStanModel
from torchvision.transforms.functional import rotate
from tensorflow.keras.datasets.mnist import load_data

plt.ion(); plt.style.use('seaborn-pastel')
np.random.seed(42)

with open('dr.stan', 'w') as f: f.write("""
data {
    int n;  // num data
    int d;  // num data dims
    int q;  // num latents
    matrix[n, d] Y;  // data
}
transformed data {
    vector[d] w = rep_vector(1.0, d);
}
parameters {
    matrix<lower=-100, upper=100>[n, q] X;  // latents
    real<lower=1e-6, upper=1> sigma_sq;
}
model {
    Y' ~ multi_gp(add_diag(X * X', sigma_sq), w);
}
""")

def plot(X):
    plot_df = pd.DataFrame(dict(x=X[:, 0], y=X[:, 1], hue=c.astype(str)))
    plot_df = plot_df.set_index('hue').sort_index().reset_index()
    sns.scatterplot(data=plot_df, x='x', y='y', hue='hue', palette='Paired')

if __name__ == '__main__':

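    (_, _), (Y, labels) = load_data()

    # Keep the first test image of each digit class. The labels must stay separate
    # from `c`; otherwise `c == lb` just selects the first ten images.
    c = np.arange(10)
    Y = np.concatenate([Y[[np.where(labels == lb)[0][0]], ...] for lb in c], axis=0)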
    Ys = []
    for Yi in Y:
        for angle in np.linspace(0, 350, 25):
            Ys.append(rotate(torch.tensor(Yi)[None, ...], angle))
    Y = torch.cat(Ys, axis=0).numpy().reshape(-1, 28**2)/255
    c = np.repeat(c, 25)

    Y += np.random.normal(0, 0.05, size=Y.shape)
    Y -= Y.mean(axis=0)
    Y -= Y.mean(axis=1)[..., None]

    (n, d), q = Y.shape, 2

    pd.Series(dict(
        Y=Y, n=n, d=d, q=2,
    )).to_json('data.json')

    model = CmdStanModel(stan_file='dr.stan')
    fit = model.optimize(data='data.json', show_console=True, iter=int(1e5), seed=42)

    X = fit.stan_variable('X')
    plot(X)

adi_r · March 26, 2023, 9:03pm

CmdStan currently doesn’t automatically add the Jacobians of the transforms. There is a GitHub issue on this.
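To illustrate what the Jacobian term does, here is a toy one-dimensional sketch. It is not the PPCA model and not a diagnosis of the plots above: the target (`MU`, `SD`) is made up, and only the bounds match the constrained `X`. It uses the scaled logistic map that Stan applies to a parameter with lower and upper bounds, and compares the maximiser of the density with and without the log-Jacobian added on the unconstrained scale.

```python
import math

LB, UB = -100.0, 100.0   # same bounds as the constrained X in the model
MU, SD = 60.0, 40.0      # hypothetical toy target, chosen so the effect is visible

def constrain(u):
    """Map an unconstrained u to (LB, UB) with a scaled logistic transform."""
    return LB + (UB - LB) / (1.0 + math.exp(-u))

def log_density(x):
    """Unnormalised Gaussian log density on the constrained scale."""
    return -0.5 * ((x - MU) / SD) ** 2

def log_jacobian(u):
    """log |dx/du| for the transform in constrain()."""
    s = 1.0 / (1.0 + math.exp(-u))
    return math.log(UB - LB) + math.log(s) + math.log(1.0 - s)

def argmax_x(include_jacobian):
    """Grid search over u in [-10, 10]; returns the maximiser mapped back to x."""
    grid = [i / 1000.0 for i in range(-10000, 10001)]

    def objective(u):
        value = log_density(constrain(u))
        if include_jacobian:
            value += log_jacobian(u)
        return value

    return constrain(max(grid, key=objective))

x_without = argmax_x(include_jacobian=False)  # maximiser on the constrained scale
x_with = argmax_x(include_jacobian=True)      # pulled toward the middle of the interval
```

Without the Jacobian the maximiser sits at the mode of the target on the constrained scale; with it, the maximiser moves toward the centre of the interval, so the two optimisation problems are genuinely different.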
