Specifying a density that is a product of normal densities

I would like to perform HMC inference for the following probability model:

\begin{align} \mathbf{y}_j &\sim \mathcal{N}(\mathbf{y}_j \mid \mathbf{0}, \mathbf{O}_j + K(\mathbf{X}, \mathbf{X}^{\top})) \\ \mathbf{x}_n &\stackrel{\text{iid}}{\sim} \mathcal{N}(\mathbf{x}_n \mid \mathbf{0}, \mathbf{I}_D) \\ p(\mathbf{X} \mid \mathbf{Y}, \mathbf{O}) &\propto \Bigg[\prod_{j=1}^{J} p(\mathbf{y}_j \mid \mathbf{O}_j, \mathbf{X})\Bigg] \Bigg[ \prod_{n=1}^{N} p(\mathbf{x}_n) \Bigg] \end{align}

where

  • \mathbf{y}_j \in \mathbb{R}^{N}
  • \mathbf{x}_n \in \mathbb{R}^{D}
  • \mathbf{O} \in \mathbb{R}^{J \times N} \implies \mathbf{O}_j \in \mathbb{R}^N
  • K: (\mathbf{X}, \mathbf{X}^{\top}) \mapsto \mathbb{R}^{N \times N} (some kernel)
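For concreteness, by "some kernel" I mean the exponentiated-quadratic kernel (this is what Stan's cov_exp_quad computes); a NumPy sketch, where alpha is the marginal standard deviation and rho the length-scale:

```python
import numpy as np

def exp_quad_kernel(X, alpha, rho):
    """Exponentiated-quadratic kernel, as in Stan's cov_exp_quad:
    k(x, x') = alpha^2 * exp(-||x - x'||^2 / (2 * rho^2))."""
    # pairwise squared distances between the rows of X (shape N x D)
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return alpha ** 2 * np.exp(-sq_dists / (2 * rho ** 2))
```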

In Python, I would write something like


from scipy.stats import multivariate_normal

def log_posterior(X):
    K = kernel(X)  # N x N kernel matrix K(X, X^T)
    Q = 0.0
    # prior: sum over the N log-density terms for the x_n
    for n in range(N):
        Q += multivariate_normal.logpdf(X[n], mu_x, Cov_x)
    # likelihood: sum over the J log-density terms for the y_j
    for j in range(J):
        Q += multivariate_normal.logpdf(Y[j], mu_y, O[j] + K)
    return Q

However, I want to try to optimize this (find \mathbf{X}^{\star}) using Stan, and I am having trouble. This is what I have tried so far:


data {
    int<lower=0> N;
    int<lower=0> J;
    int<lower=0> D;
    vector[N] mu_y;
    vector[N] Y[J];
    vector[N] O[J];
    real alpha;
    real rho;
    vector[D] mu_x;
    matrix[D, D] Cov_x;
}
parameters {
    vector[N] X[D];
}
transformed parameters {
    matrix[N, N] Cov[J];
    matrix[N, N] K = cov_exp_quad(X, X, alpha, rho);
    for (j in 1:J)
        Cov[j] = diag_matrix(O[j]) + K;
}
model {
    X ~ multi_normal(mu_x, Cov_x);
    for (j in 1:J)
        Y[j] ~ multi_normal(mu_y, Cov[j]);
}

I get various errors about dimensions being wrong, but more generally, I think my mental model of how to write Stan is still incomplete. I'm particularly confused about how to handle the indexing across J. Any help is appreciated.

At first sight, I would say that vector[N] X[D]; should be vector[D] X;


Thanks. How would Stan then know the prior on \mathbf{x}_n is p(\mathbf{X}) = \prod_{n=1}^{N} p(\mathbf{x}_n)? In other words, how do I handle the N terms?

Sorry, I don’t really have time to check on the indexing right now. I just wanted to point out that the target in Stan is actually the log probability. So you can use the target += multi_normal_lpdf(...) notation, which is closer to your Python code. Just a brief thought, maybe it helps with your ‘mental model’. Please just ignore if this isn’t helpful or this was obvious to you ;)
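In your case that might look something like this (just an untested sketch, and it assumes X is declared as an array of N vectors, vector[D] X[N]):

```stan
model {
    // prior: one multi_normal term per x_n
    for (n in 1:N)
        target += multi_normal_lpdf(X[n] | mu_x, Cov_x);
    // likelihood: one multi_normal term per y_j
    for (j in 1:J)
        target += multi_normal_lpdf(Y[j] | mu_y, Cov[j]);
}
```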


@aakhmetz, thanks for this suggestion. Using vector[D] X[N] fixed the dimensionality problem. Basically, I now have an array of N vectors, each of length D.
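To spell out the analogy in NumPy terms (my own illustration, not Stan code), vector[D] X[N] corresponds to a 2-D array of shape (N, D), indexed first by n:

```python
import numpy as np

N, D = 4, 3
X = np.zeros((N, D))  # analogue of Stan's `vector[D] X[N]`

# X[n] is one of N vectors, each of length D, matching the
# per-n prior terms multi_normal(mu_x, Cov_x) in the model.
print(len(X), X[0].shape)
```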

@Max_Mantei, thanks for suggesting target. I didn’t realize you could do this. For future users, this is the relevant documentation.
