Simulating from Multivariate Probit

Hey all, I’ve been using @bgoodri’s parameterisation of the MV Probit (code below, also available here). I’m now working with missing response data (entirely missing vectors) and I was wondering if anyone had advice on how to turn Ben’s formulation into simulation code in Stan (as it doesn’t have any explicit lpdf calls, which I’m more used to). Thanks in advance!!!

data {
  int<lower=1> K;
  int<lower=1> D;
  int<lower=0> N;
  int<lower=0,upper=1> y[N,D];
  vector[K] x[N];
}
parameters {
  matrix[D,K] beta;
  cholesky_factor_corr[D] L_Omega;
  real<lower=0,upper=1> u[N,D]; // nuisance that absorbs inequality constraints
}
model {
  L_Omega ~ lkj_corr_cholesky(4);
  to_vector(beta) ~ normal(0, 5);
  // implicit: u is iid standard uniform a priori
  { // likelihood
    for (n in 1:N) {
      vector[D] mu;
      vector[D] z;
      real prev;
      mu = beta * x[n];
      prev = 0;
      for (d in 1:D) { // Phi and inv_Phi may overflow and / or be numerically inaccurate
        real bound; // threshold at which utility = 0
        bound = Phi( -(mu[d] + prev) / L_Omega[d,d]  );
        if (y[n,d] == 1) {
          real t;
          t = bound + (1 - bound) * u[n,d];
          z[d] = inv_Phi(t);       // implies utility is positive
          target += log1m(bound);  // Jacobian adjustment
        }
        else {
          real t;
          t = bound * u[n,d];
          z[d] = inv_Phi(t);     // implies utility is negative
          target += log(bound);  // Jacobian adjustment
        }
        if (d < D) prev = L_Omega[d+1,1:d] * head(z, d);
        // Jacobian adjustments imply z is truncated standard normal
        // thus utility --- mu + L_Omega * z --- is truncated multivariate normal
      }
    }
  }
}
generated quantities {
  corr_matrix[D] Omega;
  Omega = multiply_lower_tri_self_transpose(L_Omega);  
}

@bgoodri any reference on this? Is this a standard sorta reparameterization trick?

I think it’s a way of efficiently sampling from truncated MVNs but I’m not smart enough to follow why! See Ben’s documents in this thread: https://groups.google.com/g/stan-users/c/GuWUJogum1o/m/LvxjlUBnBwAJ

Oh well it’s gonna be hard for me to top this. It looks like it’s doing an inverse CDF trick.
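To make the inverse CDF trick concrete in the 1-D case (my own illustration, not code from Ben’s documents): to draw z ~ N(0, 1) truncated to z > b, take u ~ Uniform(0, 1), rescale it onto (Phi(b), 1) exactly as the `t = bound + (1 - bound) * u` line in the Stan code does, and push it back through the inverse CDF. A sketch in Python using scipy:

```python
import numpy as np
from scipy.stats import norm

def truncated_std_normal(lower, size, rng):
    """Draw standard normals truncated to z > lower via the inverse CDF trick."""
    u = rng.uniform(size=size)          # u ~ Uniform(0, 1), like the nuisance u in the Stan code
    p_lower = norm.cdf(lower)           # probability mass below the truncation point
    t = p_lower + (1.0 - p_lower) * u   # maps u onto (Phi(lower), 1)
    return norm.ppf(t)                  # inverse CDF: every draw lands above `lower`

rng = np.random.default_rng(1)
z = truncated_std_normal(2.0, 10_000, rng)
print(z.min())  # strictly greater than the truncation point 2.0
```

The `y == 0` branch of the Stan code is the mirror image: `t = bound * u` maps u onto (0, Phi(bound)), so `inv_Phi(t)` lands below the threshold instead.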

Is there some particular point of confusion?

Edit: Not that it’s all clear, just we should probably work off of that PDF :D

Hahah thanks Ben. I might be overcomplicating it - I think at its core I can see how to wrap Ben’s code into an lpmf function, but not into an rng function in Stan.

Oooh, okay so

I think @bgoodri’s code there is an rng function. So you’re using Stan to generate from a truncated multivariate normal by sampling a constrained vector z(u) with Stan itself.

This as opposed to rejection sampling.
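For contrast, here’s what a naive rejection sampler for the same 1-D truncated normal looks like (again just my sketch): keep redrawing until the constraint holds. It’s fine near the mode, but the acceptance rate collapses when the truncation point is far into the tail, which is one reason the inverse CDF approach is attractive:

```python
import numpy as np

def rejection_truncated_std_normal(lower, size, rng):
    """Naively draw z ~ N(0,1) subject to z > lower by discarding violations."""
    out = []
    tries = 0
    while len(out) < size:
        z = rng.standard_normal()
        tries += 1
        if z > lower:
            out.append(z)
    return np.array(out), tries

rng = np.random.default_rng(1)
draws, tries = rejection_truncated_std_normal(2.0, 100, rng)
# acceptance probability is 1 - Phi(2) ~= 0.023, so expect roughly 4,400 tries for 100 draws
print(tries)
```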

So like you can sample from a uniform(0, 1) in R, or you can write a Stan program for it:

parameters {
  real<lower = 0.0, upper = 1.0> u;
}

In the first case you use some built-in C libraries, in the second you use MCMC. It looks like this function is generating draws from a truncated multivariate normal using MCMC.

I assume it’s not easy to write down the combined effect of what is going on here in closed form, and so that is why we’re using MCMC to do the sampling. Does that make any sense? It’s kind of a weird thing.
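One more thought: if the goal is only to simulate response vectors y forward from the model (e.g. for entirely missing rows), you don’t need the truncation machinery at all — outside Stan you can draw the untruncated latent utilities directly and threshold them. A sketch in NumPy (variable names mirror the Stan program in the first post, but this is my reading of the model, not code from the thread; the parameter values are made up):

```python
import numpy as np

def simulate_mv_probit(beta, L_Omega, x, rng):
    """Simulate binary response vectors from a multivariate probit model.

    beta    : (D, K) coefficient matrix
    L_Omega : (D, D) Cholesky factor of the correlation matrix Omega
    x       : (N, K) covariates
    """
    mu = x @ beta.T                                        # (N, D) linear predictors
    z = rng.standard_normal((x.shape[0], beta.shape[0]))   # iid standard normals
    utility = mu + z @ L_Omega.T                           # correlated latent utilities
    return (utility > 0).astype(int)                       # y[n, d] = 1 iff utility positive

rng = np.random.default_rng(7)
beta = np.array([[0.5, -1.0], [1.0, 0.3]])                 # D = 2, K = 2 (made-up values)
L = np.linalg.cholesky(np.array([[1.0, 0.6], [0.6, 1.0]]))
x = rng.standard_normal((1000, 2))
y = simulate_mv_probit(beta, L, x, rng)
print(y.shape)  # (1000, 2)
```

The Jacobian-adjusted u/z construction in the Stan program is what you need when y is *observed* and the utilities must respect the resulting sign constraints; for forward simulation the untruncated draw above suffices.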

Hey Ben, thanks for your replies and apologies for the delay getting back to you. That makes sense (in a complicated way!) and I can see how, once you can generate the u’s and z’s, you know the 1s, but I’m confused about what happens with the if/else branches if we can’t generate 1s and 0s within the code. I can’t specify that in Stan (can’t impute binary parameters). Full disclosure: I’m running a model where the sum of the imputed 1s is Poisson distributed, so I need to get integer values within Stan if possible.

Which 1s and 0s? (Edit: I see them in the code in the first post, but I wasn’t finding them in the Goodrich code, though I’m probably just looking in the wrong place – anyway, I want to be clear what these are for.)

Can you write out the model you want to work with if you had all the log densities and stuff you wanted?

The way I understand it, this code can be used to put a truncated multivariate normal prior on a vector of coefficients.

So effectively something like:

beta ~ truncated_multivariate_normal(...);

We can then do some sort of logistic regression with this:

y ~ bernoulli_logit(X * beta);

Hey @bbbales2, apologies I missed this reply. It makes a lot more sense now that I’ve had 3 months to ruminate! I think what I was struggling with was not using an rng to explicitly draw values, but it’s clicked now!! Thanks so much for taking the time to reply, really helpful comments.

Glad it worked out!