Bivariate probit in Stan

Hi all –

I hadn’t seen any implementation of bivariate probit/IV (a la the Stata command biprobit) in Stan. So made one. Just thought I’d plonk it here as it’ll be more searchable when someone comes looking.

Jim

Stan code:

R code/test:

4 Likes

Just thought I’d plonk it here as it’ll be more searchable when someone comes looking.

Good stuff! Thanks for sharing

Thanks!

I alwasy get confused about how these things relate. There’s a chapter on multivariate probit regression in the manual, but it doesn’t need binomal CDFs. Ben posted a binormal CDF implementation that I mean to put into the manual.

real q1;
real q2;
real w1;
real w2;
real rho1;
// from Greene's econometrics book
    q1 = 2*Y[1] - 1.0;
    q2 = 2*Y[2] - 1.0;
    w1 = q1*mu1;
    w2 = q2*mu2;
    rho1 = q1*q2*rho;

You can use compound declare-define and vectorize rather than defining explicitly enumerated variables, as in:

vector[2] q = 2 * Y - 1;
vector[2] w = q .* mu;
real rho1 = q[1] * q[2] * rho;

Also, you don’t need the .0 on reals (the only exception is for division).

It’d also all be a bit more efficient if you vectorized the lpdf—youl could save some redundant calculations involving rho1 and also vectorize most of the rest, I think.

Thanks Bob – I’ll update this later in the week .

Binormal probit is a special case, as you don’t need to have a parameter for each observation. So this should give you the same estimates far more quickly. There aren’t many options once you move to higher dimensions, sadly.

Cool—makes sense given that the cdf can do the marginalization integral.

Thanks. I’ll be watching for updates.

If you are feeling ambitious, James, it would be beneficial to see a version with random effects as well.

For instance, imagine that your outcomes, Y1 and Y2, represent whether students passed their reading and math tests, respectively. And imagine that the students are clustered in schools. It would good to see a covariance matrix indicating the school-level variances of passing reading and math and the school-level covariance.

Hi Jeremy,

I did this (individual random effects) a while back but forgot to post it. Tested it against a standard individual fixed effects linear probability model and it gave qualitatively similar results.

functions {
    real binormal_cdf(real z1, real z2, real rho) {
    if (z1 != 0 || z2 != 0) {
      real denom = fabs(rho) < 1.0 ? sqrt((1 + rho) * (1 - rho)) : not_a_number();
      real a1 = (z2 / z1 - rho) / denom;
      real a2 = (z1 / z2 - rho) / denom;
      real product = z1 * z2;
      real delta = product < 0 || (product == 0 && (z1 + z2) < 0);
      return 0.5 * (Phi(z1) + Phi(z2) - delta) - owens_t(z1, a1) - owens_t(z2, a2);
    }
    return 0.25 + asin(rho) / (2 * pi());
  }
  real biprobit_lpdf(row_vector Y, real mu1, real mu2, real rho) {
    real q1;
    real q2;
    real w1;
    real w2;
    real rho1;
    
    // from Greene's econometrics book
    q1 = 2*Y[1] - 1.0;
    q2 = 2*Y[2] - 1.0;
    
    w1 = q1*mu1;
    w2 = q2*mu2;
    
    rho1 = q1*q2*rho;
    return log(binormal_cdf(w1, w2, rho1));
  }
}
data {
  int N;
  int P;
  int P2; // the number of instruments
  int I; // number of individuals
  int individual[N]; // index of individuals
  matrix[N,2] Y; // outcomes, binary
  matrix[N, P] X; // first column is a column of ones
  matrix[N, P2] Z; // matrix of instruments
}
transformed data {
  matrix[N, P+P2] X1;
  matrix[N, P + 1] X2;
  X1 = append_col(X, Z);
  X2 = append_col(Y[:,1], X);
}
parameters {
  vector[P + P2] beta;
  vector[P + 1] gamma;
  real<lower = -1, upper = 1> rho;
  real<lower = 0> sigma_1;
  real<lower = 0> sigma_2;
  matrix[I, 2] mu;
}
model {
  // priors
  beta ~ normal(0, 1);
  gamma ~ normal(0, 1);
  
  mu[:,1] ~ normal(0, sigma_1);
  mu[:,2] ~ normal(0, sigma_2);
  sigma_1 ~ normal(0, 1);
  sigma_2 ~ normal(0, 1);
  // likelihood
  for(i in 1:N) {
    Y[i] ~ biprobit(mu[individual[i], 1] + X1[i] * beta, mu[individual[i], 2] + X2[i] * gamma, rho);
  }
}
2 Likes

Very cool. By chance, did you use simulated data when testing the model? Although I think I can infer the data structure from the data block, it would be ideal to walk through your modeling step by step. Thanks.

Will try to write a post up on it before long and put a link here.

1 Like

There is a lot of potential use among ecologists for bivariate probits. (And multivariate probit responses, too, for that matter.) Particularly with correlated random effects across the outcome equations. The Stan forum has relatively few examples that go into greater depth than the Stan manual. Your post, James, is probably the most useful so far.

Thanks, that’s very encouraging :-)

1 Like

The lack of depth in the manual is intentional. I decided to stick to the simplest possible examples of things for the manual.

The case studies are where you want to go for more complex models, but we don’t have anything on multivariate probits (I actually wrote the initial multivariate probit in the manual at the urging of Michael McCarthy, a stat ecologists in Melbourne—but it’s hard to fit).

We do accept user submissions for case studies if you (or Jim) want to put something together.

1 Like

@James_Savage to what extent can you envision a version in which the correlation varies as a function of covariates in the model?

P.S. We’re implementing your function in an upcoming analysis. Thanks again.