Bivariate ordinal (ordered) probit

Ole_Petter_Hansen · May 14, 2018, 7:54pm

biorprob.stan (2.6 KB)
biorprob.r (1008 Bytes)

Hello everyone,

I’ve long wanted to estimate a bivariate, ordinal probit with Stan (or any other sensible program, really…), but haven’t been able to due to lack of the bivariate normal CDF. But @Bob_Carpenter pointed to Owens_T function being implemented, and therefore it is possible to calculate the biv. CDF. Nifty! :-)

I have used Bob’s code, and written a Stan-program for a bivariate probit. The likelihood is defined in section 10.2 in this book by Greene and Henscher.

The model recovers simulated parameters and correlation coefficients nicely - and runs surprisingly fast! I was honestly expecting more numerical bugs.

Appreciate any comments you might have for improving the simple model. Lots of If’s in defining the likelihood.

But otherwise a great thanks to the Stan-devs for making adding new and exciting stuff!

Ole-Petter

functions{
  real binormal_cdf(real z1, real z2, real rho) {
  if (z1 != 0 || z2 != 0) {
    real denom = fabs(rho) < 1.0 ? sqrt((1 + rho) * (1 - rho)) : not_a_number();
    real a1 = (z2 / z1 - rho) / denom;
    real a2 = (z1 / z2 - rho) / denom;
    real product = z1 * z2;
    real delta = product < 0 || (product == 0 && (z1 + z2) < 0);
    return 0.5 * (Phi(z1) + Phi(z2) - delta) - owens_t(z1, a1) - owens_t(z2, a2);
}
return 0.25 + asin(rho) / (2 * pi());
}


real bi_ord_ll(real mu1, real mu2, vector c1, vector c2, 
  int y1, int y2, int d1, int d2,real rho) {
real tmp;
  if(y1>1 && y2>1 && y1<d1 && y2<d2){
    tmp = binormal_cdf(c1[y1  ]-mu1,c2[y2  ]-mu2,rho)-
	      binormal_cdf(c1[y1  ]-mu1,c2[y2-1]-mu2,rho)-
          binormal_cdf(c1[y1-1]-mu1,c2[y2  ]-mu2,rho)+
		  binormal_cdf(c1[y1-1]-mu1,c2[y2-1]-mu2,rho);
  }
  if(y1==1 && y2>1 && y2<d2){
    tmp = binormal_cdf(c1[y1  ]-mu1,c2[y2  ]-mu2,rho)-
	      binormal_cdf(c1[y1  ]-mu1,c2[y2-1]-mu2,rho);
  }
  if(y2==1 && y1>1 && y1<d1){
    tmp = binormal_cdf(c1[y1  ]-mu1,c2[y2  ]-mu2,rho)-
	      binormal_cdf(c1[y1-1]-mu1,c2[y2  ]-mu2,rho);
  }
  if(y1==1 && y2==1){
    tmp = binormal_cdf(c1[y1  ]-mu1,c2[y2  ]-mu2,rho);
  }
  if(y2>1 && y2<d2 && y1==d1 ){
    tmp = Phi(c2[y2]-mu2)-
	      Phi(c2[y2-1]-mu2)-
          binormal_cdf(c1[y1-1]-mu1,c2[y2  ]-mu2,rho)+
		  binormal_cdf(c1[y1-1]-mu1,c2[y2-1]-mu2,rho);
  }
  if(y1>1 && y1<d1 && y2==d2 ){
    tmp = Phi(c1[y1]-mu1)-
	      Phi(c1[y1-1]-mu1)-
          binormal_cdf(c2[y2-1]-mu2,c1[y1  ]-mu1,rho)+
		  binormal_cdf(c1[y1-1]-mu1,c2[y2-1]-mu2,rho);
  }
  if(y1==d1 && y2==d2){
    tmp = 1-
	      Phi(c2[y2-1]-mu2)-
		  Phi(c1[y1-1]-mu1)+
		  binormal_cdf(c1[y1-1]-mu1,c2[y2-1]-mu2,rho);
  }
  if(y1==1 && y2==d2){
    tmp = Phi(c1[y1  ]-mu1)-
	      binormal_cdf(c1[y1  ]-mu1,c2[y2-1]-mu2,rho);
  }
  if(y2==1 && y1==d1){
    tmp = Phi(c2[y2  ]-mu2)-
	      binormal_cdf(c2[y2  ]-mu2,c1[y1-1]-mu1,rho);
  }
return log(tmp);
}
}

data{
int N;               // Number of obs .
vector[N] x1;        // Cov. outcome. 1
vector[N] x2;        // Cov. outcome. 2
int<lower=1> Y1[N];  // Observed response 1
int<lower=1> Y2[N];  // Observed response 1
}

transformed data{
int d1;    // Number of poss. responses outcome. 1
int d2;    // Number of poss. responses outcome. 1
d1 = max(Y1);
d2 = max(Y2);
}

parameters{
real<lower=-1,upper=1> rho;
real beta1;
real beta2;
ordered[d1-1] c1;
ordered[d2-1] c2;
}

model{
for(i in 1:N){
target += bi_ord_ll(beta1*x1[i],beta2*x2[i],c1,c2,Y1[i],Y2[i],d1,d2,rho);
}
}

harrelfe · May 20, 2018, 4:29pm

I see from the book that is a bivariate modeling approach that only works for probit. It would be nice to use a more general approach that would also work for the proportional odds ordinal logistic model, using copulas.

The R brms package makes fitting univariate ordinal models so easy that I’m hoping the package will add multivariate copula capabilities for such problems.

Ole_Petter_Hansen · May 21, 2018, 3:58pm

Yes, need the bi/multivariate CDF to estimate this, and think the multivariate normal CDF is hard enough. Don’t know much about copulas in Stan.

Why would you want an logistic model instead, though?

harrelfe · May 21, 2018, 5:05pm

The effects in the logit link (proportional odds model) have ready interpretations as odds ratios, and the proportional odds model is a special case of the much-liked Wilcoxon test. If not using latent variables you can evaluate the copula directly, without using any complex CDFs.

Topic		Replies	Views
Bivariate probit in Stan Modeling ecology	12	2715	October 8, 2023
How to speed up my bivariate hierarchical normal model? Modeling performance	14	1471	August 30, 2017
Estimating correlation matrix for ordinal outcomes Modeling	2	1034	September 29, 2017
Work-arounds Bivariate Student T CDFs Modeling	6	498	February 10, 2020
Joint Modeling of Mixed Outcome Types Modeling techniques , specification	6	1120	July 5, 2022

Bivariate ordinal (ordered) probit

Related topics