Work-arounds Bivariate Student T CDFs

Aditya_Kaushik · February 7, 2020, 5:34pm

My understanding is that bivariate Student T CDFs aren’t available for Stan. Does anyone have any custom functions they have built that can handle this?

mike-lawrence · February 7, 2020, 5:59pm

Are you sure you’re not looking for multi_student_t()?

mike-lawrence · February 7, 2020, 6:02pm

Oops, sorry, see now you’re asking for the CDF

jjramsey · February 7, 2020, 6:03pm

May I ask why you require the CDF rather than the PDF?

Aditya_Kaushik · February 7, 2020, 6:10pm

Well I want to calculate the probability that my t-distributed variable X,Y with nu < 10 df is less than 0 with means Mu and correlation matrix Sigma. Both Mu and Sigma are latent variables which I’m interested in estimating. I have access to a trove of data that tells me (among other things) whether X,Y lies in Quadrants 1, 2, 3, or 4 (ie X>0,Y>0; X>0,Y<0, etc.)

If I had access to the joint CDFs, I could get probability estimates for the probability that X,Y lies in quadrant j, with j = 1,2,3,4. Then I could say my data is multinomially distributed with the probabilities I got from the CDFs. I believe this would give me more accurate estimates of my latent variables.

jjramsey · February 7, 2020, 6:22pm

I can see that you might want to do that in a post-processing step after you’ve done MCMC, but I’m not sure how that would be relevant to sampling a model in Stan. What is your mathematical model?

ETA: I suspect that there may be an XY problem here. You’ve asked about Y, and I want to find out X.

Aditya_Kaushik · February 10, 2020, 5:15pm

Sorry for my late response, busy weekend, and you’re probably right that there is an XY problem here. I’ll be more forthright –

My code is below. My ultimate goal is to get better estimates of the means for better prediction.

data {

// X1 can take on 5 different types and X2 can also take on 5 different types
//. I am able to observe 20 observations of each type
int X1_indices[100];  
int X2_indices[100]; 

// the interaction between the X1 and X2 type yield an observable real valued result 
vector[100] real_result;

// I am also able to observe an interger valued result that comes about
// (X1,X2) will lie in a quadrant and I'm able to observe which quadrant it lines in 

// which quadrant depends on whether or not X1 and X2 minus som constant are above 0 or below 0. 
// Ie the 4 permutations (above,above); (above,below); (below,above); (below,below)
// these constants in some sense give the quantile value for x1 and x2 and help refine my posterior
int quandrant_result[100];

int constant_1[100];
int constant_2[100];
                      

 }

parameters{
   vector[100] mu_1; // mean of X_1
   vector[100] mu_2; // mean of X_2

    vector<lower = 0>[100] sigma_1; // sd of x_1
    vector<lower = 0>[100] sigma_2; // sd of x_2

    vector<lower = 3>[100] x1_nu; // df of x_1 

   }

 model{

    vector[100] mu;
    vector[100] sigma;
    vector[100] nu;

    int X1_index; 
    int X2_index;

    int p1[100]; // the probability (x1,x2) lines in quadrant 1
    int p2[100]; // the probability (x1,x2) lines in quadrant 2
    int p3[100]; // the probability (x1,x2) lines in quadrant 3
    int p4[100]; // the probability (x1,x2) lines in quadrant 4

    int probabilities[100];
  for( i in 1:100){
          X1_index = X1_indices[i];
          X2_index = X2_indices[i];
  
          mu[i] = mu_1[X1_index] - mu_2[X2_index];
          sigma[i] = sqrt(square(sigma_1[X1_index]) + square(sigma_2[X2_index]));
          //
          nu[i] = x1_nu[X1_index];
  
  
          // Here is where I would use the bivariate t distribution  
          p1[i] = //Probability that mu_1[i] +mu_2[i] + constant_1 > 0  & mu_1 - mu_2 +                 constant_2 >0; )
          p2[i] = // probability that mu_1[i] + mu_2 + constant_1 > 0 & mu_1 - mu_2 + constant_2 <0; 
          p3[i] = // probability that mu_1[i] + mu_2 + constant_1 < 0 & mu_1 - mu_2 + constant_2 >0; 
          p4[i] = // probability that mu_1 + mu_2 + constant_1 < 0 & mu_1 - mu_2 + constant_2 <0; 
  
}


probabilities[1] = sum(p1)/100;
probabilities[2] = sum(p2)/100;
probabilities[3] = sum(p3)/100;
probabilities[4] = sum(p4)/100;

normalizing_constant = 1/( probabilities[1] + probabilities[2] + probabilities[3] + probabilities[4]);

probabilities[1] = probabilities[1] * normalizing_constant;
probabilities[2] = probabilities[2] * normalizing_constant;
probabilities[3] = probabilities[3] * normalizing_constant;
probabilities[4] = probabilities[4] * normalizing_constant;

real_result ~ student_t(nu,mu,sigma);

quandrant_result ~ multinomial(probabilities);

}

Topic		Replies	Views
Status of CDF for the Multivariate Normal Distribution? General	2	1357	August 24, 2017
Multivariate normal CDF General multivariate-normal	4	1880	April 7, 2021
Sampling from this banana shaped distribution Modeling	10	4410	March 27, 2018
Multivariate Skewed Model for Simple Dataset Modeling	6	906	April 8, 2019
Modelling a multivariate skewed t-distribution Modeling	3	822	March 19, 2018

Work-arounds Bivariate Student T CDFs

Related topics