Multivariate skew-t distribution

Hi. I am trying to incorporate non-gaussian returns in asset-allocation models. In regards to that, i need to make draws from a multivariate skew t-distribution. It is based on Sahu, Dey & Branco (2003) and used in Jondau & Rockinger (2012), and I have attached an image. I have the priors of the parameters, but the problem is how to draw from the multivariate skew-t distribution shown in equation 4 on the picture. I don’t think that it’s implemented in Stan.

Can one of you explain to me how this can be implemented?

1 Like

There is no multivariate skew-t distribution in the Stan Math library, but if equation (4) is supposed to represent a PDF, then just work out its logarithm analytically and implement that as a user-defined function in the Stan language.

1 Like

Thank you very much. Can you elaborate a bit more on what you mean by: “then just work out its logarithm analytically”?

Stan uses the logarithm of PDFs and PMFs for numerical and algorithmic reasons. Equation (4) appears to be a PDF that is the product of conditionally independent observations at time t. The logarithm of a product is the sum of the logarithms, the logarithm of a ratio is the difference between the logarithm of the numerator and the logarithm of the denominator, and the logarithm of a power is equal to the exponent multiplied by the logarithm of the base. Also, the logarithm of the Gamma function is denoted lgamma in Stan and lots of other computer languages. With those rules, you can work out what is the logarithm of the right-hand side of (4) and implement it as a user-defined function in the Stan language in a numerically stable way.

2 Likes

Thanks bgoordi, that is a great solution. I would like to check the function which I have written, is there a way to generate a random sample from the specified function for some parameter of nu and xi, which I then can plot in R to check, whether the distribution has the desired shape?

I have tried writting the function as:

functions {
//Multivariate skewed t-distribution (Sahu, Dey & Branco, 2003)
real MskT_lpdf(real z, real nu, real xi){
real a = (tgamma((nu-1)/2)sqrt(nu-2))/(sqrt(pi())tgamma(nu/2))(xi-1/xi);
real b = sqrt(pow(xi,2)+1/pow(xi,2)-1-pow(a,2));
real k = (-a/b<= z ? (b
z+a)xi : (bz+a)/xi);
return log(22/(xi+1/xi)) + lgamma((nu+1)/2)-(log(sqrt(pi()(nu-2)))+lgamma(nu/2))-((nu+1)/2)*log(1+(pow(k,2)/(nu-2)));
}
}

data{
real nu;
real xi;
}

model{
real m~MskT_lpdf(nu = nu, xi = xi);
}

I.e. how can i get the stan program when called for R, to return a random sample drawn from the MskT distribution?

That function seems to be univariate but perhaps it is a valid special case. In any event, you need to call expose_stan_functions and verify that the antilog of MskT_lpdf integrates to 1. See
http://mc-stan.org/users/documentation/case-studies/gpareto_functions.html

1 Like

After 2 korrektion it did indeed integrate to one - that is a nice check - but is there some way, that I in a similar manner can draw e.g. 1000 observations from the distribution and plot them as a histogram to check that it has the sought shape? Either directly in stan or indirectly in R?

Some ideas here as well: https://stackoverflow.com/questions/43860236/multivariate-skew-normal-in-stan

You would need to write another function in the Stan language (or R) to draw from a multivariate skewed t distribution.

Thank you very much. I am now trying to convert it to the multivariate case - the one stated by Klaus. However, I cannot get it to loop correctly - how should I declare the parameter vectors (nu and xi) and the and the random variabel, z, such that it matches with the number of random variables in the vector z?

functions {
//Multivariate skewed t-distribution (Sahu, Dey & Branco, 2003)
real MskT_lpdf(vector z, vector nu, vector xi){
vector uni_pdf;
for (i in 1:rows(z){
real a = (tgamma((nu[i]-1)/2)sqrt(nu[i]-2))/(sqrt(pi())tgamma(nu[i]/2))(xi[i]-1/xi);
real b = sqrt(pow(xi[i],2)+1/pow(xi[i],2)-1-pow(a,2));
real k = (-a/b >= z[i] ? (b
z+a)xi[i] : (bz+a)/xi[i]);
uni_pdf[i] = log(2b/(xi[i]+1/xi[i])) + lgamma((nu[i]+1)/2)-(log(sqrt(pi()(nu[i]-2)))+lgamma(nu[i]/2))-((nu[i]+1)/2)*log(1+(pow(k,2)/(nu[i]-2)));
}
return sum(uni_pdf)
}
}

I apologize for the many questions and hope You can help :)

Yes, just define a parameter and draw from it:

functions {
  real foo_lpdf(...) { ... }
}
data {
  int<lower = 0> K;
  ...
}
parameters {
  vector[K] mu;
}
model {
  mu ~ foo(...);
}

That’s not as efficient as writing an independent random number generator, but it will test that your function is defined properly.

1 Like

Thank you so much - it works perfectly! Awesome program and awesome support!

Now I am trying to run the multivariate case:
//This code represents our model written in stan (C++)

functions {
//Multivariate skewed t-distribution (Sahu, Dey & Branco, 2003)
real MskT_lpdf(vector z, vector nu, vector xi){
int N = rows(nu);
vector[K] l;
for(n in 1:K){
real a = (tgamma((nu[n]-1)/2)sqrt(nu[n]-2))/(sqrt(pi())tgamma(nu[n]/2))(xi[n]-1/xi[n]); // To ensure zero mean
real b = sqrt(pow(xi[n],2)+1/pow(xi[n],2)-1-pow(a,2)); // To ensure unit variance
real k = (-a/b >= z[n] ? (b
z[n]+a)xi[n] : (bz[n]+a)/xi[n]);
l[n] = log(2b/(xi[n]+1/xi[n])) + lgamma((nu[n]+1)/2)-(log(sqrt(pi()(nu[n]-2)))+lgamma(nu[n]/2))-((nu[n]+1)/2)*log(1+(pow(k,2)/(nu[n]-2)));
}
return sum(l);
}
}

However, when I try running it from R by checking whether it integrates to 1, using:
rstan::expose_stan_functions(“C:/Users/rva92/Dropbox/Mine ting/Uni/10. Semester (Finance master thesis)/4 - Kodning/Stan functions/Multivariate skewed t distribution.stan”)

skw_t <- function(z, nu, xi, K) {
exp(sapply(z, FUN = MskT_lpdf, nu = nu, xi = xi))
}

xi = c(4, 2, 0.5, 3)
nu = c(4, 5, 3, 4)
K = as.numeric(length(nu))
integrate(skw_t, lower = -Inf, upper = Inf, nu = nu, xi = xi, K = K)

I get the following message:
Error in FUN(X[[i]], …) :
[]: accessing element out of range. index 2 out of range; expecting index to be between 1 and 1; index position = 1z

Any idea on, where it went wrong?

Multivariate t is only appropriate in very limited situations. That is, where you have reason to believe that the degrees of freedom parameter is the same for every asset, both on the marginal distribution and on the joint distribution. In the classical world, a grouped T distribution was proposed as a solution, but it seems like copula modelling has become more important.

You might consider instead a factor structure, such as CAPM or APT. Assuming the factors are uncorrelated, then you just need to fit a univariate t distribution to the factors. The joint distribution may have some non-normal distribution that way, even if the returns, conditional on the factor, are assumed to be normal.

Another issue with multivariate t is that it is a little more tricky to set up a time-varying version. You’re sort of forced to choose whether the multivariate dependence being t is more important than the time-varying volatility. Another reason to prefer a factor structure of some kind. A little more flexibility.