Truncating a bivariate student t distribution in generated quantities

I am analyzing the correlation between two variables using the robust correlation model described in Adrian Baez-Ortega’s 2018 blog post:

To assess the fit of the estimated correlation, random samples from the estimated bivariate t-distribution are simulated in generated quantities:

vector[2] x_rand;

x_rand = multi_student_t_rng(nu, mu, cov),

where nu, mu, and cov are parameters estimated from the model.

This produces a number of correlated pairs of variables x_rand[1] and x_rand[2].

I would like to be able to truncate both variables at lower and upper limits in order to eliminate values that may be known to be biologically or physically unlikely or impossible. For example, if x_rand[1] is the natural logarithm of an annual mortality rate of a mammal population, values greater than 0.2 (-1.6094 in natural log space) may be extremely unlikely, especially if the x_rand[2] is an environmental variable such as average annual precipitation. Similarly, average annual precipitation may have lower and upper limits during a period of record from which the original bivariate data (x) are obtained. For example, if x_rand[2] is an index of annual precipitation during a finite period of record that ranges from 0.2 to 2.0, it might be reasonable to confine x_rand within these limits.

I am familiar with truncating univariate distributions in the model section of stan. But I am not sure how to do so with bi-variate distribution random number generators. Assuming this is possible, can someone show me how to do this for x_rand in the example above?

Thanks in advance.

In lieu of an elegant solution, you could always just discard samples that violate your bounds. Assuming your parameters are not way off, samples should regularly land in your accepted bounds. Here’s some code that does this:

vector[2] x_rand;
x_rand = multi_student_t_rng(nu, mu, cov),
while !((x_rand[1] > lower_1) && 
        (x_rand[1] < upper_1) && 
        (x_rand[2] > lower_2) && 
        (x_rand[2] < upper_2)) {
  x_rand = multi_student_t_rng(nu, mu, cov)
}

Just keep sampling until the sample lies in your bounds. I’m sure there’s a more elegant/performant approach, but in my experience the rng sampling is not the long pole in the tent when it comes to sampling from a model.

Thank you. I will give your suggested approach a try.