Truncating a bivariate student t distribution in generated quantities

nickg1 · August 7, 2021, 3:12pm

I am analyzing the correlation between two variables using the robust correlation model described in Adrian Baez-Ortega’s 2018 blog post:

To assess the fit of the estimated correlation, random samples from the estimated bivariate t-distribution are simulated in generated quantities:

vector[2] x_rand;

x_rand = multi_student_t_rng(nu, mu, cov),

where nu, mu, and cov are parameters estimated from the model.

This produces a number of correlated pairs of variables x_rand[1] and x_rand[2].

I would like to be able to truncate both variables at lower and upper limits in order to eliminate values that may be known to be biologically or physically unlikely or impossible. For example, if x_rand[1] is the natural logarithm of an annual mortality rate of a mammal population, values greater than 0.2 (-1.6094 in natural log space) may be extremely unlikely, especially if the x_rand[2] is an environmental variable such as average annual precipitation. Similarly, average annual precipitation may have lower and upper limits during a period of record from which the original bivariate data (x) are obtained. For example, if x_rand[2] is an index of annual precipitation during a finite period of record that ranges from 0.2 to 2.0, it might be reasonable to confine x_rand within these limits.

I am familiar with truncating univariate distributions in the model section of stan. But I am not sure how to do so with bi-variate distribution random number generators. Assuming this is possible, can someone show me how to do this for x_rand in the example above?

Thanks in advance.

amas0 · August 7, 2021, 3:26pm

In lieu of an elegant solution, you could always just discard samples that violate your bounds. Assuming your parameters are not way off, samples should regularly land in your accepted bounds. Here’s some code that does this:

vector[2] x_rand;
x_rand = multi_student_t_rng(nu, mu, cov),
while !((x_rand[1] > lower_1) && 
        (x_rand[1] < upper_1) && 
        (x_rand[2] > lower_2) && 
        (x_rand[2] < upper_2)) {
  x_rand = multi_student_t_rng(nu, mu, cov)
}

Just keep sampling until the sample lies in your bounds. I’m sure there’s a more elegant/performant approach, but in my experience the rng sampling is not the long pole in the tent when it comes to sampling from a model.

nickg1 · August 8, 2021, 12:10am

Thank you. I will give your suggested approach a try.

Topic		Replies	Views
Is there a more efficient way to code this robust correlation approach Modeling rstan , techniques , specification	8	667	October 31, 2021
Multi_student_t_rng vs student_t_rng General	2	451	August 4, 2021
User specified multivariate model Modeling	1	431	March 25, 2018
Truncated model for neg_binomial_2 Modeling	20	1590	June 9, 2017
Estimating Robust Correlations for Multiple (>2) Outcomes Modeling rstan , techniques , specification	10	346	June 2, 2021

Truncating a bivariate student t distribution in generated quantities

Related Topics