Testing user defined probability function implementations in Stan generated quantities

This post is inspired by Stan Slack discussion about implementing user defined probability functions and related to my old case study on implementing generalized Pareto distribution. In that case study, I implemented tests to check the correctness of the probability function implementations in R. Now that we have integrate_1d in Stan, we can make several tests also in generated quantities block.

The idea is that we can test the user defined probability density function implementations quickly and without simulating data and running MCMC (which has additional complication of stochasticity). The example below is a quick implementation, but should provide a base for a more refined testing framework depending on your needs.

The example tests

  1. Integral of user defined pdf (exp(lpdf)) over the whole range should be 1 (assuming the normalization term is included)
  2. Compares integral of user defined pdf up to x and user defined cdf at x
  3. Integral of user defined pdf up to median value (e.g. from Wikipedia) should be 0.5
  4. Integral of user defined pdf times x should be the mean value (e.g. from Wikipedia)

It is possible to come up with alternative or additional tests, depending on which reference values are easily available. User defined lcdf and lccdf can be tested as 2. User defined rng can also be tested, but need to take into account the randomness (e.g. by computing mean of a test quantity given a large number of draws per iteration and allowing a bigger error, but additional statistical analysis may be needed outside of Stan).

In the example, I implemented Kumaraswamy distribution as described on Wikipedia. This was also the distribution in the Stan Slack discussion.

The Stan code which has only the functions and generated quantities blocks:

functions {
  // user defined lpdf
  real kumaraswamy_lpdf(real x, real a, real b) {
    if (x < 0 || x > 1) // strictly 0 and 1 should not be includeded, but floating points...
      reject("x = ", x, ", but must be in (0,1)");
    if (a <= 0)
      reject("a = ", a, ", but must be positive");
    if (b <= 0)
      reject("b = ", b, ", but must be positive");
    return log(a) + log(b) + (a-1)*log(x) + (b-1)*log1m(x^a);
  }
  // user defined cdf
  real kumaraswamy_cdf(real x, real a, real b) {
    if (x < 0  || x > 1) // strictly 0 and 1 should not be includeded, but floating points...
      reject("x = ", x, ", but must be in (0,1)");
    if (a <= 0)
      reject("a = ", a, ", but must be positive");
    if (b <= 0)
      reject("b = ", b, ", but must be positive");
    return 1-(1-x^a)^b;
  }
  // helper functions for testing
  real integrand_pdf(real x, real xc, array[] real theta,
               array[] real Xi, array[] int yi) {
    real a = theta[1];
    real b = theta[2];
    return exp(kumaraswamy_lpdf(x | a, b));
  }
  real integrand_mean(real x, real xc, array[] real theta,
               array[] real Xi, array[] int yi) {
    real a = theta[1];
    real b = theta[2];
    return x*exp(kumaraswamy_lpdf(x | a, b));
  }
}
generated quantities {
  // random parameter values
  real a = uniform_rng(1,20);
  real b = uniform_rng(1,20);
  // check that integral of pdf from 0 to 1 is 1
  real test1 = integrate_1d(integrand_pdf, 0, 1, {a, b}, {0}, {0})-1;
  // check that integral of pdf up to 0.5 is same as cdf at 0.5
  real test2 = integrate_1d(integrand_pdf, 0, 0.5, {a, b}, {0}, {0})-kumaraswamy_cdf(0.5 | a, b);
  // check that integral of pdf up to median from Wikipedia is 0.5
  real test3 = integrate_1d(integrand_pdf, 0, (1-2^(-1/b))^(1/a), {a, b}, {0}, {0})-0.5;
  // check that integral of x*p is same as mean from Wikipedia
  real test4 = integrate_1d(integrand_mean, 0, 1, {a, b}, {0}, {0})-b*tgamma(1+1/a)*tgamma(b)/tgamma(1+1/a+b);
}

The same Stan code can be used with all different interfaces. Here’s how I run it with cmdstanr

library(cmdstanr)
kum = cmdstan_model(stan_file = "kumaraswamy.stan")
kums = kum$sample(fixed_param=TRUE, iter_sampling=10, chains=1)
as_draws_matrix(kums$draws())

The output shows that with varying parameter values all test error are small, and these differences can be explained by the floating point computation accuracy and tolerance of integrate_1d function.

    variable
draw    a    b    test1    test2    test3    test4
  1  14.1  4.7  4.4e-16 -2.6e-16  2.2e-16  1.0e-15
  2   6.6  8.7 -1.1e-16  2.8e-16  1.1e-16  3.3e-16
  3  14.3  9.0  2.2e-16  2.5e-16  1.1e-16 -5.6e-16
  4   1.4  4.3  0.0e+00  0.0e+00  0.0e+00 -1.7e-16
  5  12.1 10.0 -4.4e-16  4.1e-16 -7.8e-16  1.0e-15
  6   2.9  4.2  0.0e+00 -1.1e-16  0.0e+00  3.3e-16
  7  13.3  7.3  4.4e-16 -5.1e-17 -1.7e-16 -7.8e-16
  8  11.9  9.5  0.0e+00 -5.6e-16 -1.1e-16 -1.8e-15
  9   9.8 12.9  4.4e-16  2.8e-16  1.1e-16  1.0e-15
  10 17.1 10.5  4.4e-16 -5.6e-16 -2.2e-16  1.2e-15
14 Likes