RNG for truncated distributions

If the PRNG uses the inverse CDF method of drawing randomly from a standard uniform and then transforming that by the inverse CDF of the desired distribution, then it is often relatively straightforward to draw uniformally from some subset of the [0,1] interval and transform that by the inverse CDF of the distribution in order to obtain a truncated draw.

3 Likes

But I guess Stan does not support that right now - is it correct?

There is no syntax for it in the Stan language, but you can draw from a uniform distribution and there are some inverse CDFs implemented.

For concreteness, I’ll give the example I worked on this morning. I want to draw from a left-truncated Weibull to impute right-censored event times. Let’s say that I want a Weibull(alpha, sigma) truncated so that my random draws are never below some lower boundary t.

Then I should:
(1) Figure out the quantile corresponding to lower cutoff t. Let’s say it’s 0.2, or the 20th percentile. Call that p.
(2) Draw from a continuous uniform(0.2, 1). Call that draw u.
(3) Apply the inverse transform for the cdf, or “percent point” function. I did not feel like doing algebra on a Monday morning so I found mine at https://www.wolframalpha.com/input/?i=inverse+cdf+weibull. This will return a draw from the correct truncated weibull.

(The unstated step 4 is to post on the Stan discourse to check my logic. So far the draws look good when I plot them, but I have not checked it much beyond that.)

For a double truncation, you would add a step to calculate the cdf at the upper truncation point and draw from a uniform(p1, p2) instead of uniform(p, 1).

functions {
  real tweibull_rng(real alpha, real sigma, real t) {
    real p;
    real u;
    real x;
    p = weibull_cdf(t, alpha, sigma);
    u = uniform_rng(p, 1);
    x = sigma * (-log1m(u))^(1/alpha);
    return x;
  }
}

9 Likes

Thanks, that’s exactly what I need to do and with a lot of detail.

This is very nice. I’m going to add this example to the manual after a bit of renaming and collapsing:

real weibull_lb_rng(real alpha, real sigma, real lb) {
    real p = weibull_cdf(lb, alpha, sigma);  // cdf for bounds
    real u = uniform_rng(p, 1);
    return sigma * (-log1m(u))^(1 / alpha);  // inverse cdf for value
  }

Here’s the issue comment

3 Likes

I’m working on the same problem (truncated normal in RNG block). The Weibull example makes sense, but would you be able to share your code for the truncated normal? I’d like to check my work.

1 Like

It’d be something like this for a normal RNG with truncation at a lower bound lb:

real normal_lb_rng(real mu, real sigma, real lb) {
    real p = normal_cdf(lb, mu, sigma);  // cdf for bounds
    real u = uniform_rng(p, 1);
    return (sigma * inv_Phi(u)) + mu;  // inverse cdf for value
}

The inv_Phi function gives the inverse of a Normal(0,1) cdf, and you can apply the scale parameter and location shift to get back to a Normal(mu, sigma).

7 Likes

Hi! Would a normal rng with double truncation be correct like this?:

real normal_lb_ub_rng(real mu, real sigma, real lb, real ub) {
    real p1 = normal_cdf(lb, mu, sigma);  // cdf with lower bound
    real p2 = normal_cdf(ub, mu, sigma);  // cdf with upper bound
    real u = uniform_rng(p1, p2);
    return (sigma * inv_Phi(u)) + mu;  // inverse cdf 
}

Thanks!

4 Likes

Works fine like this, never mind!

I started to work with this, and one problem that I notice is that the normal cdf overflows quite easily. Is there another alternative? Will there be truncated rngs built-in in stan in the near future?

Where does it overflow? My best guess is that you have overflows when you are trying to sample at the very tails of the distribution (as on normal(0,1) truncated between 7 and 9. If this is the case, then I think it is likely you can find a better model for your data than truncated normal (in this case a right-truncated exp might work just fine). If you absolutely need to work with truncation at far tails you might be able to get some leverage from moving to log scale with normal_lcdf and normal_lccdf but then you’ll have to figure out how to sample on the log-transformed scale, and how to compute inv_Phi(exp(u)) without the explicit exponentiation - this should be possible, but would require some math.

I’m trying to generate data for the LBA with a truncated normal distribution as the model of the drifts (https://github.com/stan-dev/stancon_talks/tree/master/2018-helsinki/Contributed-Talks/nicenboim). So I’m using a version with only lb=0. I know that the model is failing when mu<0 and mu/sigma < 8.25 (https://mc-stan.org/docs/2_18/functions-reference/normal-distribution.html). But I can’t avoid this.

I’ll try the second approach, I’m sure that someone out there in the internet did the math for me. In any case, it’ll be nice to have this built in in stan…

Just a note that I ran into the cdf overflow issue. But if mu and sigma are fixed, there is another solution to generating truncated normal RVs: specify something like

parameters {
vector[n_random_numbers]<lower = lb, upper = ub> dummy_parameter;
}

model {
dummy_parameter ~ normal(mu, sigma);
}

That needs explicit truncation if mu and / or sigma are unknown parameters. If they are both data, then it is ok.

1 Like

Because the truncated distribution does not integrate to 1 (or some other constant) if mu and / or sigma are not constant.

1 Like

Ah, of course; thanks! Will edit my post to avoid tripping people up.

I’m looking for a way to do the same for the beta distribution: to sample from the beta distribution with an upper bound. But it seems that stan does not have a built-in inverse cdf for the beta distribution. Any help is much appreciated!

real beta_ub_rng(real alpha, real beta, real ub) {
    real p_ub = beta_cdf(ub, alpha, beta);  // cdf for upper bound
    real u = uniform_rng(0, p_ub);
    return ????  // inverse cdf for value
}

There is one in Boost but you would have to call it from C++.

Thanks! I was also wondering for gamma: How would I sample from gamma with lower or upper bound?
This is what I have but I’m too new to this to be sure that I’m doing it right:

real gamma_lb_rng(real alpha, real beta, real lb) {
    real p_lb = gamma_cdf(lb, alpha, beta);
    real u = uniform_rng(p_lb, 1);
    real y  = gamma_cdf(u, alpha, beta);
    return y;
}
3 Likes