Parameter bounds that depend on other parameters

To specify the priors x \sim \text{Uniform}(0,1) and y|x \sim \text{Uniform}(0,x), I would have assumed this would work:

parameters {
  real<lower = 0, upper = 1> x;
  real<lower = 0, upper = x> y;
}
model {}

However, that results in the distributions P(x) = 2x and P(y) = 2(1-y), not what was intended. I get the intended distribution by adding y ~ uniform(0, x); to the model block (or by defining an unscaled y with constant bounds and then scaling it by x). I conclude that the uniform prior needs to be stated explicitly for parameters whose bounds depend on other parameters, unlike for parameters whose bounds are fixed at the start of execution (whose priors are correctly understood implicitly, e.g. x here). I don't understand what's happening when I don't specify the uniform explicitly, nor why this differs between parameters with constant and variable bounds. Is this documented somewhere? I couldn't find it. I guess it's something to do with the Jacobian but couldn't say what.
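Written out in full, that first fix is:

parameters {
  real<lower = 0, upper = 1> x;
  real<lower = 0, upper = x> y;
}
model {
  y ~ uniform(0, x);  // needs to be explicit because the upper bound is itself a parameter
}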
(Though the above explanation should be enough, I've elaborated on this issue, including plots, here.)

3 Likes

This doesn't directly address what is happening to the distribution of x. I am not sure myself.

But some time ago I had a problem where a parameter bound was a function of other parameters (and thus, implicitly, so was its density), and I was having sampling issues. I standardized that parameter so it did not depend on the others, then generated another variable in the transformed parameters block to account for the dependence. It improved my sampling.

In your case, I would try making both x and y lie in [0, 1], then assigning their product to a transformed parameter; a sketch follows below. I also suspect that the distribution of x will then be what you want.
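A minimal sketch of that reparameterization (the name y_raw for the standardized parameter is just illustrative):

parameters {
  real<lower = 0, upper = 1> x;
  real<lower = 0, upper = 1> y_raw;
}
transformed parameters {
  real y = x * y_raw;  // y | x ~ uniform(0, x) by construction
}
model {}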

Hope this helps.

Thanks, yes, I described exactly that approach as an alternative to adding y ~ uniform(0, x); at the link in the original post (though not in the post itself). I'm wondering whether the default behaviour, i.e. what happens when neither solution is used, is explained somewhere in the docs. If not, perhaps it should be, as it's a bit of a gotcha.

I think the problem stems from a misunderstanding about the "implicit" uniform prior. The fact is that Stan never directly imposes any implicit prior.

Recall that Stan needs the posterior density only up to a constant multiplier (and hence the log density only up to an additive constant). So when you have x ~ uniform(0,1); or target += uniform_lpdf(x | 0,1);, it is equivalent to target += 0, which can be ignored. (The term won't be 0 if the bounds are different, since uniform_lpdf(x | a, b) = -log(b - a), but as long as the bounds are constant the term is still constant and so you can ignore it.)
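For instance (the bounds 2 and 5 here are chosen purely for illustration), this sketch samples exactly the same posterior as it would with an empty model block, because the added term is the constant -log(5 - 2):

parameters {
  real<lower = 2, upper = 5> x;
}
model {
  target += uniform_lpdf(x | 2, 5);  // adds -log(3): constant, so it changes nothing
}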

However, the dependent uniform prior target += uniform_lpdf(y | 0, x); is equivalent to target += -log(x);, which is obviously not a constant and hence cannot be ignored.
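In other words, a sketch of the model with the missing term stated directly (equivalent to adding y ~ uniform(0, x);, since sampling statements drop constant terms and -log(x) is not one):

parameters {
  real<lower = 0, upper = 1> x;
  real<lower = 0, upper = x> y;
}
model {
  target += -log(x);  // the non-constant part of uniform_lpdf(y | 0, x)
}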

More generally, it is always good to remember that a Stan program is really just a function to compute the posterior density (up to a constant) - nothing more, nothing less. So although it may sometimes feel like Stan does some sort of magic, it actually never does any magic.

Hope that clarifies more than confuses.

4 Likes

The implicit prior is the 2-dimensional uniform distribution on the triangle with vertices (0,0), (1,0), and (1,1). The marginal distributions are cross-sections of that triangle; I assume you estimated them from histograms. A scatter plot will have a uniform appearance.
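Working out those cross-sections explicitly: the triangle has area 1/2, so the joint density on it is f(x, y) = 2, and the marginals are

f_X(x) = \int_0^x 2 \, dy = 2x, \qquad f_Y(y) = \int_y^1 2 \, dx = 2(1 - y),

which are exactly the P(x) = 2x and P(y) = 2(1 - y) reported in the original post.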

3 Likes

I'm having a moment - the histograms in the post are certainly estimates of the marginals of the "uniform over the triangle" distribution, but I'm missing how the model as specified results in that distribution. The model has both lower and upper specified, so I'm assuming it uses inv-logit as a transform, but I can't get that to match my own, probably flawed, log posterior function. If you have a moment, can you see if I'm missing something obvious?

library(ggplot2)
library(dplyr)  # for as_tibble(), filter(), and mutate() used below

res <- array(dim = c(1e5, 2))

# can match the histograms from the blog post
for (ii in seq_len(nrow(res))) {
  x <- sqrt(runif(n = 1)) # but whyyyy sqrt when jacobian is exp(y)
  y <- runif(n = 1, min = 0, max = x)
  res[ii, ] <- c(x, y)
}

# evidently uniform
res |> 
  as_tibble(.name_repair = \(x) c("x", "y")) |> 
  ggplot() +
  geom_hex(aes(x = x, y = y), bins = 57)

withr::with_par(new = list(mfrow = c(1, 2)), {
  hist(res[, 1])
  hist(res[, 2])
})

logit <- function(x) {
  log(x / (1 - x))
}

inv_logit <- function(v) {
  1 / (1 + exp(-v))
}

log_post_2 <- function(x, y) {
  inv_logit_x <- inv_logit(x)
  inv_logit_y <- inv_logit(y)
  
  res <- dunif(inv_logit_x, min = 0, max = 1, log = TRUE)
  res <- res + log(inv_logit_x) + log(1 - inv_logit_x)
  
  res <- res + dunif(x = x * inv_logit_y, min = 0, max = x, log = TRUE)
  res <- res + log(x) + log(inv_logit_y) + log(1 - inv_logit_y)
  return(res)
} 

plot_grid <- tidyr::expand_grid(
  x = seq(from = 0, to = 1, length.out = 100),
  y = seq(from = 0, to = 1, length.out = 100)
) |> 
  filter(y < x) |> 
  mutate(lp2 = log_post_2(x, y))

# this should be uniform, but it isn't and I'm missing something.
ggplot(plot_grid) +
  geom_point(aes(x = x, y = y, colour = lp2))

When you don't sqrt you get the following (histograms of x and y omitted here). These are the correct marginals, and here's why:

We have

X \sim \mathcal{U}(0, 1) \\
Y \mid X \sim \mathcal{U}(0, x)

To get the marginal of Y we have to integrate out X

f_{Y \mid X}(y \mid x) = \begin{cases} \frac{1}{x} & \text{if } 0 \le y \le x \\ 0 & \text{otherwise} \end{cases}

\begin{align*}
f_Y(y) &= \int_0^1 f_{Y \mid X}(y \mid x) f_X(x) \, dx \\
&= \int_y^1 \frac{1}{x} \, dx \\
&= -\ln(y)
\end{align*}

where we know f_X(x) = 1 on (0, 1), and the conditional density being zero for x < y is what sets the lower limit of integration to y.
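As a quick check that this is a proper density on (0, 1):

\int_0^1 -\ln(y) \, dy = \big[\, y - y \ln(y) \,\big]_0^1 = 1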

Let's generate these marginals without the conditional distribution. Integrating the pdf of Y gives F_Y(y) = y - y \ln(y), which is a nasty thing to invert for inverse transform sampling. However, our friend, Wolfram Alpha, easily dispatches with that and shows us that we can use the Lambert W function (which is actually built into Stan as well): solving F_Y(y) = u for y gives y = \exp(1 + W_{-1}(-u / e)), where W_{-1} is the lower branch. The code in R for that is

library(lamW)  # provides lambertWm1(), the W_{-1} branch

res <- array(dim = c(1e5, 2))
for (ii in seq_len(nrow(res))) {
  x <- runif(n = 1)
  y <- exp(1 + lambertWm1(-runif(n = 1, min = 0, max = 1) / exp(1)))
  res[ii, ] <- c(x, y)
}

which matches the above marginal

Just for fun we can test this in Stan

parameters {
  real<lower=0, upper=1> x;
  real<lower=0, upper=1> y;
}
model {
  target += log(-log(y));  // log of the marginal density f_Y(y) = -log(y)
}

which gives

 variable  mean median   sd  mad    q5   q95 rhat ess_bulk ess_tail
     lp__ -3.89  -3.55 1.14 0.81 -6.18 -2.79 1.00     1800     2205
     x     0.50   0.50 0.29 0.36  0.05  0.94 1.00     3092     2340
     y     0.25   0.19 0.22 0.20  0.01  0.70 1.00     2988     2316
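As an aside, lambert_wm1() is the Stan built-in mentioned above; assuming a recent Stan (2.25 or later, if I recall the version correctly), the same inverse-CDF draw can be sketched directly in generated quantities:

generated quantities {
  real u = uniform_rng(0, 1);
  real y_draw = exp(1 + lambert_wm1(-u / e()));  // a draw with density -log(y) on (0, 1)
}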
4 Likes

This is illuminating (as ever) and I fully get that they are the correct (intended?) marginals.
My confusion is twofold in the other direction:

  • Why is sqrt the 'correct' transformation for x in the naive Monte Carlo sampler to match the "uniform over the triangle" distribution implemented by the Stan program in the OP? I'm 99% sure the answer is "the change-of-variables formula for a transformed random variable", but I'm not seeing which transformation yields the sqrt (more coffee for me, maybe).
  • What am I missing in log_post_2 from either log-Jacobian that would make the final plot uniform over y < x?

This is because things cancel out. To see this, let's assume that Z \sim \mathcal{U}(0, 1), X = \sqrt{Z}, and Y \mid X \sim \mathcal{U}(0, x).

The distribution of X is

\begin{align*}
F_X(x) &= P(X \le x) = P(\sqrt{Z} \le x) = P(Z \le x^2) \\
F_X(x) &= \int_0^{x^2} dz = x^2 \\
f_X(x) &= \frac{d}{dx} F_X(x) = 2x
\end{align*}

The distribution of Y \mid X is \mathcal{U}(0, x), and we need the -\log(x) term because it depends on a parameter.

Hence the target increment is

target += log2() + log(x);  // log f_X(x) = log(2 * x) = log2() + log(x)
target += -log(x);          // log f_{Y|X}(y | x) = -log(x)

The log(x) terms cancel, leaving only log2(); since that is a constant, this is the same as specifying a 0, which is what happens when we don't do anything.
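Wrapping those two increments into a complete program for clarity, a sketch that samples the same uniform-over-the-triangle distribution as the OP's model with its empty model block (the increments sum to the constant log2()):

parameters {
  real<lower = 0, upper = 1> x;
  real<lower = 0, upper = x> y;
}
model {
  target += log2() + log(x);  // log f_X(x) = log(2 * x)
  target += -log(x);          // log f_{Y|X}(y | x) = log(1 / x)
}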

And to make this complete: the marginal distribution of Y here (integrating X out of f_{Y \mid X}(y \mid x), now with f_X(x) = 2x) is f_Y(y) = \int_y^1 \frac{1}{x} \, 2x \, dx = 2(1 - y), which is what @ChrisWymant posted in his OP.

2 Likes

How satisfying - thank you for making it explicit for me!

1 Like