Unit vector Jacobian adjustment

Instead of the typical unit vector, I want to enforce that one variable of the unit vector is positive (so one is half normal instead of normal and then divide all of them by squared sum of all their values). If I’m implementing the unit vector on my own, then I need to apply the Jacobian adjustment. The Stan Reference Manual (10.8) says:

“In this case, the determinant of the Jacobian is proportional to […] the kernel of a standard multivariate normal distribution with n independent dimensions”

The underlying vector (call it y to be consistent with the manual) is no longer multivariate normal. So would I need to do a different adjustment?

1 Like

Just make sure you add the appropriate half normal density for that one value

1 Like

If I’m understanding the documentation correctly, it uses -0.5*y’*y as the adjustment for the unit vector (taking the log of exp(-0.5*y'*y)). If I’m calculating y’*y anyway, would I need to make any other adjustments (since that is incorporating the half normal already)? See below if it makes it clearer.

data {
    int<lower=0> N;
parameters {
    real<lower=0> B0;
    vector[N - 1] B_lower;
transformed parameters {
    vector[N] B;
    real r2;
    real r;
    r2 = square(B0) + dot_self(B_lower);
    r = sqrt(r2);
    B[1] = B0 / r;
    B[2:N] = B_lower / r;
model {
    B0 ~ normal(0, 1);
    B_lower ~ normal(0, 1);
    target += -0.5 * r2;
1 Like

The reference manual is a bit confusing on what the Jacobian adjustment is in this context, because it doesn’t cleanly separate its description of the transform itself from the description of the Jacobian part.

  • The transform, which takes us from \mathbb{R}^n to the unit sphere, is just to take an arbitrary vector and divide by its norm. No normal distributions involved.
  • The Jacobian adjustment is proportional to the standard normal density evaluated at the unconstrained vector.
  • And so Stan implements this whole thing (including the Jacobian adjustment) by sampling a standard normal vector and then dividing by its norm, which I gather is the scheme originally proposed by Muller.

For your purposes, this technique is certainly valid over the half of \mathbb{R}^n that satisfies your positivity constraint, so you can implement the positivity constraint (including its Jacobian adjustment, which Stan automatically implements when you declare with <lower=0>) and then implement the exact same transform.

You can verify for yourself that this works: delete the target += line from your Stan program and then check that the density is uniform with

mod <- cmdstan_model("/Users/Jacob/Desktop/unit_vector_test.stan")
samples <- mod$sample(
  data = list(N = 2), 
  iter_sampling = 50000, 
  adapt_delta = .95
  ) |>
hist(acos(samples$`B[2]`), breaks = seq(0, pi, length.out = 20))

Happily, even if you include your target += statement, you still get the right behavior. This is because the product of Gaussian densities is itself proportional to another Gaussian density, so despite incrementing the target twice by a standard normal, you’ve still just done the equivalent of sampling from a different zero-centered normal, which just amounts to sampling from a standard normal under a rescaling of the unconstrained space.

That helps a lot. I don’t think I completely understood what the documentation meant by uniform in this case. Correct me if I’m wrong, but what you’re basically doing is saying I have some value on the unit sphere and projecting that down to the x-axis and checking to see if the x-axis values are uniform.

Not quite. The values should be uniform over the surface of the (hyper)sphere (in this 2D case, a circle). If we project down to the (hyper)plane (in this case a line), those values will not be uniform. What is uniform is the arc-cosine of the variable that we projected down.

We want to verify that these points are uniform over the semicircle:

plot(samples$`B[1]` ~ samples$`B[2]`)

If they are uniform, then the arc-cosine of B[2], which is the angle of elevation to the point, should be uniform over its entire range. That’s what gets checked in the R script above; we are checking that the angles are uniformly distributed between zero and pi.

Gotcha. That makes sense. Thanks again for the help!