Zero-sum vector and normal priors
The release of 2.36.0 introduces a new constraint transform in the Stan language. This constraint has been discussed here and, I’m sure, in other topics over the years.
A sum-to-zero vector is exactly what the name suggests: a vector whose elements sum to 0. If you put a normal prior on a zero-sum vector, the resulting variance will be less than the intended normal variance. To get the same variance as the intended normal prior, do
parameters {
  sum_to_zero_vector[N] z;
}
model {
  z ~ normal(0, sqrt(N * inv(N - 1)) * sigma);
}
where sigma is the intended standard deviation. FYI, it's a bit more efficient to pre-calculate sqrt(N * inv(N - 1)) in transformed data.
. The general result to get a given variance from a normal with linear constraints is in: Fraser, D. A. S. (1951). Normal Samples With Linear Constraints and Given Variances. Canadian Journal of Mathematics, 3, 363–366. doi:10.4153/CJM-1951-041-9.
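To see the variance shrinkage and the sqrt(N * inv(N - 1)) correction numerically, here is a quick NumPy sketch (my own check, not Stan code; it uses the fact that a normal prior restricted to the zero-sum subspace has the same law as centered iid normal draws):

```python
import numpy as np

rng = np.random.default_rng(42)
N, sigma, draws = 5, 2.0, 200_000

# A normal(0, sigma) prior on a sum-to-zero vector is equivalent in law to
# drawing iid normal(0, sigma) and subtracting the mean, which shrinks each
# marginal variance by a factor of (N - 1) / N.
x = rng.normal(0.0, sigma, size=(draws, N))
z = x - x.mean(axis=1, keepdims=True)   # rows sum to zero
print(z.var(axis=0).mean())             # ~ sigma^2 * (N - 1) / N

# Pre-scaling the prior sd by sqrt(N / (N - 1)) restores the target variance.
scale = np.sqrt(N / (N - 1))
x2 = rng.normal(0.0, scale * sigma, size=(draws, N))
z2 = x2 - x2.mean(axis=1, keepdims=True)
print(z2.var(axis=0).mean())            # ~ sigma^2
```

The equivalence with centering holds because both constructions give a mean-zero Gaussian with covariance proportional to the projection I - 11'/N.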
I hope in the next release we add a zero_sum_normal distribution as syntactic sugar for the above.
Connection with a simplex
We know that a simplex of dimension N is composed of N - 1 free parameters: the N^{th} parameter is determined by the other N - 1. Similarly, a zero-sum vector is determined by N - 1 free parameters.
A common way to transform an N-vector into a simplex is through the softmax function, fixing one of the elements to 0. The sum-to-zero vector we now have is probably better because it 'parses' or 'distributes' the N - 1 free elements evenly into the N^{th}, which should be easier for the sampler to explore.
parameters {
  sum_to_zero_vector[N] z;
}
transformed parameters {
  simplex[N] s = softmax(z);
}
model {
  // jacobian for softmax
  // typically you'd see sum(z) here too, but it's 0
  target += -N * log_sum_exp(z);
  // co-area correction
  target += 0.5 * log(N);
  // the above is not necessary in Stan because it's a constant,
  // but if you test this in something like Jax
  // you'll see the log jacobian det is off by this factor.
  // it comes from the coarea formula, a generalization
  // of the jacobian for maps from R^m -> R^n where m >= n
}
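The comment about checking this outside Stan can be made concrete. Below is a NumPy sketch (my own construction, not Stan's internal code): parameterize the zero-sum vector as z = B u for an orthonormal basis B of the zero-sum subspace, map to the simplex with softmax, and compare the exact log-det of the square jacobian of u -> s[1:N-1] against -N * log_sum_exp(z) + 0.5 * log(N).

```python
import numpy as np

N = 5
rng = np.random.default_rng(0)

# Orthonormal basis of the zero-sum subspace: the last N-1 columns of a
# complete QR factorization of the ones vector are orthogonal to ones.
Q, _ = np.linalg.qr(np.ones((N, 1)), mode="complete")
B = Q[:, 1:]                        # N x (N-1), each column sums to ~0

u = rng.normal(size=N - 1)          # free parameters
z = B @ u                           # zero-sum vector
lse = np.logaddexp.reduce(z)
s = np.exp(z - lse)                 # softmax(z), a point on the simplex

# Exact jacobian of u -> softmax(B @ u); drop the last (redundant) row to
# get the square jacobian onto the first N-1 simplex coordinates.
J = (np.diag(s) - np.outer(s, s)) @ B
_, logdet = np.linalg.slogdet(J[:-1, :])

# Matches the Stan increment plus the constant coarea term 0.5 * log(N).
claim = -N * lse + 0.5 * np.log(N)
print(logdet, claim)
```

The two printed values agree to floating-point precision, which is exactly the 0.5 * log(N) offset you would otherwise see when differentiating this map in something like Jax.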
There’s more on the coarea formula in the attached coarea.pdf (578.6 KB).
Hopefully the simplex paper I have with others, which explores this and other simplex transforms in detail, will be out soon.
Zero-sum background for those interested
Our version cropped up from a circuitous research route: I’m working on a simplex parameterization paper with @mjhajharia, @Bob_Carpenter, @Seth_Axen, and @Niko, where we are studying the Aitchison parameterizations, specifically the inverse logarithmic ratio. That simplex transform first generates a zero-sum vector from an orthonormal basis. This seemed to work well but was a bit computationally expensive: it builds a full orthogonal matrix and uses matrix-vector products to construct the resulting zero-sum vector. @Seth_Axen noticed that this can be simplified into a series of sums, which @WardBrian condensed into a single loop through the elements!
The “different flavors” of implementation from our PyMC friends come from choosing a different orthonormal basis; any orthonormal basis will result in the same outcome. I believe they implement a Householder reflection. PyMC also allows a tensor of any dimension to be zero-sum across chosen axes. In Stan we currently don’t have any n-array constraint transforms like that, but I have worked out the sum_to_zero_matrix and hopefully that will be in the next release. The stan-math GitHub issue is Add sum_to_zero_matrix · Issue #3132 · stan-dev/math · GitHub.
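To back up the claim that the particular orthonormal basis doesn’t matter, here is a small NumPy sketch (my own constructions, not PyMC’s or Stan’s code): a QR-derived basis and a Householder-derived basis of the zero-sum subspace yield the same projection matrix, and hence the same distribution for z = B u with u ~ normal(0, I).

```python
import numpy as np

N = 6

# Basis 1: last N-1 columns of a complete QR factorization of the ones vector.
Q, _ = np.linalg.qr(np.ones((N, 1)), mode="complete")
B1 = Q[:, 1:]

# Basis 2: a Householder reflection H mapping e1 to ones/sqrt(N); its
# remaining columns are an orthonormal basis of the zero-sum subspace.
w = np.ones(N) / np.sqrt(N)
v = np.eye(N)[0] - w
H = np.eye(N) - 2.0 * np.outer(v, v) / (v @ v)
B2 = H[:, 1:]

# Every basis column sums to zero...
print(np.abs(B1.sum(axis=0)).max(), np.abs(B2.sum(axis=0)).max())

# ...and z = B @ u with u ~ N(0, I) has covariance B @ B.T, which equals the
# centering projection I - 11'/N for either basis choice.
P = np.eye(N) - np.full((N, N), 1.0 / N)
print(np.allclose(B1 @ B1.T, P), np.allclose(B2 @ B2.T, P))
```

So the different implementations differ only in which basis they materialize (or avoid materializing); the induced distribution on the zero-sum vector is identical.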