# The mean is a linear transform (hence warnings can be ignored)

Hi all -

I am posting this more as a knowledge-base entry than an actual question. Recently I’ve been using a model where it makes sense to constrain the mean of a parameter vector by putting a sampling statement on the mean, as follows:

```stan
data {
  int N;
  vector[N] y;
}
parameters {
  vector[N] mu;
}
model {
  mean(mu) ~ normal(2, 0.1);
  y ~ normal(mu, 1);
}
```


This generates a parser warning because a transformed parameter appears on the left-hand side of a sampling statement. After working through it, it turns out the mean of a vector is a linear function of the vector, since N is fixed for a given model. As such, I believe this type of constraint does not require a Jacobian adjustment to balance the model.
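Since the claim that the mean is linear carries the whole argument, here is a quick NumPy check (mine, not from the thread) that `mean()` satisfies linearity and that its gradient is the constant vector 1/N:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5
x, y = rng.normal(size=N), rng.normal(size=N)
a, b = 2.0, -3.0

# Linearity: mean(a*x + b*y) == a*mean(x) + b*mean(y).
assert np.isclose(np.mean(a * x + b * y), a * np.mean(x) + b * np.mean(y))

# The gradient of mean(mu) with respect to mu is the constant vector
# (1/N, ..., 1/N): it does not depend on mu at all, which is why any
# Jacobian correction for this transform is a constant.
analytic = np.full(N, 1.0 / N)
eps = 1e-6
numeric = np.array([(np.mean(x + eps * np.eye(N)[i]) - np.mean(x)) / eps
                    for i in range(N)])
assert np.allclose(numeric, analytic, atol=1e-6)
```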

I’m posting this because 1) I couldn’t find anything written on this and 2) I wanted to be sure I was right :). If I’m wrong about this, please post!

The difficulty here is that the mean is a scalar function of a vector. To evaluate the determinant of the Jacobian matrix of the inverse transformation, you need to think in terms of N random variables: the first N - 1 are the same as the first N - 1 values of the vector being transformed, and the last is the expectation. The N \times N Jacobian matrix can then be partitioned as
\mathbf{J} = \begin{bmatrix}\mathbf{I} &\frac{1}{N}\mathbf{1}\\ \frac{1}{N}\mathbf{1}^\top &\frac{1}{N} \end{bmatrix}
whose determinant is constant, but not 1. Ignoring the warning therefore does not affect the proposals for the parameters, since the log density is only shifted by a constant; it would, however, affect things like the Bayes factor.
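As a sanity check (my own, with NumPy), the partitioned matrix above can be built explicitly. Note that it contains no mu at all, so its determinant is automatically constant in the parameters; for this particular matrix it works out to 1/N^2:

```python
import numpy as np

def jacobian(N):
    # Partitioned matrix from above: [[I, (1/N)1], [(1/N)1^T, 1/N]].
    J = np.empty((N, N))
    J[:N - 1, :N - 1] = np.eye(N - 1)
    J[:N - 1, -1] = 1.0 / N
    J[-1, :] = 1.0 / N
    return J

for N in (3, 5, 10):
    # Constant in mu (the matrix does not involve mu), but not 1.
    assert np.isclose(np.linalg.det(jacobian(N)), 1.0 / N**2)
```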


Interesting. So essentially I would need to multiply the log density by a function of N if I am calculating anything from the posterior distribution that is also affected by N (Bayes factor, etc).

Multiply the density by the determinant or add the log-determinant to the log-density, depending on what units the post-estimation function is expecting.
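In other words (a minimal numeric illustration with made-up values, not from the thread): multiplying the density by the determinant is the same operation as adding the log-determinant to the log density.

```python
import math

log_density = -3.2      # hypothetical log posterior density value
det = 1.0 / 25.0        # hypothetical (constant) Jacobian determinant

# Working on the log scale and exponentiating gives the same number
# as multiplying the density by the determinant directly.
adjusted_log = log_density + math.log(det)
adjusted = math.exp(log_density) * det
assert math.isclose(math.exp(adjusted_log), adjusted)
```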

So taking the determinant of J comes to

\mathbf{I}\frac{1}{N} - \frac{1}{N}\mathbf{1}\frac{1}{N}\mathbf{1}^\top

The second term will simplify to the dot product of the two vectors, so we can replace that with a real number r.

We then have the following expression for the analytic determinant of the Jacobian:

\mathbf{I}\frac{1}{N} - r

Does that look right? Just checking the math.

I’d have to do some more math, but it looks like this term should vanish as N grows.

Sorry, per Wikipedia it looks like I can’t use that formula for the determinant of a block matrix. I can, though, use this one:

\det\left(\begin{bmatrix} A & B \\ C & D \end{bmatrix}\right) = \det(A)\det(D - CA^{-1}B)

Which can be programmed directly.
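Indeed it can; here is a NumPy sketch (my own) applying the block-determinant formula to the partitioned matrix from earlier in the thread and checking it against a dense determinant:

```python
import numpy as np

def block_det(A, B, C, D):
    # det([[A, B], [C, D]]) = det(A) * det(D - C A^{-1} B)
    schur = D - C @ np.linalg.inv(A) @ B
    return np.linalg.det(A) * np.linalg.det(schur)

N = 6
A = np.eye(N - 1)
B = np.full((N - 1, 1), 1.0 / N)
C = np.full((1, N - 1), 1.0 / N)
D = np.array([[1.0 / N]])

# Assemble the full partitioned matrix and compare.
J = np.block([[A, B], [C, D]])
assert np.isclose(block_det(A, B, C, D), np.linalg.det(J))
```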