Implicit constraints on parameters corresponding to elements of invertible matrix

As part of a more complex model, I have an N \times N matrix X = I - B, where I is the identity matrix and B is an asymmetric matrix with zeros on the diagonal, lots of structural zeros, and arbitrary unknown elements b_{ij} elsewhere (the zero pattern is symmetric, but b_{ij} \neq b_{ji}). I assume a priori that b_{ij} \sim N(0.5, 0.25), although in terms of the application it wouldn’t be awful to constrain them all to [0, 1] if that makes a difference.
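If the [0, 1] constraint is acceptable, one way to express it in Stan is with a bounded parameter declaration. This is just a sketch; `M` (the number of free, non-structural-zero elements, collected into a single vector) is a hypothetical name not in the original code:

```stan
parameters {
  // M = number of free off-diagonal elements of B (hypothetical);
  // Stan samples these on an unconstrained scale and maps them into (0, 1)
  // via a logit transform, adding the log-Jacobian automatically.
  vector<lower=0, upper=1>[M] b;
}
```

Note that bounding the individual b_{ij} does not by itself guarantee that X = I - B is invertible; it only keeps each element in range.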

Currently I have just written

parameters {
  vector[N] b;
}
model {
  b ~ normal(0.5, 0.25);
  matrix[N, N] X = rep_matrix(0.0, N, N);
  for (i in 1:N) {
    // fill X from b
  }
  // do something with X
  // I don't actually invert X anywhere, but invertibility is assumed implicitly
}
Now I’m wondering whether this works as intended: I am defining the elements b_{ij} without any constraints in the parameters block, while they are in fact implicitly constrained because I assume X is invertible. Would the situation change if I actually tried to invert the matrix X, which would then reject some parameter combinations? My guess is no, as Stan would still try to sample b in the unconstrained space?


Sorry for taking too long to respond.

I think that - depending on how exactly the invertibility affects the rest of the model - the direct approach might be problematic. Quite possibly the posterior would have discontinuities around, say, two rows being multiples of each other (but that is just a wild guess).

My linear algebra is not good enough to help you directly, so here are just some hints that might let you figure out a good parametrization that would enforce the constraints you care about:

Best of luck with your model!

Thanks, we ran some tests without any constraints on b, saw no errors, and the results seem reasonable, but of course there could still be some issues lurking somewhere and we just got lucky. The parameters b and the resulting X in the model above are defined in such a way that no row can be a multiple of another, and the log-posterior actually contains the log-determinant of X, so I guess that if we were exploring the posterior near the singular case the sampler would spit out errors and/or reject the current sample.
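For reference, the log-determinant term would typically enter the model along these lines (a sketch; the actual model code is not shown in the thread):

```stan
model {
  // ... X constructed as above ...
  // Stan's log_determinant() returns the log of the *absolute value* of the
  // determinant. It is -inf (so the proposal is rejected) only when X is
  // exactly singular; a sign flip in det(X) between samples would pass
  // through this term silently.
  target += log_determinant(X);
}
```

So the log-determinant does act as a soft barrier against the singular case, but it does not on its own detect the determinant changing sign.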

What I still wonder is whether there are issues beyond efficiency with the prior being defined on the unconstrained space. I guess a simple analogy would be estimating a standard deviation parameter without setting a lower bound on it?..
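In Stan syntax, that analogy looks like this (a minimal sketch, assuming hypothetical names `sigma`, `y`, `mu`):

```stan
parameters {
  real sigma;              // unconstrained: the sampler can propose sigma <= 0
  // real<lower=0> sigma;  // constrained: sampled on the log scale, never invalid
}
model {
  // y ~ normal(mu, sigma); // rejects the proposal whenever sigma <= 0
}
```

With the unconstrained declaration, invalid proposals are rejected at the likelihood, which tends to cause divergences and poor mixing near the boundary; the `<lower=0>` version transforms the problem away entirely.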

Sorry, I forgot about this for a bit.

Depends heavily on the model IMHO. Sampling an sd without a lower bound would in most cases make sampling deeply problematic. It is possible that your data constrain the posterior far enough from the singular case that you just don’t hit it. So if all your posterior samples are invertible and the diagnostics (divergences, effective sample size, Rhat, …) are OK, I wouldn’t worry about it too much…

Thanks, Martin, this was my conclusion as well: everything seems to work smoothly, without warnings or other peculiarities, and the posterior samples suggest that we are nowhere near the singular case (or the determinant flipping its sign).
