 # Implicit constraints on parameters corresponding to elements of invertible matrix

As part of a more complex model, I have an N \times N matrix X = I - B, where I is the identity matrix and B is an asymmetric matrix with zeros on the diagonal, lots of structural zeros, and arbitrary unknown elements b_{ij} elsewhere (the zero pattern is symmetric, but b_{ij} \neq b_{ji}). A priori I assume b_{ij} \sim N(0.5, 0.25), although in terms of the application it wouldn't be awful to constrain them all to [0, 1] if that makes a difference.

Currently I have just written

```stan
parameters {
  vector[N] b;
}
model {
  b ~ normal(0.5, 0.25);
  matrix[N, N] X = rep_matrix(0.0, N, N);
  for (i in 1:N) {
    // fill X;
  }
  // do something with X
  // I don't actually invert X anywhere, but invertibility is assumed implicitly
}
```
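If the [0, 1] range mentioned above is acceptable, the interval constraint can be declared directly on the parameter vector. A minimal sketch under that assumption (here M is a hypothetical data-block constant counting the free, non-structural-zero entries of B; the loop body and the use of X are placeholders, as in the original):

```stan
parameters {
  // M = number of free (non-structural-zero) off-diagonal entries of B (hypothetical)
  vector<lower=0, upper=1>[M] b;
}
model {
  b ~ normal(0.5, 0.25);  // effectively a normal truncated to [0, 1] by the bounds
  matrix[N, N] X = identity_matrix(N);  // start from I, then subtract the entries of B
  for (i in 1:N) {
    // fill the free entries of X = I - B from b;
  }
  // do something with X
}
```

Because the bounds are constants, Stan applies the logit change of variables automatically and samples on the unconstrained scale, so no extra Jacobian or truncation terms are needed in the model block.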


Now I'm wondering if this works as intended, since I am defining the elements b_{ij} without any constraints in the parameters block, while they actually are constrained in some way, because I assume X is invertible. Would the situation change if I actually tried to invert the matrix X, which would then reject some parameter combinations? My guess is no, as Stan would still try to sample b in the unconstrained space?


Sorry for taking so long to respond.

I think that, depending on how exactly the invertibility affects the rest of the model, the direct approach might be problematic. Quite possibly the posterior would have discontinuities around, say, two rows being multiples of each other (but that is just a wild guess).

My linear algebra is not good enough for me to help you directly, so here are just some hints that might let you figure out a good parametrization that would enforce the constraints you care about:

Best of luck with your model!

Thanks, we ran some tests without any constraints on b, without errors, and the results seem reasonable, but of course there could still be some issues lurking somewhere and we just got lucky. The parameters b, and the subsequent X in the model above, are defined in such a way that no row can be a multiple of another, and the log-posterior actually contains the log-determinant of X as well, so I guess that if we were exploring the posterior near the singular case, the sampler would spit out some errors and/or reject the current sample.
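For reference, the log-determinant term described here would typically enter the target as in the sketch below (assuming X is the matrix built in the model block). `log_determinant(X)` returns log |det X|, which tends to negative infinity as X approaches singularity, so such proposals are effectively rejected:

```stan
model {
  // ... build X as above ...
  target += log_determinant(X);  // -> -inf as X becomes singular, rejecting the proposal
}
```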

What I still wonder is whether there are issues, other than efficiency, with the prior being defined on the unconstrained space. I guess a simple analogy would be estimating a standard deviation parameter without setting a lower bound on it?
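To spell out that analogy in Stan syntax (a side-by-side sketch, not from the original thread): without a declared bound, the sampler can propose sigma <= 0, and those draws are rejected when a density evaluates to an error; with the bound, Stan samples log(sigma) on the unconstrained scale, so every proposal is valid.

```stan
parameters {
  real sigma_unbounded;         // proposals with sigma <= 0 lead to rejected draws
  real<lower=0> sigma_bounded;  // sampled as log(sigma); never invalid
}
model {
  sigma_unbounded ~ normal(0, 1);
  sigma_bounded ~ normal(0, 1);  // implicitly a half-normal due to the bound
}
```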