Programming a Block Diagonal

Hi, I am trying to program a function which takes as input a square matrix A (let’s say K by K) and an integer (let’s say J) and gives as output a JK by JK block diagonal matrix with A on the block diagonal and 0 everywhere else. That is:

B = \begin{pmatrix} A & \boldsymbol{0} & ... & \boldsymbol{0} \\ \boldsymbol{0} & A & ... & \boldsymbol{0} \\ ... & ...& ...& ... \\ \boldsymbol{0} & ...& ...& A \\ \end{pmatrix}

Does someone know how to do it or has had the same problem? Thank you very much for the help, this forum is always full of helpful people.

I have come up with this, but there is an issue: the function creates B (which I called Omega in this) as an arrey[J,J] of matrices [K,K]. I need a way to convert it into a matrix but the function to_matrix is only defined for arrays of real numbers. What should I do?

 matrix block_diag_matrix(matrix matrix_diag, int K, int J) {
    
  matrix[K,K] O;
  matrix[K,K] Omega[J,J];
  matrix[J*K,J*K] rtn;
  
  
  O = rep_matrix(0, K, K);
  
  for (s in 1:J) {
    for (t in 1:J) {
    
      if (t==s){
        Omega[s,t]= matrix_diag;
      }
      else{
        Omega[s,t]= O;
      }
    
    }
  }
  
  rtn = to_matrix(Omega);
  return rtn;  

}

The block matrix you need looks is a Kronecker product of your matrix with an identity matrix. I think that there should be some posts floating around somewhere on implementing Kronecker products in Stan, but I can’t find them (maybe they don’t exist?).

Perhaps an easier alternative than implementing a Kronecker product function would be just to assemble the matrix you need by concatenating your matrix with zero matrices (first form rows of dimension J x JK, then concatenate these rows). See 5.11 Matrix concatenation | Stan Functions Reference

But also, why do you need to form this particular matrix? It looks like a very inefficient way to store exactly the same information that’s already contained in A and J. Is it because you need to do some downstream matrix algebra (in which case I think there’s likely to be a more efficient implementation of the computation than first forming this giant block-diagonal matrix)? Is it to use as a covariance matrix (just replace your giant multivariate normal with a set of independent multivariate normals, one per block)?

So I have J vectors \beta that are of dimension K \times 1 and

\beta_j \sim \mathcal{N}(\mu_j, \Sigma)

Where \Sigma is the same for every \beta_j that:

\begin{pmatrix} \beta_1 \\ ... \\ \beta_J \end{pmatrix} \sim \ \mathcal{N}( \boldsymbol{\mu}, \Omega)

Which on Stan becomes:

Beta ~ multi_normal(mu, Omega);

Right now I am assigning the priors in a cycle, but it is not very efficient. I managed to concatenate all the \beta_j's and \mu_j's but cannot program the block diagonal matrix. Do you think there is a more efficient way to program this? Thanks for all the help.

So I think I managed to program the block diagonal function, but something is not working out:

 matrix block_diag_matrix(matrix matrix_diag, int K, int J) {
    
  matrix[K,K] O;
  matrix[K,K] A[J,J];
  matrix[J*K,J*K] rtn;
  
  
  
  O = rep_matrix(0, K, K);
  
  for (s in 1:J) {
    for (t in 1:J) {
    
      if (t==s){
        A[s,t]= matrix_diag;
      }
      else{
        A[s,t]= O;
      }
    
    }
  }
  
  
  for (i in 1:J){
    for (j in 1:J){
      for (n in 1:K){
        for (m in 1:K){
         rtn[i*m,j*n] = A[i, j, m, n];
          }
        }
      }
    }  
  
  
  return rtn;  

}

I then code the model like this:

Beta ~ multi_normal(mu, Omega);

And I get this error:

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: multi_normal_lpdf: Covariance matrix is not symmetric. Covariance matrix[1,127] = nan, but Covariance matrix[127,1] = nan  (in 'model2de871122f6a_Block_Diagonal_Work_in_progress' at line 155)

I do not understand what is wrong though.
Edit: I do understand what is wrong, the indices are not correct. However I do not understand how to correct them.

Hey, I think this should work for you. It’s not really efficient and mainly for pedagogical purposes though…
Cheers, Simon

1 Like

Definitely don’t use a block diagonal here!! Using such a behemoth as a covariance matrix will not be efficient, and there’s a much better way. From the functions reference:

See section 11.8.1.2 at the link above.

2 Likes

Thank you very much, it is written very smartly. Thanks a lot, even if it is not the most efficient way, it is still pedagogically very useful.

So, I did this because a loop would be inefficient. How would you suggest giving the priors then?

I often find it useful to just specify a model using brms and then just look at the generated Stan code to see how I could make my own more efficient or how to implement sth.

For instance, to see how brms implements a model using a block-diagonal as you are creating here, you could just implement a hierarchical linear model with group-specific intercept and slope. Here you would model the correlation structure of the slope and intercept like you do with \gamma_i \sim N(0, A) and then create the block diagonal to sample \gamma \sim N(0, \text{block_diag(A)}).

If I’m reading this correctly, you have y and mu as arrays of vectors, and a single covariance matrix Sigma, is that right?
In that case, you can just write:

y ~ multi_normal(mu, Sigma);

The multi_normal is vectorised to handle this case, so no looping is necessary

2 Likes

Yes I managed to do that, so the function “understands” that the vectors in the array are independent. Pretty useful, though I feel (from a beginner perspective) this is not explained too clearly on section 11.8.1.2 of the help book. But thanks everyone for the help, you are all too kind.

1 Like