Programming a Block Diagonal

Tommaso11397 · July 20, 2021, 2:46pm

Hi, I am trying to program a function which takes as input a square matrix A (let’s say K by K) and an integer (let’s say J) and gives as output a JK by JK block diagonal matrix with A on the block diagonal and 0 everywhere else. That is:

B = \begin{pmatrix} A & \boldsymbol{0} & ... & \boldsymbol{0} \\ \boldsymbol{0} & A & ... & \boldsymbol{0} \\ ... & ...& ...& ... \\ \boldsymbol{0} & ...& ...& A \\ \end{pmatrix}

Does someone know how to do it or has had the same problem? Thank you very much for the help, this forum is always full of helpful people.

Tommaso11397 · July 20, 2021, 3:57pm

I have come up with this, but there is an issue: the function creates B (which I called Omega in this) as an arrey[J,J] of matrices [K,K]. I need a way to convert it into a matrix but the function to_matrix is only defined for arrays of real numbers. What should I do?

 matrix block_diag_matrix(matrix matrix_diag, int K, int J) {
    
  matrix[K,K] O;
  matrix[K,K] Omega[J,J];
  matrix[J*K,J*K] rtn;
  
  
  O = rep_matrix(0, K, K);
  
  for (s in 1:J) {
    for (t in 1:J) {
    
      if (t==s){
        Omega[s,t]= matrix_diag;
      }
      else{
        Omega[s,t]= O;
      }
    
    }
  }
  
  rtn = to_matrix(Omega);
  return rtn;  

}

jsocolar · July 20, 2021, 4:55pm

The block matrix you need looks is a Kronecker product of your matrix with an identity matrix. I think that there should be some posts floating around somewhere on implementing Kronecker products in Stan, but I can’t find them (maybe they don’t exist?).

Perhaps an easier alternative than implementing a Kronecker product function would be just to assemble the matrix you need by concatenating your matrix with zero matrices (first form rows of dimension J x JK, then concatenate these rows). See 5.11 Matrix concatenation | Stan Functions Reference

But also, why do you need to form this particular matrix? It looks like a very inefficient way to store exactly the same information that’s already contained in A and J. Is it because you need to do some downstream matrix algebra (in which case I think there’s likely to be a more efficient implementation of the computation than first forming this giant block-diagonal matrix)? Is it to use as a covariance matrix (just replace your giant multivariate normal with a set of independent multivariate normals, one per block)?

Tommaso11397 · July 21, 2021, 6:35am

So I have J vectors \beta that are of dimension K \times 1 and

\beta_j \sim \mathcal{N}(\mu_j, \Sigma)

Where \Sigma is the same for every \beta_j that:

\begin{pmatrix} \beta_1 \\ ... \\ \beta_J \end{pmatrix} \sim \ \mathcal{N}( \boldsymbol{\mu}, \Omega)

Which on Stan becomes:

Beta ~ multi_normal(mu, Omega);

Right now I am assigning the priors in a cycle, but it is not very efficient. I managed to concatenate all the \beta_j's and \mu_j's but cannot program the block diagonal matrix. Do you think there is a more efficient way to program this? Thanks for all the help.

Tommaso11397 · July 21, 2021, 7:45am

So I think I managed to program the block diagonal function, but something is not working out:

 matrix block_diag_matrix(matrix matrix_diag, int K, int J) {
    
  matrix[K,K] O;
  matrix[K,K] A[J,J];
  matrix[J*K,J*K] rtn;
  
  
  
  O = rep_matrix(0, K, K);
  
  for (s in 1:J) {
    for (t in 1:J) {
    
      if (t==s){
        A[s,t]= matrix_diag;
      }
      else{
        A[s,t]= O;
      }
    
    }
  }
  
  
  for (i in 1:J){
    for (j in 1:J){
      for (n in 1:K){
        for (m in 1:K){
         rtn[i*m,j*n] = A[i, j, m, n];
          }
        }
      }
    }  
  
  
  return rtn;  

}

I then code the model like this:

Beta ~ multi_normal(mu, Omega);

And I get this error:

Chain 1: Rejecting initial value:
Chain 1:   Error evaluating the log probability at the initial value.
Chain 1: Exception: multi_normal_lpdf: Covariance matrix is not symmetric. Covariance matrix[1,127] = nan, but Covariance matrix[127,1] = nan  (in 'model2de871122f6a_Block_Diagonal_Work_in_progress' at line 155)

I do not understand what is wrong though.
Edit: I do understand what is wrong, the indices are not correct. However I do not understand how to correct them.

Simon · July 21, 2021, 2:08pm

Hey, I think this should work for you. It’s not really efficient and mainly for pedagogical purposes though…
Cheers, Simon

jsocolar · July 21, 2021, 3:17pm

Definitely don’t use a block diagonal here!! Using such a behemoth as a covariance matrix will not be efficient, and there’s a much better way. From the functions reference:

See section 11.8.1.2 at the link above.

Tommaso11397 · July 22, 2021, 7:33am

Thank you very much, it is written very smartly. Thanks a lot, even if it is not the most efficient way, it is still pedagogically very useful.

Tommaso11397 · July 22, 2021, 7:34am

So, I did this because a loop would be inefficient. How would you suggest giving the priors then?

Simon · July 22, 2021, 7:56am

I often find it useful to just specify a model using brms and then just look at the generated Stan code to see how I could make my own more efficient or how to implement sth.

For instance, to see how brms implements a model using a block-diagonal as you are creating here, you could just implement a hierarchical linear model with group-specific intercept and slope. Here you would model the correlation structure of the slope and intercept like you do with \gamma_i \sim N(0, A) and then create the block diagonal to sample \gamma \sim N(0, \text{block_diag(A)}).

andrjohns · July 22, 2021, 8:48am

If I’m reading this correctly, you have y and mu as arrays of vectors, and a single covariance matrix Sigma, is that right?
In that case, you can just write:

y ~ multi_normal(mu, Sigma);

The multi_normal is vectorised to handle this case, so no looping is necessary

Tommaso11397 · July 22, 2021, 9:31am

Yes I managed to do that, so the function “understands” that the vectors in the array are independent. Pretty useful, though I feel (from a beginner perspective) this is not explained too clearly on section 11.8.1.2 of the help book. But thanks everyone for the help, you are all too kind.

Topic		Replies	Views
Converting arrays of matrices into a matrix General	1	470	July 31, 2021
Construct a row-wise kronecker product matrix Modeling	6	2669	May 31, 2017
How to use 'diag_matrix' function to creat a diagonal matrix using a certain row of another matrix? General rstan , specification	4	424	October 5, 2021
Kronecker product of matrices Modeling rstan , fitting-issues , specification	5	2018	November 8, 2023
Kronecker Product with Vector Modeling techniques	23	3158	July 22, 2019

Programming a Block Diagonal

Related topics