Naively, one would compute the matrix product and then takes it diagonal, but this is inefficient.

Is there a way, in C++ Stan or in Eigen, to only compute the diagonal elements of a matrix product?

Naively, one would compute the matrix product and then takes it diagonal, but this is inefficient.

Is there a way, in C++ Stan or in Eigen, to only compute the diagonal elements of a matrix product?

Just break up the calculation analytically â€“ any element of the output matrix is the dot product of two vectors â€“ a row from one of the input matrices and a column from the other,

A_{ii} = \sum_{j = 1}^{J} B_{ij} C_{ji},

or in Eigen,

```
B.col(j) * C.row(j)
```

2 Likes

Eigenâ€™s lazy evaluation takes care of this automatically, where `(B * C).diagonal()`

will compute the diagonal coefficients.

See this SO comment from one the developers: https://stackoverflow.com/a/37868577/4362612

1 Like

Thanks for the insight. Note that using the matrix operations in Stan doesnâ€™t do the lazy evaluation. Indeed:

`diagonal(multiply(A, B))`

is much slower than:

`(A * B).diagonal()`

where the latter uses only Eigenâ€™s functions.

Thatâ€™s right. Stanâ€™s matrix functions evaluate the entire matrix as theyâ€™re called, whereas Eigenâ€™s functions use expression templates to identify which parts of the matrix to evaluate

Expression templates are so cool when they work! It probably even figures out that you donâ€™t need to transpose `A`

in this case, whereas in `A * B`

, `A`

will get transposed (from column-major default to row major) to make the quadratic number of nested dot products memory local.

The bigger challenge is to extend this expression template functionality to Stan. The problem with using `(A * B).diagonal()`

as opposed to our own autodiff dot products as @betanalpha suggested is that Stan will just autodiff through Eigenâ€™s dot-product rather than using our more efficient lazy adjoint-Jacobian formulation (which cuts down on memory). So one piece gets lazier (which elements to evaluate) and one gets more eager and memory intensive (computing the Jacobian). But thereâ€™s no reason we couldnâ€™t add expression templates to Stan to get the same effect.

1 Like