# Sparse model matrix functionality without the sparse matrix

While we all wait for tuples and sparse matrix support, I finally had a chance to think about non-language solutions to the problem of sparse model matrices and it turns out Stan has a pretty good one. Everything is wrapped up nicely here: https://github.com/sakrejda/hierarchy/

The motivating perspective is nicely described in a book by James Hodges, which presents an awful lot of models as (mostly sparse) model matrices. Yay. In Stan you end up using these things as components for describing GLMs quite a bit, so it would be nice to have first-class sparse matrix support, but that's … ongoing…

You can hack sparse matrices with ragged arrays… of course those are not in Stan either, so you hack those too. There's a demo in a little package you can install with `devtools::install_github("sakrejda/hierarchy")` and then run the example script:

```
script_file = system.file('quick-check/check-02.R', package = 'hierarchy')
source(script_file)
```

The upshot is that with a 10,000 x 200 matrix that's about 2% non-zero you can easily write a linear model that's twice as fast (and uses less memory) if you take

```
X * betas
```

and replace it with

```
for (i in 1:N)
  mu[i] = sum(X_vec[starts[i]:stops[i]] .* betas[NZE[starts[i]:stops[i]]]);
```

where `X_vec` holds the non-zero entries of `X`, `NZE` holds the column indices of those non-zero entries, and `i` is the row index. `starts[i]`/`stops[i]` give where the `i`th row starts and stops in the dense vectors `X_vec` and `NZE`. The data stored this way are similar to the CSR sparse matrix format.
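As a sanity check on the bookkeeping, here's a small Python sketch (my own illustration, not from the package; names mirror the Stan variables above, but with 0-based indices and an exclusive `stops`, unlike Stan's 1-based inclusive slices) that builds `X_vec`, `NZE`, `starts`, and `stops` from a dense matrix and verifies the row-wise dot products against a dense multiply:

```python
# Sketch of the CSR-like layout described above (plain Python, 0-based
# indices; stops[i] is one past the last entry of row i).
X = [
    [0.0, 2.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 3.0, 4.0],
]
betas = [0.5, 1.0, 2.0]

X_vec, NZE, starts, stops = [], [], [], []
for row in X:
    starts.append(len(X_vec))          # where this row begins in X_vec
    for j, x in enumerate(row):
        if x != 0.0:
            X_vec.append(x)            # non-zero entries, row-major
            NZE.append(j)              # their column indices
    stops.append(len(X_vec))           # one past where this row ends

# Row-wise dot products using only the non-zero entries.
mu = [
    sum(X_vec[k] * betas[NZE[k]] for k in range(starts[i], stops[i]))
    for i in range(len(X))
]

# Dense reference for comparison.
mu_dense = [sum(x * b for x, b in zip(row, betas)) for row in X]
assert mu == mu_dense
```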

In both cases here the `X`/`X_vec` can be data, a parameter, or a transformed parameter. That means it can be used to construct measurement-error models and other models where (some) entries of the `X` matrix need to be parameters.
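For reference, a minimal Stan sketch of the loop in context, with `X_vec` as data and a Gaussian likelihood (the declarations and model block are my guesses at a matching layout, not copied from the package):

```stan
data {
  int<lower=1> N;                           // rows of X
  int<lower=1> P;                           // columns of X
  int<lower=0> N_NZE;                       // total non-zero entries
  vector[N_NZE] X_vec;                      // non-zero entries, row-major
  array[N_NZE] int<lower=1, upper=P> NZE;   // their column indices
  array[N] int<lower=1> starts;             // first index of row i in X_vec
  array[N] int<lower=1> stops;              // last index of row i in X_vec
  vector[N] y;
}
parameters {
  vector[P] betas;
  real<lower=0> sigma;
}
model {
  vector[N] mu;
  for (i in 1:N)
    mu[i] = sum(X_vec[starts[i]:stops[i]] .* betas[NZE[starts[i]:stops[i]]]);
  y ~ normal(mu, sigma);
}
```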

The package I reference above is where I'm putting tools like this, but the function used here is quite short:

```
#' Create a sparse list model matrix
#'
#' @param formula formula to use
#' @param data where to look up terms
#' @return list with Stan-friendly components
#' @export
flat_mm = function(formula, data = NULL, ...) {
  mm = Matrix::sparse.model.matrix(formula, data, ...)
  # Column indices of the non-zero entries, one vector per row.
  nze = apply(mm, 1, function(x) which(x != 0))
  N = length(nze)
  # Where each row's entries stop (and start) in the flattened vectors.
  stops = cumsum(sapply(nze, length))
  starts = c(1, stops[1:(N - 1)] + 1)
  nze = unlist(nze)
  # The non-zero entries themselves, flattened row-major.
  X_vec = apply(mm, 1, function(x) x[x != 0])
  mml = list(
    N = N,
    P = ncol(mm),
    N_NZE = length(nze),
    NZE = nze,
    starts = starts, stops = stops,
    X_vec = unlist(X_vec)
  )
  # Attach the response variable under its own name.
  t = terms(formula)
  pos = attr(t, 'response')
  dn = all.vars(t)[pos]
  mml[[dn]] = data[[dn]]
  return(mml)
}
```

The version of sparse matrix multiplication we have implemented (in C++, sigh) is slower than this Stan-language version (good job on that language design @Bob_Carpenter et al.…!). Oops that I didn't see this when I first coded the C++ version… it might make sense to replace that…


Hi! Probably I am missing something obvious, but how does this compare/relate to the sparse matrix stuff that is already in Stan? Specifically, `csr_matrix_times_vector` in Stan and `rstan::extract_sparse_parts()` in R?

It's faster… It's also nearly equivalent, minus a copy and maybe less thorough consistency checking.
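For anyone comparing the two: schematically, the built-in call takes almost the same inputs, except the per-row bookkeeping is a single row-pointer array `u` of length `N + 1` (which is what `rstan::extract_sparse_parts()` produces, along with the values `w` and column indices `v`) instead of separate `starts`/`stops`:

```stan
// Built-in CSR multiply: w = non-zero values, v = their column
// indices, u = row-start indices into w, with u[N + 1] = N_NZE + 1.
vector[N] mu = csr_matrix_times_vector(N, P, w, v, u, betas);
```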

One lesson I keep running into in sparse matrix operations/algorithms is that there's probably no single best format/solution that's generally applicable, although the kludgy Stan functions we do currently have are very fixable (and probably worth a PR…).