Searching’s not always so easy, especially in PDFs (I need to move the manual to HTML, but it’s huge). You won’t find any mention of the R-specific notation `NA` in the manual. At least I don’t recall putting it in there.

Also missing from the manual is Ben’s really nice approach. Conceptually, you can decompose the full data matrix into a missing part and an observed part:

```
X = X_miss + X_obs;
```

where the observed data matrix `X_obs` is sparse and has zeroes where data is missing; `X_miss` has parameters where data is missing and zeroes where it’s observed.
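To make the decomposition concrete, here is a minimal sketch in plain Python rather than Stan. The toy data, the use of `None` as the missing-value marker, and the helper name `decompose` are all assumptions for illustration; in a real model the filled-in values would be parameters, not fixed numbers.

```python
# Hypothetical toy data: None marks a missing entry (standing in for R's NA).
X_raw = [[1.0, None, 3.0],
         [None, 5.0, 6.0]]

# Hypothetical values for the missing entries, in row-major order.
# In an actual model these would be parameters to estimate.
x_miss_params = [2.0, 4.0]


def decompose(X_raw, params):
    """Split X_raw into (X_miss, X_obs) such that X = X_miss + X_obs."""
    params = iter(params)
    # Observed part: the data where present, zero where missing.
    X_obs = [[0.0 if x is None else x for x in row] for row in X_raw]
    # Missing part: a parameter where missing, zero where observed.
    X_miss = [[next(params) if x is None else 0.0 for x in row]
              for row in X_raw]
    return X_miss, X_obs


X_miss, X_obs = decompose(X_raw, x_miss_params)

# Adding the two parts elementwise recovers the full matrix.
X = [[m + o for m, o in zip(row_m, row_o)]
     for row_m, row_o in zip(X_miss, X_obs)]
```

Each cell is nonzero in at most one of the two matrices, which is why the elementwise sum reproduces the completed matrix.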

Rather than actually adding them and computing `X * beta`, you can instead use

```
X_miss * beta + X_obs * beta
```

using the sparse matrix multiplication function `csr_matrix_times_vector` for the multiplies and plain old addition for the addition.
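As a rough check of the arithmetic, here is a sketch in plain Python rather than Stan. The helpers `to_csr` and `csr_times_vector` are hypothetical stand-ins written for this example; the second is loosely modeled on the argument order of `csr_matrix_times_vector` but uses 0-based indices, and the numbers are made up.

```python
def to_csr(dense):
    """Convert a dense row-major matrix to CSR arrays (w, v, u), 0-based."""
    w, v, u = [], [], [0]
    for row in dense:
        for j, x in enumerate(row):
            if x != 0.0:
                w.append(x)   # nonzero value
                v.append(j)   # its column index
        u.append(len(w))      # offset where the next row starts in w
    return w, v, u


def csr_times_vector(m, n, w, v, u, b):
    """Multiply an m x n CSR matrix by a length-n vector b."""
    assert len(b) == n and len(u) == m + 1
    return [sum(w[k] * b[v[k]] for k in range(u[i], u[i + 1]))
            for i in range(m)]


# Made-up example: zeroes in X_obs mark missing cells,
# and the nonzeroes in X_miss stand in for parameter values.
X_obs = [[1.0, 0.0, 3.0],
         [0.0, 5.0, 6.0]]
X_miss = [[0.0, 2.0, 0.0],
          [4.0, 0.0, 0.0]]
beta = [1.0, 2.0, 3.0]

obs_part = csr_times_vector(2, 3, *to_csr(X_obs), beta)
miss_part = csr_times_vector(2, 3, *to_csr(X_miss), beta)

# Plain old addition of the two products.
eta = [a + b for a, b in zip(miss_part, obs_part)]

# Dense reference: (X_miss + X_obs) * beta.
dense = [sum((xm + xo) * bj for xm, xo, bj in zip(row_m, row_o, beta))
         for row_m, row_o in zip(X_miss, X_obs)]
```

Here `eta` and `dense` agree, since matrix multiplication distributes over the sum. Note that `to_csr` drops exact zeroes, so in a real model you would fix the sparsity pattern of `X_miss` from the missingness indicators rather than from the current parameter values.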