Should sparse matrices in Stan be row major or column major?
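
For reference, both layouts are a template flag away in Eigen, which Stan Math builds on; Eigen’s SparseMatrix defaults to compressed sparse column (CSC). A minimal sketch:

```cpp
#include <Eigen/Sparse>

int main() {
  // Eigen's default sparse storage is compressed sparse column (CSC).
  Eigen::SparseMatrix<double> col_major(4, 4);
  // Row-major storage (CSR) is selected via a template option.
  Eigen::SparseMatrix<double, Eigen::RowMajor> row_major(4, 4);

  // The same logical entry, laid out differently under the hood.
  col_major.insert(1, 2) = 3.0;
  row_major.insert(1, 2) = 3.0;
  return 0;
}
```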

I don’t think the user should know anything about the storage format (beyond there being a few different constructors), so it should be sparse_matrix, not csr_matrix.
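
For what it’s worth, here’s a hypothetical sketch of what I mean (the class and constructor names are made up, not actual Stan Math API): the storage format stays an internal detail behind a few constructors.

```cpp
#include <Eigen/Sparse>
#include <vector>

// Hypothetical: a user-facing sparse_matrix whose storage format is an
// internal detail. Only the constructors reveal anything about layout.
class sparse_matrix {
  Eigen::SparseMatrix<double> m_;  // could be CSC or CSR; users never see it
 public:
  // construct from (row, col, value) triplets
  sparse_matrix(int rows, int cols,
                const std::vector<Eigen::Triplet<double>>& entries)
      : m_(rows, cols) {
    m_.setFromTriplets(entries.begin(), entries.end());
  }
  // other constructors (e.g. from CSR-style arrays) would go here
};
```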

I’ll just leave it at: I don’t agree. Maybe I’ll change my mind (this could be a Céline song, but I Googled it, and alas, it’s not), but I’m not convinced of the advantages. There’s still the issue of building the matrices, since random access for a generic sparse_matrix class is probably complicated unless I’m missing some fancy trick.
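
For construction specifically, Eigen’s standard pattern does sidestep random insertion into the compressed format: accumulate (row, col, value) triplets and compress once at the end. (That helps with building, though not with general random access afterwards.) A minimal sketch:

```cpp
#include <Eigen/Sparse>
#include <vector>

int main() {
  // Random insertion into a compressed format can shift everything after
  // the insertion point, so the recommended pattern is to accumulate
  // triplets and build the compressed structure in one pass.
  std::vector<Eigen::Triplet<double>> entries;
  entries.emplace_back(0, 0, 1.5);  // (row, col, value)
  entries.emplace_back(2, 1, -0.5);
  entries.emplace_back(1, 2, 2.0);

  Eigen::SparseMatrix<double> m(3, 3);
  m.setFromTriplets(entries.begin(), entries.end());  // compressed afterwards
  return 0;
}
```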

I’d argue the latter is necessary because we need to know at compile time whether someone can run a Cholesky factorization on it (to work out the number of vars that would be needed).

Oh, I see. If we’re actually talking about a matrix of random variables defined in the parameters block, one of them would have many fewer parameters than the other.
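
For a concrete count (my arithmetic, assuming Stan’s usual Cholesky-based unconstrained parameterization of a covariance matrix):

$$
\dim\left(\texttt{matrix}[n,n]\right) = n^2
\qquad\text{vs.}\qquad
\dim\left(\texttt{cov\_matrix}[n]\right) = \frac{n(n+1)}{2}
$$

so for n = 10, that’s 100 free parameters versus 55.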

The number of vars is a runtime thing, though. The autodiff graph is built dynamically, so I don’t think this factors in here.
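
A minimal sketch of that point with stan::math’s reverse-mode var (simplified; memory management and error handling mostly omitted): the tape, and hence the var count, grows with runtime values, not with declared types.

```cpp
#include <stan/math.hpp>

int main() {
  using stan::math::var;
  int n = 5;  // known only at runtime
  var total = 0;
  for (int i = 1; i <= n; ++i) {
    var x = i;
    total += x * x;  // each operation pushes nodes onto the autodiff tape
  }
  total.grad();  // the tape size was decided at runtime, not compile time
  stan::math::recover_memory();
  return 0;
}
```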

From the Stan manual:

For instance, a variable declared to be real<lower=0,upper=1> could be assigned to a variable declared as real and vice-versa. Similarly, a variable declared as matrix[3, 3] may be assigned to a variable declared as cov_matrix[3] or cholesky_factor_cov[3], and vice-versa. Checks are carried out at the end of each relevant block of statements to ensure constraints are enforced.

I’m only on board with checking simple constraints (like symmetry); I’m against the positive-definiteness checks. This is why I’d be against those custom types: as Stan currently does things, it runs those checks (I think), and I don’t like that in my models.

I could go for fancy banded_matrix/symmetric_matrix types if it meant avoiding functions like multi_normal_banded, multi_normal_symmetric, or banded_matrix_times_vector. But those checks!

Sparse-sparse multiplication is anathema to the Stan paradigm (because it would change the number of vars needed as the sparsity pattern changes).
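
A quick Eigen sketch (C++, not Stan) of why: the product’s sparsity pattern, and hence its nonzero count, depends on where the operands’ nonzeros happen to fall, which in Stan would only be known at runtime.

```cpp
#include <Eigen/Sparse>
#include <iostream>
#include <vector>

int main() {
  // Two 3x3 sparse matrices with 2 nonzeros each.
  std::vector<Eigen::Triplet<double>> ta{{0, 0, 1.0}, {0, 2, 1.0}};
  std::vector<Eigen::Triplet<double>> tb{{0, 0, 1.0}, {2, 1, 1.0}};
  Eigen::SparseMatrix<double> A(3, 3), B(3, 3);
  A.setFromTriplets(ta.begin(), ta.end());
  B.setFromTriplets(tb.begin(), tb.end());

  // The pattern (and count) of nonzeros in the product is data-dependent,
  // so the number of vars a sparse-sparse product would need cannot be
  // fixed ahead of time.
  Eigen::SparseMatrix<double> C = A * B;
  std::cout << "nnz(A) = " << A.nonZeros()
            << ", nnz(B) = " << B.nonZeros()
            << ", nnz(A*B) = " << C.nonZeros() << "\n";
  return 0;
}
```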

Fair enough. There’s probably no real need for sparse matrix-matrix multiplication. (Mildly related, here’s an interesting model: Nystrom approximation slows down Gaussian process regression? - #11 by Bob_Carpenter)