How do you propose to code the phases in the Stan language?
We are. But as Dan responded, this is a foundational assumption in the way Stan’s built that percolates through all the interfaces. Data is persistent and immutable across sampling; anything parameter-related is volatile. Insofar as we can code up things like ordering and whatnot as data, it can persist.
If we can’t, then we need some new thinking on the language side.
For GPUs and MPI, the plan is to specify data-only values that are thus guaranteed to persist and can be pinned to CPUs and GPUs once and for all. The sparse matrix issues may be similar; it may just be clunky.
Yes, but it comes at a cost, and in my estimation so far, it’d be even more of a pity to break one of the fundamental encapsulation and immutability invariants on our model class.
Assignment would only work with matching sparsity. Function arguments, being temporaries, would be an exception—they wouldn’t need to be declared for sparsity.
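A minimal sketch of that matching-sparsity rule, in Python rather than Stan (the representation here — dimensions plus a set of non-empty index pairs — is purely illustrative, not how Stan would store it): two sparse values are assignment-compatible only if they have the same shape and the same set of non-empty positions.

```python
def same_sparsity(a, b):
    """Hypothetical compatibility check for sparse assignment.

    a, b: dicts with dimensions 'M', 'N' and 'nz', a set of
    (row, col) pairs marking the non-empty entries.
    """
    return a['M'] == b['M'] and a['N'] == b['N'] and a['nz'] == b['nz']

A = {'M': 2, 'N': 2, 'nz': {(1, 1), (2, 2)}}
B = {'M': 2, 'N': 2, 'nz': {(1, 1), (2, 2)}}
C = {'M': 2, 'N': 2, 'nz': {(1, 2)}}

same_sparsity(A, B)  # OK: same pattern, assignment would be allowed
same_sparsity(A, C)  # rejected: different pattern
```

Function arguments would skip this check entirely, per the temporaries exception above.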
I hope not, too. But it’s not going to be clean and elegant. Sparse matrices never are.
Funnily enough, that’s what people have said about Stan from the get-go because of its variable declaration and block restrictiveness and verbosity. Ditto the ODE solvers, which are also super clunky in terms of the way arguments are specified.
In practice, people will use things if the gain outweighs the pain. Things like the ODE solvers, GPs, and sparse matrices are going to set a higher bar in terms of programming than simple GLMs.
We could. It’d look like this:
int K; // number of non-empty entries
sparse_matrix<M, N, is, js> A = vals;
I don’t like this, though—we don’t allow implicit reshaping for other types, such as assigning a vector to a matrix. I think it gets confusing as to where it would apply and how the reshaping works. I’d be happier keeping the types simpler. This more verbose form is OK:
sparse_matrix<M, N, is, js> A = to_sparse_matrix(M, N, is, js, vals);
where the right-hand side to_sparse_matrix is a sparse-matrix building function. That would probably entail an extra copy, which is unfortunate.
Definitely not as efficient, as you need to calculate all the CRS coefficients to store internally. That’s not so onerous, though, as everything in the CRS structure (presumably what they’re using internally) can be sized ahead of time and computed on the fly. Transposition from CRS is painful, though, so again, it’s just a matter of which operations run at which speed.
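To make the sizing point concrete, here’s a sketch (in Python, not Stan’s actual implementation) of converting 1-based triplet data — the (is, js, vals) form from the builder above — into CRS arrays. Note that every array size is known from M and the number of non-empty entries before any values are touched, which is why the conversion can be allocated once and filled on the fly:

```python
def to_crs(M, row_is, col_js, vals):
    """Convert 1-based triplet (row, col, value) data to CRS arrays.

    Returns (row_ptr, col_idx, val):
      row_ptr: M+1 offsets; row r's entries occupy
               col_idx[row_ptr[r-1]:row_ptr[r]] (1-based r)
      col_idx: column index of each stored value
      val:     the values, in row-major order
    """
    K = len(vals)  # number of non-empty entries, known up front

    # Sort entries into row-major order (CRS storage order).
    order = sorted(range(K), key=lambda k: (row_is[k], col_js[k]))
    col_idx = [col_js[k] for k in order]
    val = [vals[k] for k in order]

    # Count entries per row, then prefix-sum into offsets.
    row_ptr = [0] * (M + 1)
    for i in row_is:
        row_ptr[i] += 1
    for r in range(M):
        row_ptr[r + 1] += row_ptr[r]
    return row_ptr, col_idx, val

# A 3x4 matrix with 4 non-empty entries:
# row 1 holds two values, rows 2 and 3 one each.
crs = to_crs(3, [1, 1, 2, 3], [1, 3, 2, 4], [10.0, 20.0, 30.0, 40.0])
```

Transposing is where this hurts: CRS stores values row-contiguously, so producing the transpose means re-bucketing every entry by column — effectively rebuilding the whole structure.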