Cross-posted from GitHub.
I’m writing an R package where a user passes an R formula specifying the experimental design of an RNAseq experiment, and then this formula is used to construct a Stan model (using brms
) that is ultimately passed to rstan
.
An example of this would be:
~ cell_type + (1 | donor:cell_type)
This internally is modified to include an interaction allowing for multiple genes to be modelled at once, producing:
~ (cell_type:feat_idx) + (1 | (donor:cell_type):feat_idx)
This formula is then included in the actual brmsformula
I pass to brms
:
# Construct formula for brms
formula <- paste0(
"counts ~ intercept + design + log(sf) + log(",
n_batches,
")"
)
formula <- brms::bf(formula, nl = TRUE) +
brms::lf(design_formula, sparse = TRUE) +
brms::lf(paste0("sf ~ 0 + ", batch_name)) +
brms::lf(paste0("intercept ~ 0 + feat_idx"), sparse = TRUE)
For the current example, this final brmsformula
looks like:
counts ~ intercept + design + log(sf) + log(6)
design ~ 0 + (cell_type:feat_idx) + (1 | (donor:cell_type):feat_idx)
sf ~ 0 + batch
intercept ~ 0 + feat_idx
My issue however is that although I specify sparse = TRUE
, the resulting fixed/population effects model matrices are not stored as sparse model matrices:
model_data <- brms::standata(
formula,
data = model_data,
family = "negbinomial",
prior = priors,
threads = brms::threading(n_threads)
)
> model_data$X_design |> class()
[1] "matrix" "array"
> model_data$X_intercept |> class()
[1] "matrix" "array"
These matrices are extremely large and contain many 0s, so I really would like to store them as sparse matrices.
What am I doing wrong? How can I make the sparse = TRUE
argument work?
Supplementary Information
May or may not be helpful but I’ll include here:
- The priors are constructed like so:
# Construct priors
priors <- brms::prior_string(
paste0("dirichlet(rep_vector(1, ", n_batches, "))"),
nlpar = "sf"
)
# Include optional priors
if (any(!design$random)) {
# Induce sparsity in fixed effects
priors <- priors +
brms::prior(
"horseshoe(main = TRUE, scale_slab = 10000)",
class = "b",
nlpar = "design"
)
}
if (any(design$random)) {
# Induce sparsity in random effects
priors <- priors +
brms::prior(
"horseshoe(1, scale_slab = 10000)",
class = "sd",
nlpar = "design"
)
}
And for the current example produce the following:
> priors
prior class coef group resp dpar nlpar
dirichlet(rep_vector(1, 6)) b sf
horseshoe(main = TRUE, scale_slab = 10000) b design
horseshoe(1, scale_slab = 10000) sd design
lb ub source
<NA> <NA> user
<NA> <NA> user
<NA> <NA> user