Optimizing Stan Performance for Single-Cell RNA-seq Mixed Effects Model (10K features, 50K cells)

1. Yes, I often recommend pinning the group-level variance parameter or covariance matrix to a pre-chosen value based on subject-matter information. Often the inference isn’t super-sensitive to this group-level variance, as long as it’s not so small that it causes all the estimates to disappear to zero and not so large that the estimates are wildly noisy.
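To make the "not too small, not too large" point concrete, here is a minimal sketch in plain Python. It assumes a normal hierarchical model with known within-group standard errors, uses the classic eight-schools numbers purely as illustrative data (they are not from this thread), and plugs in the precision-weighted mean for the population mean rather than integrating over it. The function name `pooled_estimates` is made up for the example.

```python
# Illustrative data (an assumption, not from this thread): eight-schools estimates
# and their standard errors.
y = [28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0]
sigma = [15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0]

def pooled_estimates(tau):
    """Conditional posterior means of the group effects with tau pinned."""
    # Precision-weighted estimate of the population mean for this tau.
    w = [1.0 / (s * s + tau * tau) for s in sigma]
    mu_hat = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    est = []
    for yi, s in zip(y, sigma):
        # Fraction of each group's deviation from mu_hat that survives shrinkage.
        keep = tau * tau / (tau * tau + s * s)
        est.append(mu_hat + keep * (yi - mu_hat))
    return mu_hat, est

for tau in (0.01, 5.0, 1000.0):
    mu_hat, est = pooled_estimates(tau)
    print(f"tau={tau:>7}: mu_hat={mu_hat:5.2f}, "
          f"theta range=({min(est):6.2f}, {max(est):6.2f})")
```

With a tiny pinned value the group effects all collapse onto the common mean, with a huge one they just reproduce the noisy raw estimates, and a moderate value sits in between.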

I’ve toyed with the idea of making this a more formal procedure, for example drawing 10 values of the set of variance parameters from a prior, then using these to run 10 fast inferences (could be MCMC or even just plain old optimization and Laplace approx), then averaging over them using stacking. I think this could work, but I’ve never actually tried it, let alone evaluated the idea. It’s a research idea!
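Just to show the shape of that idea, here is a toy sketch, not an evaluation of it. Everything specific is an assumption made for illustration: the eight-schools data, a half-normal(0, 10) prior for the group-level standard deviation, exact leave-one-out predictive densities (available in closed form for this simple normal model, standing in for the "fast inference"), and a simple EM-style fixed-point iteration to maximize the stacking log score over the weights.

```python
import math
import random

# Illustrative data (an assumption, not from this thread): eight schools.
y = [28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0]
sigma = [15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0]
J = len(y)

def normal_logpdf(x, mean, var):
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def loo_log_density(j, tau):
    """log p(y_j | y_-j, tau) with a flat prior on the population mean."""
    idx = [k for k in range(J) if k != j]
    w = [1.0 / (sigma[k] ** 2 + tau ** 2) for k in idx]
    mu_hat = sum(wk * y[k] for wk, k in zip(w, idx)) / sum(w)
    var_mu = 1.0 / sum(w)
    return normal_logpdf(y[j], mu_hat, var_mu + tau ** 2 + sigma[j] ** 2)

# Step 1: draw 10 values of tau from a (hypothetical) half-normal(0, 10) prior.
random.seed(1)
taus = [abs(random.gauss(0.0, 10.0)) for _ in range(10)]
K = len(taus)

# Step 2: one "fast inference" per tau, summarized by its LOO predictive densities.
dens = [[math.exp(loo_log_density(j, t)) for t in taus] for j in range(J)]

def log_score(weights):
    return sum(math.log(sum(wk * p for wk, p in zip(weights, row))) for row in dens)

# Step 3: stacking weights maximize the LOO log score of the mixture.
weights = [1.0 / K] * K
for _ in range(500):
    new = [0.0] * K
    for row in dens:
        mix = sum(wk * p for wk, p in zip(weights, row))
        for k in range(K):
            new[k] += weights[k] * row[k] / mix / J
    weights = new

for t, wk in sorted(zip(taus, weights)):
    print(f"tau={t:6.2f}  weight={wk:.3f}")
```

In a real problem the closed-form step would be replaced by a quick MCMC or Laplace fit per draw, with an approximate leave-one-out estimate in place of the exact one.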

2. Sometimes we do use gamma priors for group-level variance parameters. The gamma prior with shape parameter greater than 1 has the pleasant property of being zero-avoiding, which is especially helpful when doing marginal maximum likelihood, as we discuss in our 2013 paper: https://sites.stat.columbia.edu/gelman/research/published/chung_etal_Pmetrika2013.pdf The same idea works for covariance matrices, using the Wishart (_not_ inverse-Wishart) prior, as in our 2014 paper: https://sites.stat.columbia.edu/gelman/research/published/chung_cov_matrices.pdf
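Here is a small sketch of what zero-avoiding buys you, in plain Python with a grid search. The data (eight schools again), the gamma(2, 0.1) prior on the group-level standard deviation, and the grid itself are all illustrative assumptions, not values from this thread or a recommendation for the scRNA-seq model.

```python
import math

# Illustrative data (an assumption, not from this thread): eight schools.
y = [28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0]
sigma = [15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0]

def profile_loglik(tau):
    """Marginal log likelihood of tau (group effects integrated out),
    maximized over the population mean; additive constants dropped."""
    v = [s * s + tau * tau for s in sigma]
    mu_hat = sum(yi / vi for yi, vi in zip(y, v)) / sum(1.0 / vi for vi in v)
    return -0.5 * sum(math.log(vi) + (yi - mu_hat) ** 2 / vi
                      for yi, vi in zip(y, v))

def log_gamma_prior(tau, shape=2.0, rate=0.1):
    """Unnormalized log density of a gamma(shape, rate) prior on tau."""
    if tau <= 0.0:
        return -math.inf  # density is zero at tau = 0 when shape > 1
    return (shape - 1.0) * math.log(tau) - rate * tau

grid = [i * 0.01 for i in range(3001)]  # tau from 0 to 30

tau_ml = max(grid, key=profile_loglik)
tau_pen = max(grid, key=lambda t: profile_loglik(t) + log_gamma_prior(t))

print(f"marginal ML estimate of tau:    {tau_ml:.2f}")
print(f"with gamma(2, 0.1) prior (mode): {tau_pen:.2f}")
```

For these data the plain marginal maximum likelihood estimate sits on the boundary at zero, which would force complete pooling, while the log(tau) term from the gamma prior pushes the mode away from zero.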

3. Another thing that’s worked well for me is to use Pathfinder to get starting values. It varies, but sometimes Pathfinder runs very fast and then we can jointly estimate all the parameters and not worry so much about the funnel.
