Advice on speeding up a non-linear multi-level model in brms?


We are trying to estimate the additive genetic variance of growth curve parameters in a species with continuous growth. To do so, we use a relatedness (variance-covariance) matrix between individuals and a logistic growth model. The brms model is the following:

prior <- prior(normal(60, 10), nlpar = 'lmax') +
         prior(normal(0, 10), nlpar = 'lmax', coef = 'Sex1') +
         prior(normal(20, 2), nlpar = 'l0') +
         prior(normal(0, 2), nlpar = 'l0', coef = 'Sex1') +
         prior(uniform(0, 10), nlpar = 'k') +
         prior(normal(0, 5), nlpar = 'k', coef = 'Sex1')

model_log <- brm(bf(LMA ~ lmax  / (1 + ((lmax - l0) / l0) * exp(- k * Age)),
                    lmax + k ~ Sex + 
                      (1 | Annee_nais) + 
                      (1 | gr(ID_additif, cov = Amatrix)) + 
                      (1 | ID_pe) +
                      (1 | Mother),
                    l0 ~ Sex + (1 | Annee_nais) + (1 | Mother),
                    nl = TRUE),
                 chains = 10,
                 cores = 10,
                 iter = 2000,
                 warmup = 500,
                 prior = prior, 
                 init = rep(list(list(b_lmax    = array(data = c(68, 0)), 
                                      b_k       = array(data = c(1, 0)),
                                      b_l0      = array(data = c(21, 0)))), 10),
                 data = donnees_mod,
                 data2 = list(Amatrix = Amatrix))


  • Amatrix is the relatedness matrix between individuals (a plain matrix in R, not a sparse matrix)
  • ID_additif and ID_pe contain the ID for each individual
  • The logistic growth is parametrised by a starting size (l_0), an asymptotic size (l_max) and a growth rate (k).
  • The dataset is composed of 11347 data points on 5671 individuals
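
For readers who want to check the parametrisation, the curve in the model formula can be written as a plain R function. This is only a sketch: the function name logistic_growth is illustrative, and the example values are borrowed from the init list in the brm() call, not from any fitted result.

```r
# Logistic growth curve, written exactly as in the brms formula:
#   LMA = lmax / (1 + ((lmax - l0) / l0) * exp(-k * Age))
logistic_growth <- function(age, lmax, l0, k) {
  lmax / (1 + ((lmax - l0) / l0) * exp(-k * age))
}

# At age 0 the curve equals the starting size l0 ...
size_at_birth <- logistic_growth(0, lmax = 68, l0 = 21, k = 1)

# ... and for large ages it approaches the asymptotic size lmax.
size_at_old_age <- logistic_growth(50, lmax = 68, l0 = 21, k = 1)
```

Evaluating the curve at age 0 and at a large age is a quick way to confirm that l0 and lmax play the roles described in the list above.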

I am aware that this model is extremely complex and is applied to a very large dataset. Still, as of now, a complete run takes a bit more than 2 weeks.

Thus, I would be happy to hear any advice on how to make this model run faster. For example, I thought about providing “Amatrix” as a sparse matrix, but the brms help states that this could reduce memory use but would not improve speed (and could even make it worse).

Any other ideas on how to make this a bit of a speedier ride? Happy to run it through pure Stan if modifying things from brms would help (although I doubt it, as brms tends to produce much better Stan code than I would, from experience).

Are you able to use within-chain parallelization with your model and computing environment? You could probably run only 4 chains, not 10, and use the rest of your processing power for that.
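
A minimal sketch of what this could look like, assuming the cmdstanr backend is installed and that brms supports threading for this particular model (not tested here). The wrapper name fit_threaded and the split of 10 cores into 4 chains with 2 threads each are illustrative choices; the formula, priors and data are meant to be the ones from the question.

```r
n_cores  <- 10   # cores available, as in the original brm() call
n_chains <- 4    # fewer chains, as suggested above

# Threads each chain can use without exceeding the core budget
# (integer division: 10 %/% 4 is 2, so 8 of the 10 cores are used).
threads_per_chain <- n_cores %/% n_chains

# Hypothetical wrapper around brms::brm(); it is only defined here, not run.
# threading() and backend = "cmdstanr" are existing brms options.
fit_threaded <- function(formula, data, data2, prior, ...) {
  brms::brm(formula,
            data = data, data2 = data2, prior = prior,
            chains = n_chains, cores = n_chains,
            threads = brms::threading(threads_per_chain),
            backend = "cmdstanr",
            ...)
}
```

Note that with iter = 2000 and warmup = 500, going from 10 chains to 4 reduces the total number of post-warmup draws from 15000 to 6000, so iter may need to be raised if the same number of draws is wanted.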


Thanks, I’ll run some tests to see whether within-chain parallelisation gives a gain (in terms of effective sample size per unit of time).
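
One way to make that comparison concrete is to divide the smallest effective sample size across parameters by the wall-clock time of the fit. The helper below is a sketch in base R; ess_per_second is a hypothetical name, and the numbers in the example are made up purely to show the arithmetic.

```r
# Sampling efficiency: worst-case effective sample size per second of wall time.
# `ess` is a numeric vector of ESS values (one per parameter),
# `elapsed_seconds` is the wall-clock duration of the fit.
ess_per_second <- function(ess, elapsed_seconds) {
  stopifnot(is.numeric(ess), length(ess) > 0, elapsed_seconds > 0)
  min(ess, na.rm = TRUE) / elapsed_seconds
}

# Made-up values for two hypothetical configurations, for illustration only.
eff_a <- ess_per_second(ess = c(1200, 950, 3000), elapsed_seconds = 3600)
eff_b <- ess_per_second(ess = c(800, 700, 2000), elapsed_seconds = 1800)
eff_b > eff_a   # TRUE means configuration b yields more ESS per second
```

For a brms fit, the ESS values could be taken from posterior::summarise_draws(as_draws_df(fit), "ess_bulk") and the elapsed time from wrapping the brm() call in system.time(); both are suggestions rather than the only way to measure this.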