Problems fitting multivariate model with option set_rescor

Hello,
I am struggling to fit a trivariate (i.e. three response variables) mixed model, one of whose random/group effects includes a phylogeny. For reasons that I can explain if needed, I need to obtain the residual sd and correlations of the response variables. The problem is that whenever I set set_rescor(T), the model estimates the sd of the random/group effects to be equal to zero.
When I set set_rescor(F), the model fits very well but I do not obtain the residual correlations of the response variables that I need.

These are the three identical formulae for each response variable: 
formula.LeafN3 = bf(log(SpsLeaf.N.)~Presin.s+pH.H2O.s+ Clay.s+ Corg.s + (1|Morphology)+ (1|Site.Plot)+  (1+Presin.s|Spsabrv.)+(1+Presin.s|p|gr(Spsname1, cov=inv.phylomatrix)))
formula.LeafP3 = bf(log(SpsLeaf.P.)~Presin.s+pH.H2O.s+ Clay.s+ Corg.s + (1|Morphology)+ (1|Site.Plot)+ (1+Presin.s|Spsabrv.)+(1+Presin.s|p|gr(Spsname1, cov=inv.phylomatrix)))
formula.LeafN.P3 = bf(log(SpsLeafN.P)~Presin.s+pH.H2O.s+ Clay.s+ Corg.s + (1|Morphology)+ (1|Site.Plot)+ (1+Presin.s|Spsabrv.)+(1+Presin.s|p|gr(Spsname1, cov=inv.phylomatrix)))

And the model is:
multiv5= brm(formula.LeafN3+formula.LeafP3+formula.LeafN.P3  + set_rescor(T), data=DF,
             family=gaussian(), data2=list(inv.phylomatrix=inv.phylomatrix),
             chains=3,  warmup = 1e3, thin=20, iter=1e5, cores = 3, prior=prior.m1, 
             control = list(adapt_delta = 0.9,max_treedepth=15),
             threads = threading(2, grainsize = 100))
The priors I set were:
prior.m1=prior(normal(0,1), class = "b")+
         prior(normal(0,2), class = "Intercept")+
         prior(lkj(0.5), class="cor")

All explanatory variables were standardized to mean=0 and sd=1 prior to the analysis. The dataset has 207 rows.
Before posting this message, I have tried many simplifications of the above model by deleting many fixed/population effects, by reducing the group/random effects, by increasing the warmup, the number of iterations (up to 10⁶), and by changing a few other parameters involved in the HMC algorithm such as the adapt_delta and max_treedepth). I am also trying to fit this model on a cluster of the Faculty of Agronomy and, after 5 days, the results are seemingly identical to those I am describing. I’d then guess that it is not a matter of ā€œnumerical brute forceā€.
In all cases, I obtained the same frustrating result oof not being able to estimate either the sd of the group/random effects (when I set set_rescor(T)), or the residual correlations (when I set_rescor(F)).
I would appreciate any suggestions that you might have.
Cheers

Pablo

Please also provide the following information in addition to your question:

  • Operating System: Linux ubuntu 22.04
  • brms Version:2.22_0
  • R version 4.5.0
    My machine is a Pentium I7 with 32GB RAM and the processor is AMDĀ® Ryzen 5 2600 six-core processor Ɨ 12

Looking forward to your topic!

Do you have multiple observations per species?

You write cov=inv.phylomatrix, but brms expects the phylogenetic VCV, not its inverse.

If you are not getting warnings about the fitting it is probably an issue with the model specification rather than the estimation itself.

Hi,
Thanks for your response. You are correct in that it is wrong using the inverse of the variance-covariance matrix calculated from the phylogeny. I was misled by my reading of a few papers. I am now using the correlation matrix calculated from the phylogeny. In case it helps others, these models are excruciatingly slow to fit, generally requiring many hundreds of thousands of iterations, long warmups and large thin rate because the sampled values of the parameters have high autocorrelations. You’ll require a server or a cluster and lots of patience.

Thanks again.
Pablo

I was puzzled by this and looking around led me to this. If I understand correctly, high autocorrelations allow thinning rather than requiring it. Sorry if I misunderstood you.

1 Like

Models with group-level parameters other than an intercept (i.e. (1 + covariate | gr(species, cov = phylo)) are generally inefficient to fit in brms, as mentioned in the phylogenetics vignette. However, I’ve never needed hundreds of thousands of iterations for any stan model. I would not recommend thinning, perhaps unless you have memory constraints.

You may get better efficiency using MCMCglmm for this model, although I find that package much harder to use. Also note that MCMCglmm does use the inverse of the VCV.

I am now using the correlation matrix calculated from the phylogeny

Just to double-check, you need the variance-covariance matrix, not the correlation matrix, e.g. ape::vcv.phylo(your_tree, correlation = FALSE).