Hello,
I am interested in fitting growth models to some data with a variety of treatments/groups in the data. Right now my project has 15 groups (genotypes) but I am hoping to get some advice on general scalability. In the example below the smallData.fit
model will run and give good results but the completeData.fit
model will fail to initialize.
My initial values are confined to be positive by rgamma
since I am using lognormal priors, I have also tried inits=0
and setting inits as a list of lists containing values for a, b, and c for each chain (idea taken from here), but the model has consistently failed to initialize when I have more than 2 groups.
Is there a set of guidelines on how many groups a brms model should be able to handle? Has anyone had a similar situation and found a solution besides running separate models on subsets of the data (that option does work but it seems like a poor workaround)?
library(brms)
library(tidyverse)
## Setup simulated data
growthSim <- function(x,a,b,c){
a_r <- a+rnorm(1,mean = 0,sd=10)
b_r <- b+rnorm(1,mean=0,sd=2)
c_r <- c+rnorm(1,mean=0,sd=.035)
return(a_r*exp(-b_r*exp(-c_r*x)))}
x <- 1:25
smallDf <- rbind(do.call(rbind,lapply(letters[1:2], function(L) do.call(rbind, lapply(1:20,function(i) data.frame("sample"=paste0("sample_",i),"treatment"=L,"time"=x,"y"=growthSim(x,200-(5*which(letters==L)),13-(.2*which(letters==L)),.2+(0.001*which(letters==L))),stringsAsFactors = F))))))
completeDf <- rbind(do.call(rbind,lapply(letters[1:15], function(L) do.call(rbind, lapply(1:20,function(i) data.frame("sample"=paste0("sample_",i),"treatment"=L,"time"=x,"y"=growthSim(x,200-(5*which(letters==L)),13-(.2*which(letters==L)),.2+(0.001*which(letters==L))),stringsAsFactors = F))))))
completeDf%>%
ggplot(aes(time,y,group=interaction(treatment,sample)))+
geom_line(aes(color=treatment), show.legend=F)
## Define Priors
prior1 <- prior(lognormal(log(130), .25),nlpar = "a") +
prior(lognormal(log(12), .25), nlpar = "b") +
prior(lognormal(log(1.2), .25), nlpar = "c") +
prior(student_t(3,0,5), dpar="sigma") +
prior(gamma(2,0.1), class="nu")
## Run models
smallData.fit <- brm(bf(y ~ a*exp(-b*exp(-c*time)),
sigma~s(time,by=treatment),
a + b + c ~ 0+treatment,
autocor = ~arma(~time|sample:treatment,1,1),nl = TRUE),
family = student, prior = prior1, data = smallDf, iter = 2000,
cores = 4, chains = 4, backend = "cmdstanr",
control = list(adapt_delta = 0.999,max_treedepth = 20),
inits = function(){list(b_a=rgamma(2,1),b_b=rgamma(2,1),b_c=rgamma(2,1))})
completeData.fit <- brm(bf(y ~ a*exp(-b*exp(-c*time)),
sigma~s(time,by=treatment),
a + b + c ~ 0+treatment,
autocor = ~arma(~time|sample:treatment,1,1),nl = TRUE),
family = student, prior = prior1, data = completeDf, iter = 2000,
cores = 4, chains = 4, backend = "cmdstanr",
control = list(adapt_delta = 0.999,max_treedepth = 20),
inits = function(){list(b_a=rgamma(2,1),b_b=rgamma(2,1),b_c=rgamma(2,1))})
Error messages from trying to make completeData.fit
for all chains are:
Chain 4 Rejecting initial value:
Chain 4 Log probability evaluates to log(0), i.e. negative infinity.
Chain 4 Stan can't start sampling from this initial value.
Chain 4 Initialization between (-2, 2) failed after 100 attempts.
Chain 4 Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.
Warning: Chain 4 finished unexpectedly!
Also this is my first post so if I should provide more information or I have tagged this topic incorrectly please let me know! Thank you!