Improve model on log-normal data with heterogenous variability

Marie · April 30, 2021, 12:26pm

Hi,

I’m working with data which are characterised by :

log-linear accross time : log(Y)=a+b*Day
Repeated measurements
Possibility of complete regression i.e. subjects with regressive tumor volume
Variability inter and intra groups

Here is a example of what the data could look like :

Here is the code :

nlprior <- c(prior(normal(0, 100),nlpar = "alpha"),
               prior(normal(0, 100),nlpar = "beta"),
               prior(gamma(0.01, 0.01), class=sd, nlpar = "alpha"),
               prior(gamma(0.01, 0.01), class=sd, nlpar = "beta"),
               prior(gamma(0.01, 0.01), class=sigma)
             )
  
  nlform <- bf(log_value ~ alpha+beta*(day),
                 alpha ~(1|gr(Animal_ID, by=Group_no))+Group_no,
                 beta ~ (1|gr(AnimaI_ID, by=Group_no))+Group_no,
                 nl = TRUE)

fit <- brm(formula = nlform,
               data = data,
               prior=nlprior,
               sample_prior = "yes",
               family = gaussian(),
               chains = 4,
               save_pars=save_pars(all=TRUE),
               cores= parallel::detectCores(),
               iter = 100000,
               thin = 5,
               warmup=25000,
               seed = 42,
               control = list(adapt_delta = 0.95, max_treedepth = 15)
    )

Here are my issues :

Is it possible to only put the gr(Animal_ID, by=Group_no) on beta and not alpha ? it’s not running if I only put on beta.
The code made several hours (even days) to compute, is it a way to improve that ?
I used hypothesis function to compare each group to the first one. Is it necessary to adjust for multiplicity just like in frequentist. If yes, how ? If no, why ?

I can’t provide the data but I hope it’s clear.
Thanks a lot

martinmodrak · May 6, 2021, 9:56am

Hi!
one thing that feels weird is that I think a model equivalent to yours could be written directly in the normal (linear) brms syntax. Something like:

log_value ~ Group_no * day + gr(1 + day, Animal_ID, by = Group_no, cor = FALSE)

should be equivalent or almost equivalent. Is there a reason for your use of the non-linear syntax? (e.g. if you plan to extend the model further and that would require non-linear terms).

I don’t see a reason why it should be a big problem in principle - what do you mean by “not running”? Do you get an error? Or is the model very slow? Or anything else?

Probably yes. You are using a lot of iterations - Stan generally needs much less iterations than say JAGS as it tends to explore more efficiently. You also have high adapt_delta and max_treedepth. If the model gives you warnings with lower number of iterations/adapt_delta/max_treedepth, it is a signal that there is something problematic with the model. Most likely you just don’t have enough data to learn anything useful about all of your parameters.

Additionally, we usually don’t recommend this type of gamma prior on the sd parameters as it tends to put too much prior weight on large sds. So unless you have a good reason to prefer the gamma prior, I would rather use something like normal(0, half_the_variability_you_roughly_expect).

Feel free to ask for further clarifications if anything is unclear.

Best of luck with your model!

Topic		Replies	Views
Non-linear hierarchical modelling (logarithmic) problem brms fitting-issues , brms	6	1273	May 7, 2021
Advice on a non linear regression model (brms) brms	2	632	February 28, 2019
Lognormal model predicts extremely high and unlikely values in brms Modeling fitting-issues , specification , brms	8	1419	September 21, 2022
Could you please check if I specify the model by brms correctly? brms	6	666	October 15, 2019
Parameter contrasts in Bayesian linear regression model using brms brms	11	2298	February 26, 2019

Improve model on log-normal data with heterogenous variability

Related topics