Brms multilevel nonlinear model - model fails to converge when converted from individual level to multilevel


#1

I am defining a hierarchical nonlinear model with a custom function using brms. It seems to do a fine job of fitting the model on an individual level, but fails dismally as soon as I take it to a multilevel fit across individuals. I am wondering whether it could be related to my specifying the brms syntax incorrectly, or what I should be trying to address the issue.

(Hopefully) relevant details:

  • The model itself is a pharmacokinetic compartmental model, which is defined by convolving two functions analytically, so the likelihood function is 24 lines long. I defined it as a function to clear up the code, but I have been concerned that something is going wrong with STAN’s engine here.
  • The data consists of about 20 measurements per individual defining a pharmacokinetic curve which increases and decreases
  • There are model weights which are defined relative to the other measurements within each individual (group). So I would like to weight within groupings, and have sigma vary by group.
  • I could fit all of the individual measurements using an nls, and they all look fine.
  • I could also fit everyone at once using a nlme, and the estimates look excellent, so it doesn’t feel like there’s anything wrong with the group or the individuals.
  • I fit the first individual using brms (i.e. not hierarchical) using the same priors as I placed over the multilevel parameters, and it got the same outcomes as the nlme for that particular individual, so the STAN model seems to be working ok.

I’ll show how I specified the models in nlme, brms (single individual) and brms (multilevel) and hopefully that might give some clues.

nlme syntax for the hierarchical MLE model

fit_nlme = nlme(midcsc ~ two_compartment(k1, k2, k3, k4, vb, 
                                                   MidTime, b_pfc, 
                                                   lambda1_pfc, lambda2_pfc, lambda3_pfc, 
                                                   A1_pfc, A2_pfc, A3_pfc, tstar_pfc, 
                                                   b_tot, 
                                                   lambda1_tot, lambda2_tot, lambda3_tot, 
                                                   A1_tot, A2_tot, A3_tot, tstar_tot), 
                data = modeldat, 
                fixed = k1 + k2 + k3 + k4 + vb ~ 1, 
                random = k1 + k2 + k3 + k4 + vb ~ 1, 
                groups = ~ ID, 
                weights = varIdent(form = ~ invweights |  ID),
                start = coef(fit_nls), 
                control = nlmeControl(maxIter = 2000,  
                                      pnlsMaxIter = 15, 
                                      msMaxIter = 200, 
                                      tolerance = 1e-5, 
                                      opt = "nlminb"), 
                verbose = F)

(Note: I inverted the weights into invweights to feed into the variance function)

brms syntax for the individual model

(In this exercise, I have mostly been trying to grok the brms syntax, so the priors are basically placed over the mean of the nls estimates, with a fairly wide margin - parameters can’t be 0. I also filtered the data to only include the first individual in this case).

twotcm_prior <- c(
  set_prior("normal(0.05, 0.05)", nlpar = "k1", lb=0, ub=0.5),
  set_prior("normal(0.2, 0.2)", nlpar = "k2", lb=0, ub=1),
  set_prior("normal(0.05, 0.05)", nlpar = "k3", lb=0, ub=0.5),
  set_prior("normal(0.02, 0.02)", nlpar = "k4", lb=0, ub=0.5),
  set_prior("normal(0.02, 0.02)", nlpar = "vb", lb=0, ub=0.2))

stantwotcm_fit_formula <- bf( midcsc|weights(weights) ~ twotcm_stan(k1, k2, k3, k4, vb, MidTime, 
                                    b_pfc, 
                                    lambda1_pfc, lambda2_pfc, lambda3_pfc, 
                                    A1_pfc, A2_pfc, A3_pfc, tstar_pfc, 
                                    b_tot, 
                                    lambda1_tot, lambda2_tot, lambda3_tot, 
                                    A1_tot, A2_tot, A3_tot, 
                                    tstar_tot, indicator),
     # Nonlinear variables
     k1 + k2 + k3 + k4 + vb ~ 1,
     # Nonlinear fit
     nl = TRUE)

stantwotcm_fit <- brm(
  stantwotcm_fit_formula,
  family=gaussian(), 
  data = modeldat_testone,
  prior = twotcm_prior,
  stan_funs = two_compartment_stan,
  control = list(adapt_delta=0.8), 
  chains = 3)

This fits pretty quickly, the traceplots look great, the diagnostics look pretty good, and the output distributions are pretty normal. It also hits pretty much the same values as the random effects from the NLME model for this individual.

brms syntax for the hierarchical model:

mltwotcm_prior <- c(
  set_prior("normal(0.05, 0.05)", nlpar = "k1", lb=0, ub=0.5),
  set_prior("normal(0.2, 0.2)", nlpar = "k2", lb=0, ub=1),
  set_prior("normal(0.05, 0.05)", nlpar = "k3", lb=0, ub=0.5),
  set_prior("normal(0.02, 0.02)", nlpar = "k4", lb=0, ub=0.5),
  set_prior("normal(0.02, 0.02)", nlpar = "vb", lb=0, ub=0.2),
  set_prior("lkj(1)", class = "cor"))

mltwotcm_fit_formula <- bf( midcsc|weights(weights) ~ twotcm_stan(k1, k2, k3, k4, vb, MidTime, 
                                    b_pfc, 
                                    lambda1_pfc, lambda2_pfc, lambda3_pfc, 
                                    A1_pfc, A2_pfc, A3_pfc, tstar_pfc, 
                                    b_tot, 
                                    lambda1_tot, lambda2_tot, lambda3_tot, 
                                    A1_tot, A2_tot, A3_tot, 
                                    tstar_tot, indicator),
     sigma ~ 0 + ID,
     # Nonlinear variables
     k1 + k2 + k3 + k4 + vb ~ 1 + (1|k|ID),
     vb ~ 1 + (1|ID),
     # Nonlinear fit
     nl = TRUE)

mltwotcm_fit <- brm(
  mltwotcm_fit_formula,
  family=gaussian(), 
  data = modeldat,
  prior = mltwotcm_prior,
  stan_funs = two_compartment_stan,
  control = list(adapt_delta=0.9), 
  chains = 1)

This runs for ages, and has Rhat values of nearly 4, and about 2 effective samples for most parameters. The traceplots look like line plots. The posteriors look worse than I realised it was possible for a posterior to even look.

The sigma ~ 0 + ID line in particular is a sticking point for me: I got it from https://magesblog.com/post/2018-01-30-pkpd-reserving-models/ , but I really don’t quite understand what I’m specifying there. I think I tried it with 1 + ID and I don’t think it worked any better.

Summary
I’m very much hoping that this all comes down to a silly specification issue in the syntax, and that’s what I’m assuming that it is. But I’m just getting started on this project, and the aim is to dig deep and eventually be specifying things in a better way, so I would also not be displeased if this is going to mean learning the ins and outs of STAN syntax, as that would make for a fun project too!

Thank you so much in advance for any help!

  • Operating System: Windows 10
  • brms Version: 2.3.1

#2

First, I would leave sigma as is and not predict it at least for now. If that still fails, you need to specify reasonable priors on the “random” effects standard deviations (via class = "sd" in set_prior). Very likely their default priors are far too wide, which may cause some problems.

Also, you seem to specify formulas for vb twice. I believe the second just overwrites the first, but it is still confusing when reading the model specification.


#3

As another recommendation: Try to make sure that your non-linear parameters are roughly of the same order of magnitute. If one is, say, between 0.001 and 0.01 and another is between 100 and 1000, Stan may have trouble sampling from the model.


#4

Thanks so much for the response!

I tried removing sigma and ran into the same issues. Also noticed that the sigma in the single individual case was concentrated around 0.02, so I tried a much more restrictive prior over sigma to no avail. I guess it must be the SD priors then.

I’ll fiddle with them sometime this week then and get back to you!

Thanks again! I really appreciate the help!