Posterior predictive output when fitting a multilevel nonlinear model with brms differs from output when using nlmer (lme4)

I am currently trying to fit a Bayesian model to repeated-measures data, but I seem to be facing an obscure problem.

To start, I did a prior predictive check and my model seemed to behave well (Rhat = 1 and 0 divergent transitions), so I decided to proceed with model fitting. I ran 4 chains of 6000 iterations each (with 1000 warmup), which seemed to be enough: the full model, again, converged with no indication of problems.

I then tested (on my training data) whether the model would predict it well; however, the predictions I got were way off (the order of magnitude was around 15 for most predictions, compared to the real data). I then fitted the same model with nlmer from lme4 and the output I got was reasonable.
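(For context, one way the same curve can be written for nlmer is sketched below; the starting values are only rough placeholders, not my actual values.)

    library(lme4)

    # The logistic curve as a gradient-carrying function, as nlmer() requires
    ldlCurve <- deriv(
      ~ 15 + (alpha - 15) / (1 + exp((gamma - Time) / delta)),
      namevec = c("alpha", "gamma", "delta"),
      function.arg = c("Time", "alpha", "gamma", "delta")
    )

    # Sketch only: starting values are rough placeholders
    fit_nlmer <- nlmer(
      LDL ~ ldlCurve(Time, alpha, gamma, delta) ~
        (alpha | Sample) + (alpha | Cat) + (gamma | Sample) + (gamma | Cat),
      data  = ldlData,
      start = c(alpha = 4000, gamma = 2, delta = 0.4)
    )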

So I am very curious where this discrepancy might be coming from. And if, for example, the problem is with my priors, why did the prior predictive check look OK and not flag any problems?

My code is given below:

> library(brms)
> library(bayesplot)
> library(rstanarm)
> # Read source data in .csv format
> ldlData <- read.csv("sorted_with_category.csv", header = TRUE) 
> 
> # Rename columns to make faster / more intuitive use
> colnames(ldlData) <- c("Sample", "Time", "LDL", "Cat")
> 
> # Turn the string columns into numeric columns (if necessary)
> ldlData$LDL <- as.numeric(as.character(ldlData$LDL))
> ldlData$Time <- as.numeric(as.character(ldlData$Time))
> View(ldlData)
> quantile(ldlData[,"LDL"], na.rm=T, probs=seq(0, 1, by=.05))
> plot (LDL ~ Time, data = ldlData, pch = 16)
> 
> # Check which priors can be specified for this model
> prior <- brms::get_prior(
>   brms::bf(LDL ~ 15 + (alpha - 15) / (1 + exp((gamma - Time) / delta)),
>            alpha ~ 1 + (1 | Cat) + (1 | Sample),
>            gamma ~ 1 + (1 | Cat) + (1 | Sample),
>            delta ~ 1,
>            nl = TRUE),
>   data = ldlData,
>   family = lognormal())
> prior
> # Fit the model with the chosen priors
> fit <- brm(
>   bf(LDL ~ 15 + (alpha - 15) / (1 + exp((gamma - Time) / delta)),
>      alpha ~ 1 + (1 | Cat) + (1 | Sample),
>      gamma ~ 1 + (1 | Cat) + (1 | Sample),
>      delta ~ 1,
>      nl = TRUE),
>   prior = c(
>     prior(lognormal(8.33, 0.15), class = "b", lb = 3800, nlpar = "alpha"),
>     prior(gamma(42, 15), class = "b", lb = 0, ub = 5, nlpar = "gamma"),
>     prior(normal(0.4, 0.02), class = "b", lb = 0, ub = 1, nlpar = "delta"),
>     prior(student_t(3, 0, 10), class = "sigma"),
>     prior(student_t(3, 0, 10), class = "sd", nlpar = "alpha"),
>     prior(student_t(3, 0, 10), class = "sd", nlpar = "gamma"),
>     prior(normal(-0.002420144, 0.482032688), class = "sd", group = "Sample", coef = "Intercept", nlpar = "gamma"),
>     prior(normal(0.01815935, 0.52960660), class = "sd", group = "Sample", coef = "Intercept", nlpar = "alpha"),
>     prior(normal(-0.002420144, 0.482032688), class = "sd", group = "Cat", coef = "Intercept", nlpar = "gamma"),
>     prior(normal(0.01815935, 0.52960660), class = "sd", group = "Cat", coef = "Intercept", nlpar = "alpha")),
>   data = ldlData, family = lognormal(),
>   chains = 1,
>   iter = 3000,
>   warmup = 1000,
>   cores = getOption("mc.cores", 1L),
>   thin = 5,
>   control = list(adapt_delta = 0.99999, stepsize = 0.000001, max_treedepth = 10)
> )
> 
> predict(fit)

Based on the information you provided, it is not possible for me to make any specific comments. A few general ones:

  1. Make sure your non-linear parameters have the correct boundaries as a whole. Simply setting lb and ub will not work when you have varying effects in the parameter; one common workaround is sketched after this list. There are several threads here on Discourse that discuss this issue.
  2. You may want to try something other than lognormal() as a start. Perhaps gaussian() could work to get a better sense of what is happening, even if this family is not the one you use in the end.
  3. The scale of “Time” is relevant since it is put into exp(). Try scaling Time to the unit interval or something similar so that it cannot take on overly large values (also shown in the sketch below).
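A minimal sketch of what points 1 and 3 could look like in practice (the prior values below are placeholders, not recommendations, and would need to be adapted to the rescaled Time): the positivity constraints are built into the formula by modelling the parameters on the log scale, and Time is rescaled before fitting.

    # Point 3: rescale Time so that exp() cannot take on huge values
    ldlData$Time01 <- ldlData$Time / max(ldlData$Time, na.rm = TRUE)

    # Point 1: keep the parameters positive by writing exp(...) of unconstrained
    # parameters directly in the formula, instead of setting lb/ub on the priors
    fit2 <- brm(
      bf(LDL ~ 15 + (exp(logalpha) - 15) /
               (1 + exp((exp(loggamma) - Time01) / exp(logdelta))),
         logalpha ~ 1 + (1 | Cat) + (1 | Sample),
         loggamma ~ 1 + (1 | Cat) + (1 | Sample),
         logdelta ~ 1,
         nl = TRUE),
      data = ldlData,
      family = gaussian(),  # point 2: a simpler family to start with
      prior = c(
        prior(normal(8.3, 0.2), class = "b", nlpar = "logalpha"),  # exp(8.3) is roughly 4000
        prior(normal(-1, 1),    class = "b", nlpar = "loggamma"),
        prior(normal(-1, 1),    class = "b", nlpar = "logdelta"),
        prior(student_t(3, 0, 10), class = "sigma"),
        prior(student_t(3, 0, 1),  class = "sd", nlpar = "logalpha"),
        prior(student_t(3, 0, 1),  class = "sd", nlpar = "loggamma")
      ),
      chains = 4, iter = 6000, warmup = 1000
    )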

Thank you for your reply!
But I am still curious how it is possible for my model to pass the prior predictive check and still generate predictions which are way off compared with the available data.
Is this not the whole point of the prior predictive check, to see whether the model is consistent with the available data? Or have I completely misunderstood what it is used for?

What exactly do you understand as “prior predictive check”?

I run my model, but add the

sample_prior = "only"

to my code, which, according to this, generates samples from the prior predictive distribution. If my model is good, isn't the prior predictive distribution supposed to be very similar to the distribution of my available data?
Because in my case the model converges, but the predictions it makes are way off.
Or do I completely misunderstand the whole purpose of prior predictive checks?

I think you misunderstand what “the model converges” means. If you sample from the prior and you get high ESS, low Rhat, etc., it just means that sampling from the prior was successful (in the sense that the sampler could handle the priors), not that the predictions from the prior are sensible, which is something completely different. You can check the latter via pp_check or via more manual inspection of the output of posterior_predict.
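For example, a minimal sketch of such a check on a fit obtained with sample_prior = "only" (assuming the fit object from the first post):

    # Graphical check: overlay draws from the predictive distribution
    # on the observed outcome
    pp_check(fit)

    # Manual check: posterior_predict() returns one row per draw and
    # one column per observation
    yrep <- posterior_predict(fit)
    summary(apply(yrep, 2, median))  # typical predicted values
    summary(ldlData$LDL)             # observed values, for comparison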

Ok, thanks!
I will go back and review some theory then.

For future reference, for anyone who comes across this: I simply increased the number of iterations per chain from 6000 to 20000 and changed the family from lognormal to gaussian, and my prediction results improved dramatically, so it is worth trying!
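In other words, the change amounted to roughly this (priors and control settings kept as in the original call):

    fit <- brm(
      bf(LDL ~ 15 + (alpha - 15) / (1 + exp((gamma - Time) / delta)),
         alpha ~ 1 + (1 | Cat) + (1 | Sample),
         gamma ~ 1 + (1 | Cat) + (1 | Sample),
         delta ~ 1,
         nl = TRUE),
      data = ldlData,
      family = gaussian(),  # changed from lognormal()
      chains = 4,
      iter = 20000,         # increased from 6000 per chain
      warmup = 1000
      # prior = c(...) and control = list(...) as before
    )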
