Predict error with nested random effect structure


I am having issues using the predict.brmsfit function when I have new random effect levels. I have a nested random effect structure of (1|Region/Species). From the R documentation, I have tried setting allow_new_levels=TRUE as well as re_formula=NA however, in both cases I get the following error:
Error: New factor levels are not allowed. followed by a list of all of contained species incorporated in the analysis.

I have tried to troubleshoot this and I have found if I remove the nestedness and re-fit the model with only (1|Species) setting either allow_new_levels = TRUE or re_formula=NA results in a normal predict output.

Is there something else I need to specify in order for this to work with my nested structure?

Thanks in advance - Liam

Please also provide the following information in addition to your question:

  • Operating System: Mac OS High Sierra
  • brms Version: 2.31

Strange, I would expect allow_new_levels = TRUE. Let me see if I can reproduce this problem.

Thanks, if not I will make a reproducible example and post it on here

If it is not too much work, I would prefer a reprex from you, of course. :-)

no problem, i’m on it

Here you go - i fitted the model with only a subset and then tried to use predict using the entire data frame

df.csv (2.1 KB)

forum_example.R (205 Bytes)

1 Like

Thanks! It should be fixed now in the dev version of brms. You can install it via

if (!requireNamespace("devtools")) {

fantastic, thank you! it works perfectly

I just downloaded the dev version, but I’m still getting the same issue. For what it’s worth, the help page for predict.brmsfit also no longer mentions the allow_new_levels argument or the sample_new_levels argument. Are those arguments no longer supported?

  • Operating System: Mac OS High Sierra
  • brms Version: 2.7.1

Can you provide a reproducible example?

The arguments you are mentioning are now documented under ?extract_draws.brmsfit but still function as before.

Here is a minimally modified example based on liamkendall’s example in this thread (using the same data set). It seems what breaks things is the use of random effects specified as in mgcv (which I need because I want to specify random slopes for a smooth)

#read df

#Create model with subset
mod1=brm(Spec.wgt~s(IT)+s(Genus, bs = "re"),df[1:10,])
mod2=brm(Spec.wgt~s(IT)+(1 | Genus),df[1:10,])

#try to predict with full df. The first one does not work, the second works.
        allow_new_levels = TRUE, 
        sample_new_levels = "uncertainty")

        allow_new_levels = TRUE, 
        sample_new_levels = "uncertainty")

df.csv (2.1 KB)

Thanks for this example!

brms does not offer the same amount of support for random effects via splines as for “standard” random effects. I can confirm, the first won’t allow new levels, but I don’t see this can be fixed without considerable amount of special case coding. I don’t think it’s worth it to be honest, since brms already offers such a flexibel multilevel syntax that one does not need to specify random effects via splines.

Yes, that makes sense. When I originally posted the question, I hadn’t noticed that the problem was the random effect smooths. The only functionality that is lost this way are random slopes (incl. factor smooths) for non-linear effects, for which–I think–there’s not alternative in brms?

Yes indeed, non-linear random smooths are possibly the only thing brms currently can predict values for new levels.

Hi Paul,

I’m surfing through the forum looking for a thread I had seen before about random effects in mgcv vs. brms and ran across this one. Quick question regarding your comment above:

If fitting a multilevel model such as: response ~ s(var1) + (1 | group1) , do I also need to specify the random effect in the spline, so that it would read: response ~ s(var1, by= group1) + (1 | group1)? I’m assuming from the comment above that the answer is no. If this is the case, when does one use the argument “by=” in a brms spline (if ever)?

Thank you!
Denise Colombano

“Need” is perhaps not the right question. This will depend on how you want to model your data. Do you expect the spline to vary over groups? If yes, it can be sensible to add a s(var1, by = group1) term, but be aware that this is not a “random” spline in the sense that there is no average spline and no shrinkage of the indiviudal splines will happen. instead, it’s just one spline per level of group1. You can have ‘actual’ random splines via the bs = "fs" type. For more details, see ?mgcv::s

As brms’ splines are based on mgcv, most of what mgcv supports works with brms as well.