Brms to run mixed effect GAMs with different priors for each covariate

Operating System: macOS Mojave
brms version 2.10.0

I’m interested in running a mixed effects GAM in a Bayesian framework so that I can put in various priors for each covariate. Based on the help file and blogs I have found code that it seems to work up until when I start messing with priors. Basically I need to set lower and upper bounds for various covariates. I found on this site:https://github.com/paul-buerkner/brms/issues/86 how to do this but I still get an error. Below is my code for the model:

priors<-c(set_prior(“normal(26,2)”,class=“b”,ub=32,nlpar=‘etaTemp’),
set_prior(“normal(5,1)”,class=“b”,lb=3.5,nlpar=‘etaDO.2’),
set_prior(“normal(25,3)”,class=“b”,lb=17,nlpar=‘etaSal’))
testmod<-brm(bf(Binomial~s(etaDO.2)+s(etaSal)+s(etaTemp)+offset(Effort)+(1|Year)+(1|Month)),
data = dat, family = binomial(),nonlinear=list(etaTemp~0+Temp,etaDO.2~0+DO.2,etaSal~0+Sal),
prior=priors, cores = 1, seed = 17,iter = 4000, warmup = 1000, thin = 10) #can use control = list(adapt_delta = 0.99) to help model if need be

I tried to do the eta addition you used but I still get a warning:

Error: The following variables are missing in ‘data’:
‘etaDO.2’, ‘etaSal’, ‘etaTemp’–

Any help is greatly appreciated. Thanks!

There are multiple problems with your code. First, the nonlinear argument no longer exists as of brms 2.0. Second, you cannot pass non-linear parameters into spline functions. I am not exactly sure why and on what you need to set lower and upper boundaries, but if you want to restrict non-linear parameters then you can simply use transformations such as exp of inv_logit. I provide some examples in https://arxiv.org/abs/1905.09501

I see, okay I didn’t realize that it no longer existed. I need to set up bounds because I need to predict beyond the range of the covariates that the model was made over and I have other data that has informed me that there should be bounds for each of the covariates. I’m surprised this doesn’t exist in the function since this is a particular problem in ecological data and forecasting species habitat. I don’t understand how a transformation would solve this issue but I’ll look into your paper. Thanks.

If you need to constrain the smooth functions (because it is those covariates you need to extrapolate beyond the data) take a look at the b spline basis in mgcv (?b.spline after loading mgcv).

This is a very flexible basis and you can set the end points of the basis support that you need to extrapolate to, whilst focusing the basis on the range of the data (by setting the inner boundary knots), with the remaining knots (to reach k) spread evenly between these inner knots unless otherwise specified. The idea is that you need the penalty to extend out beyond the support of the data otherwise the smooth function explodes - there’s nothing in the data to constrain it.

The example shows how to use it and it’s effect, using mgcv, but I assume that brms should be able to handle this basis, though you only need to get the specification of the basis correct using the m argument.

Awesome, thanks for the advice. It looks like this might be the way I should go. I’ve looked at some of the instructions for b spline in mgcv but I’m still confused about the syntax. So my current model looks like this (without b spline):

mod=gam(Binomial~s(DO.2)+s(Sal)+s(Temp,k=14)+offset(Effort)+s(Year,bs=“re”)+s(Month,bs=“re”),data=dat,family=‘binomial’, method=“REML”)

However, I would like an upper bound for Temp to be 32, where the probability at or above 32 drops down to 0. For DO.2 I would like a lower bound at 3.5, where the probability at or below 3.5 drops down to 0. From reading examples is looks like the parameter m inside the spline needs to be manipulated but I still don’t really understand how to modify it to get what I need above. Do you have any other advice on this?

I don’t believe you can do that. You can only constrain what the spline contributes to your model when you extrapolate beyond the support of the data. What you can do is allow the penalty that pulls the function towards a null function (zero) to act beyond the support of the data. Without the penalty extending beyond the support of the data the behaviour of the spline can be extreme.

m is specified as a vector. In your case it will be simplest to pass 4 values to the vector. In order those values are:

  1. the minimum temperature or DO that you want to extrapolate to,
  2. the minimum temperature or DO observed in the data
  3. the maximum temperature or DO observed in the data,
  4. the maximum temperature or DO that you want to extrapolate to.

In this way, the two extreme knots (1 and 4) allow the penalty to operate in the range of the data where you will be extrapolating in, whilst concentrating knots where there is data (the inner two knots (2, 3).

If you run the example in ?b.spline, you’ll see how this affects the behaviour of the spline when extrapolating.

With mgcv there aren’t other ways in which you can control the behaviour of splines beyond the boundaries of the data from which the basis has been foremd.

ucfagls-thanks for the response. I’m tried doing what you suggested but it appears based on the bspline help page (https://rdrr.io/cran/mgcv/man/smooth.construct.bs.smooth.spec.html) that the first value in m (m[1]) determines the spline order. If that’s the case putting a temperature value in the m vector wouldn 't make sense.

For example, I tried to do what you suggested but I don’t think the syntax for m is correct.

testmod=gam(Binomial~s(DO.2)+s(Sal)+s(Temp,m=c(10,14,29,32))+offset(Effort)+s(Year,bs=“re”)+s(Month,bs=“re”),data=dat,family=‘binomial’, method=“ML”)

The response over temperature was extremely curvy when I did this and I think it is because the values in m are large. Any idea why this would be?

Thanks,