Brms custom link function for survival modelling

Hi all,

I am new to Bayesian modelling, so apologies in advance. My question is in relation to whether or not it is possible to create custom link functions in brms?

In ecology (for this example, think of individual nesting attempts by birds), it is common to model the survival of a nest using a binary response outcome (e.g., 1 = alive, 0 = depredated). However, individual nests can differ in the number of days they are exposed to predation risk, and irregular data collection methods mean that nests may be found at different stages of development. To account for this, Shaffer (2004) proposed a modified logistic regression which incorporates the # of exposure days into the link function:

# Define logistic exposure family link function 

library(MASS)
library(lme4)

logexp <- function(exposure = 1)
{
  linkfun <- function(mu) qlogis(mu^(1/exposure))
  linkinv <- function(eta) plogis(eta)^exposure
  mu.eta <- function(eta) exposure * plogis(eta)^(exposure-1) * .Call(stats:::C_logit_mu_eta, eta, PACKAGE = "stats")
  valideta <- function(eta) TRUE
  link <- paste("logexp(", 
                deparse(substitute(exposure)), ")",sep="")
  structure(list(linkfun = linkfun, linkinv = linkinv,
                 mu.eta = mu.eta, valideta = valideta, name = link),
            class = "link-glm")
}

mod.1 <- glmer(survival ~ nest_height + (1|nest_id),
               data = dat,
               family = binomial(logexp(exposure =dat$exposure)))

Is there an equivalent method for doing this kind of model using brms?

I have included some dummy data:
dat.csv (4.1 KB)

Which is modified from: https://rpubs.com/bbolker/logregexp

Thanks!

A link function in brms is just a tranform to the parameters e.g exponentiation to enforce that the parameter is positive even though it might have initially have the mathematical form of normally distributed glm.

In brms, all you need to do is write a custom family where you set the link functions to identity (so brms does nothing to the parameters by itself), then, inside your loglikelihood function, transform the parameters with any link function you like before using them to calculate a loglikelihood.

Your math is a bit in-depth for me to go through in short order but I can provide a simple example:

poisson_growth <- custom_family(
  "poisson_growth", dpars = c("mu","mudev", "sigmadev"),
  links = c("identity","identity", "identity"), lb = c(0,NA, NA),
  type = "int", vars = "vreal1[n]"
)
stan_funs <- "
  real poisson_growth_lpmf(int y, real mu,real mudev, real temp_sigmadev, real time) {
real sigmadev = exp(temp_sigmadev) ; # **here I apply a link function of my choosing to sigmadev before utilising it in the loglikelihood**
    return y*log(mu)+y*log_diff_exp(lognormal_lccdf(time-1|mudev,sigmadev),lognormal_lccdf(time|mudev,sigmadev))-mu-log_rising_factorial(1,y); # this is the expansion of log of the PMF for poisson. I was hoping to improve my model performance by manually writing it out
  }
  
# now create some test data 
stanvars <- stanvar(scode = stan_funs, block = "functions")

mu = 1
sigma = 1
lambda_mu = log(50)
lambda_sigma = 0.5
nind = 10
max_t=20
n_incept = 10
base_lambda=data.frame(row=1:(n_incept*nind),inception=rep(c(1:n_incept),nind),industry=row %/% nind,base_l=rlnorm(nind*n_incept,lambda_mu,lambda_sigma))

n = nind*max_t*n_incept
trend = 0.3
intercept=0
test_data = data.frame(row = 1:n) %>%mutate(t = rep(c(1:max_t),n/max_t),industry = ((row-1) %% (max_t*nind))%/% max_t +1, inception = (row-1) %/% (max_t*nind), exposure = rep(1,n))%>% left_join(base_lambda, by = c("inception","industry")) %>% mutate(count = rpois(n,(base_l+trend*inception+intercept)*exposure*(plnorm(t,mu,sigma)-plnorm(t-1,mu,sigma))))

# now run the model
fit2 <- brm(
  bf(count | vreal(t) ~ log(1+exp( ((const*lambdasigma+lambdamu)+trend*inception) * exposure))
     ,const ~ (1|industry) + (1|inception)
     ,lambdasigma + lambdamu + trend~ 1
     ,nl=TRUE
  )
  , data = test_data
  , family = poisson_growth
  , stanvars = stanvars
  , prior = c(
      prior(normal(0,1),nlpar = "const")
      ,prior(normal(0,1),nlpar = "lambdasigma")
      ,prior(normal(1,1),nlpar = "lambdamu")
      ,prior(normal(0,1),class = "mudev")
      ,prior(normal(0,1),class = "sigmadev")
      ,prior(normal(0,1),nlpar = "trend")
  )
  ,chains=4
  ,iter=1e4
  ,thin=10
  ,control=list(max_treedepth=50)
)

"

So, inside the custom function poisson_growth_lpmf() I have defined a custom link for the sigma (a trivial exponential in this case). In your case the link function seems to be the exposure times the logistic transform of your parameter, but I can’t easily write it out for you without revising the survival model method you are using. I note that you might also be making use of the vreal() component. You would be using vreal(exposure) to pass the exposure to the custom loglikelihood function.

Let me know if this helps or if you need more.

I like using generalized nonlinear models for scenarios like this.

mod.1 <- brm(bf(survival ~ (1+exp(-eta))^(-exposure),
                eta ~ nest_height + (1|nest_id),
             data = dat, nl = TRUE,
             family = bernoulli(link="identity"))

Hi nelsjohnson, thanks for your answer. When I try implement it I get the error:
Error: Expecting a single value when fixing parameter 'data.nest_id'.

I don’t know what causes that error. Try moving the data = dat argument to outside the bf function

Just to add to this topic; I get the same error fitting a hurdle model that ran error-free on simulated data a few weeks ago:

model1 <- brm(bf(DV ~ 0 + Intercept + IV1 + IV2 + (0 + Intercept | PID), 
                 hu ~ 0 + Intercept + IV1 + IV2 + (0 + Intercept | PID), 
                 data = data, family = hurdle_poisson(),
                 prior = priors1,  sample_prior = TRUE))

Error: Expecting a single value when fixing parameter ‘data.PID’.

The model won’t run when I move the data = data argument outside the bf function.

Would be great if somebody has any insight what suddenly causes this error.