[beginner] Help specifying a negative reciprocal link function for brms in custom likelihood function?

Hi Stan/brms community,

I’m trying to set up a brms model which has the same custom link function as used for the glmer models in this paper about analysing RTs using transformations (Lo and Andrews, 2015).

It’s an inverse gaussian model, but the critical difference is that the link function returns the negative of the inverse, principally so that when interpreting the parameters, an effect with a positive value reflects an increase in reaction time (or a decrease in speed).

I’m still trying to get my head around all this, so I’m try to use brms, and set up a custom family for this case.

In the Lo and Andrews (2015) paper, they define the following function as a link function for glmer:

invfn <- function() {
    ## link
    linkfun <- function(y) -1000/y
    ## inverse link
    linkinv <- function(eta)  -1000/eta
    ## derivative of invlink wrt eta
    mu.eta <- function(eta) { 1000/(eta^2) }
    valideta <- function(eta) TRUE
    link <- "-1000/y"
    structure(list(linkfun = linkfun, linkinv = linkinv,
                   mu.eta = mu.eta, valideta = valideta, 
                   name = link),
              class = "link-glm")

here’s code to generate data. (modified from here)

#fake data
N_trials <- 50
N_participants <- 30
id_sd <- 0.05/1000
grp_mu <- 2/1000
sigma <- 0.3/1000
fctA_effect <- 10 #5ms effect A
fctB_effect <- 10 #10ms effect B
inter_effect <- 5 #5ms interaction effect

#empty tibble for data
df <- tibble(participant=character(), factorA=factor(), factorB=factor(), rt=numeric())

#loop to generate data
for(participant in 1:N_participants){
  p_mu <- rnorm(1, mean = grp_mu, sd = id_sd) #participant level intercept
  for(i in 1:2){ 
    for(j in 1:2){
      rt_inv <- 1/ rnorm(N_obs, p_mu, sigma) #make rt dist for condition
      rt <- rt_inv + (i-1)*fctA + (j-1)*fctB + (i-1)*(j-1)*inter_effect #add experimental effects
      condition_data <- tibble(participant = rep(paste0("P", participant), N_trials),
                              factorA = factor(rep(if_else(i==1, "A0", "A1"), N_trials)),
                              factorB = factor(rep(if_else(j==1, "B0", "B1"), N_trials)),
                              rt = rt)
      df <- rbind(df, condition_data)

#set contrasts so that 
contrasts(df$factorA) <- -contr.sum(2)
contrasts(df$factorB) <- -contr.sum(2)

running the following calls to glmer works fine, as does a call to lmer (which ignores the skewness of the data). Each results in a set of fixed estimates with similar signs and orders of magnitude.

glmer(rt ~ (1|participant) + factorA * factorB, data=df, family = inverse.gaussian(link=invfn()))
lmer(-1000/rt ~ (1|participant) + factorA * factorB, data=df)

I thought to use the inverse.gaussian family in brms, and just put the negative of rt (-rt) in as the predictor, but of course you can’t have negative data points for an inverse gaussian.

To get something that works in brms I found this discussion wheree the custom link function was implemented inside the loglikelihood.

So what I need to do is write a stan function to transform mu and sigma before calculating the loglikelihood on the transformed values.

I think what i need to do is define a custom family as follows:

neginvgauss <- custom_family( "neginvgauss", 
                              dpars = c("mu", "sigma"), 
                              links = c("identity", "identity"), #identity links because the link function will be in the likelihood function
                              type = c("real", "real"), 
                              lb = c(NA, 0), 
                              ub = c(NA, NA))

and modify the normal distribution log probability density function and rng so that the mu and sigma parameters take the negative reciprocal*1000 as distribution parameters:

stan_funs <- 
  real neginvgauss_lpdf(real y, real mu, real sigma) {
    return normal_lpdf(y | -1000/mu, 1000/sigma)
  real neginvgauss_rng(real mu, real sigma) {
    return normal_rng(-1000/mu, 1000/sigma)

stanvars <- stanvar(scode = stan_funs, block = "functions")

I think it’s these stan functions which aren’t working properly because including the negative reciprocal in the call to the log likelihood function probably isn’t the way to do this. Either that or the “random intercept” is too small an effect, maybe? I’ve spent limited time changing the fake data to see if that improves things.

Anyway the call to brms is currently as follows with priors I chose at random.

priors <- c(
  prior(normal(0,100), class = b, coef = factorA1),
  prior(normal(0,100), class = b, coef = factorB1),
  prior(normal(0,100), class = b, coef = factorA1:factorB1)
brmstest <- brm(RT ~ factorA * factor B + (1|id), data = df, family = neginvgauss, stanvars = stanvars)

The model runs (slowly compared to the brms with the build in families) but doesn’t really converge at all.

summary output here:

Family: neginvgauss 
  Links: mu = identity; sigma = identity 
Formula: rt ~ (1 | participant) + factorA * factorB 
   Data: df (Number of observations: 12000) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Group-Level Effects: 
~participant (Number of levels: 30) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)   141.33    148.02     0.01   398.60 2.40        5       48

Population-Level Effects: 
                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept           344.20    325.66     4.97   925.10 3.12        5       14
factorA1            -71.65    120.66  -338.36    55.83 3.00        5       27
factorB1            -74.66    118.42  -369.48    11.13 4.06        4       14
factorA1:factorB1  -183.65    215.80  -534.87    55.84 2.89        5       47

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     2.20      0.07     2.11     2.33 2.87        5       35

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

As you can see the Rhats are huge and the parameter estimates are nonsense.

If anyone could help point out where I am going wrong it will be greatly appreciated.

I’m working on macOS (apple silicon), and using brms version 2.18.0.