Fitting lapsing psychometric functions (with brms?)


#1

I’d like to fit a psychometric function to some binomial data. In the past I’ve always used vanilla logistic regression, but if there are random lapse responses, this can bias the estimates of the slope and intercept of the curve (see, for example, Wichmann & Hill, 2001). Thus I’d like to include the possibility that responses are a mixture of random/lapse responses (e.g. sampled from a single binomial error distribution) and on-task responses (as in normal logistic regression). The “classical” functional form of the model is that the response probability is a saturating logistic function whose asymptotes are not exactly 0 and 1.
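
In R notation, that functional form is something like this (just a sketch; guess and lapse are my names for the lower- and upper-asymptote parameters, matching the simulation below):

psi <- function(x, b0, b1, guess, lapse) {
  # saturates at guess as x -> -Inf and at 1 - lapse as x -> Inf
  guess + (1 - guess - lapse) * plogis(b0 + b1 * x)
}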

My question is whether I can express this in brms other than through the mechanism for specifying a “generic” nonlinear function, and (if not) whether that comes with a big performance penalty. Of course I could write the Stan code by hand, but I’d really like to be able to take advantage of brms’s ability to concisely specify random effects as well.

To be concrete, here’s an example dataset and three brms models: the first is vanilla logistic regression, the second is a manual nonlinear specification of logistic regression (which on my system is about 50% slower to fit, which I can bear), and the third is my first attempt at a lapsing logistic regression.

Simulated data

library(tidyverse)
library(brms)

options(mc.cores = parallel::detectCores())

dat <- cross_df(list(x = seq(0, 10),
                     rep = seq(1, 5),
                     subj = seq(1, 10))) %>%
  # lower asymptote (guess) = 0.02, upper asymptote = 0.97, i.e. lapse = 0.03
  mutate(theta = 0.02 + 0.95*plogis(1 * (x-5)),
         y = as.integer(rbernoulli(n=length(theta), p=theta)))

Vanilla logistic regression

fit1 = brm(y ~ 1 + x, data=dat, family=bernoulli())

Nonlinear formula logistic regression

fit2 = brm(bf(y ~ 1 / (1+exp(-(eta))),
              eta ~ 1+x,
              family=bernoulli(link="identity"),
              nl=TRUE),
           data=dat,
           prior = set_prior("student_t(3,0,10)", class="b", nlpar="eta"))

Lapsing logistic regression

fit3 = brm(bf(y ~ lapselow + (1-lapsehigh-lapselow) * 1 / (1+exp(-(eta))),
              eta ~ 1+x,
              lapselow ~ 1,
              lapsehigh ~ 1, 
              family=bernoulli(link="identity"),
              nl=TRUE),
           data=dat,
           inits="0",
           control=list(adapt_delta=0.99),
           prior = c(set_prior("student_t(3,0,10)", class="b", nlpar="eta"),
                     set_prior("uniform(0,0.1)", class="b", nlpar="lapselow"),
                     set_prior("uniform(0,0.1)", class="b", nlpar="lapsehigh")))

As my attempt to increase adapt_delta might suggest, I’m getting a ton of divergent transitions with the last model (nearly 25%). So I’m afraid it’s going to be hard to make this work without digging into the generated Stan code (or doing some kind of hack to mimic Stan’s transformation of bounded parameters).

References

Wichmann, F. A., & Hill, N. J. (2001). The psychometric function: I. Fitting, sampling, and goodness of fit. Perception & Psychophysics, 63(8), 1293–1313.


#2

With a few small changes, your third model runs fast (under 8 seconds on my computer), with no divergent transitions and good parameter recovery:

  • Center x:
set.seed(999)
guess <- .02
lapse <- .03
dat <- cross_df(list(x = seq(-5, 5),
                     rep = seq(1, 5),
                     subj = seq(1, 10))) %>%
  mutate(theta = guess + (1-guess-lapse)*plogis(1 * (x)),
         y = as.integer(rbernoulli(n=length(theta), p=theta)))
  • Assign priors with explicit bounds, as described here. (I think there’s also a typo in your model: the guess and lapse parameters are reversed relative to how Wichmann and Hill use them.) It would probably be better not to use flat priors with bounds, but that seems to work for this example.
BF <- bf(
  y ~ guess + (1 - guess - lapse) * inv_logit(eta),
  eta ~ 1 + x,
  guess ~ 1,
  lapse ~ 1,
  family = bernoulli(link = "identity"),
  nl = TRUE
)
p2 <- c(
  prior(student_t(7, 0, 10), class = "b", nlpar = "eta"),
  # explicit bounds keep the asymptote parameters in [0, 0.1]
  prior(beta(1, 1), nlpar = "lapse", lb = 0, ub = .1),
  prior(beta(1, 1), nlpar = "guess", lb = 0, ub = .1)
)
fit_2 <- brm(
  BF,
  data = dat,
  inits = 0,
  control = list(adapt_delta = 0.99),
  prior = p2
)
# ~8 seconds....
summary(fit_2)
 Family: bernoulli 
  Links: mu = identity 
Formula: y ~ guess + (1 - guess - lapse) * inv_logit(eta) 
         eta ~ 1 + x
         guess ~ 1
         lapse ~ 1
   Data: dat (Number of observations: 550) 
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Population-Level Effects: 
                Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
eta_Intercept       0.03      0.21    -0.40     0.47       1972 1.00
eta_x               1.12      0.18     0.84     1.54       1312 1.01
guess_Intercept     0.03      0.02     0.00     0.08       1487 1.00
lapse_Intercept     0.04      0.02     0.00     0.08       1585 1.00

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).

I also tried this out with a probit link (you can use Phi(eta)) and it seems to recover the guess and lapse parameters a little more accurately.
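
For reference, here’s a minimal sketch of the probit variant (only the mean function changes; BF_probit and fit_probit are just illustrative names, and the priors are reused from above):

BF_probit <- bf(
  y ~ guess + (1 - guess - lapse) * Phi(eta),  # Phi() is the standard normal CDF
  eta ~ 1 + x,
  guess ~ 1,
  lapse ~ 1,
  family = bernoulli(link = "identity"),
  nl = TRUE
)
fit_probit <- brm(BF_probit, data = dat, inits = 0,
                  control = list(adapt_delta = 0.99), prior = p2)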

Edit: I suppose the answer to your question is “No, you have to use the nonlinear syntax, since other approaches (such as the link functions in psyphy::logit.2asym()) don’t work with brms. But the performance penalty is not bad.” Personally, I think the nonlinear syntax is great and don’t really see the problem. Maybe with huge datasets the run time becomes too long?


#3

Thanks for the help! I should have read the docs on the priors more closely, but that looks like it does exactly what I wanted.

I don’t think it should be a problem, and if it is I’ll report back. I think the nonlinear formula interface is wonderful too! It eliminates one of the major pain points of Stan for me: the tradeoff between flexibility in specifying the model and ease of generalizing to mixed-effects designs.


#4

FWIW, I think most of the work is being done by setting the priors appropriately. Even with x uncentered, it runs fast and without divergent transitions.


#5

Would it not be possible to write this as a mixture of Bernoullis?

Something like this:

mix <- mixture(bernoulli("probit"), bernoulli("probit"))
# mu1: on-task component (depends on x); mu2: constant lapse component
fit_mix <- brm(bf(y ~ 1, mu1 ~ x, mu2 ~ 1),
               data = dat, 
               family = mix)

I haven’t really worked this out; it’s purely based on intuition.
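
If it does work, something along these lines might keep the second component interpretable as a (rare) lapse process. This is an untested sketch: I’m assuming brms accepts a dirichlet prior on the mixing proportions via class "theta", and prior_mix/fit_mix2 are just illustrative names (mix and dat are reused from above).

prior_mix <- c(
  prior(normal(0, 5), class = "b", dpar = "mu1"),
  # put most prior mass on small theta2, i.e. lapses should be rare
  prior(dirichlet(9, 1), class = "theta")
)
fit_mix2 <- brm(bf(y ~ 1, mu1 ~ x, mu2 ~ 1),
                data = dat, family = mix, prior = prior_mix)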


#6

Oh that’s really clever! I didn’t realize brms supported mixture models, but yes, that’s essentially what this is. I take it mu1 and mu2 are automatically mapped to the means of the two mixture components?


#7

Yes, and theta1 and theta2 are the mixing proportions.
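
For example (assuming the fit above), their posterior summaries can be pulled out with something like:

# mixing proportions appear in the posterior as theta1 and theta2
posterior_summary(fit_mix, pars = "^theta")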


#8

So if this is correct, then matching the asymptotes of the two parameterizations, theta2 should equal guess + lapse (the total probability of an off-task response), with the second component’s success probability equal to guess/(guess + lapse).
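
A quick numerical check of that mapping, using the true values from the simulation above:

# lapsing form:  psi = guess + (1 - guess - lapse) * F(eta)
# mixture form:  psi = (1 - theta2) * F(eta) + theta2 * p2
guess <- 0.02; lapse <- 0.03
theta2 <- guess + lapse        # 0.05: mixing proportion of the lapse component
p2 <- guess / (guess + lapse)  # 0.40: success probability during a lapse
theta2 * p2                    # 0.02 = guess (lower asymptote)
1 - theta2 * (1 - p2)          # 0.97 = 1 - lapse (upper asymptote)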


#9

🙌 Very cool

(please pardon my thumb-typing)


#10

Really nice! Do you have a worked-out example?