Fitting lapsing psychometric functions (with brms?)


#1

I’d like to fit a psychometric function to some binomial data. In the past I’ve always used vanilla logistic regression, but if there are random lapse responses, this can bias the estimates of the slope and intercept of the curve (see, for example, Wichmann & Hill, 2001). Thus I’d like to include the possibility that responses are a mixture of random/lapse responses (e.g. sampled from a single binomial error distribution) and on-task responses (as in normal logistic regression). The “classical” functional form of the model is that the response probability is a saturating logistic function whose asymptotes are not exactly 0 and 1.
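
In R notation, that functional form is something like this (just a sketch; guess and lapse are my names for the lower- and upper-asymptote parameters, matching the simulation below):

psi <- function(x, b0, b1, guess, lapse) {
  # saturates at guess as x -> -Inf and at 1 - lapse as x -> Inf
  guess + (1 - guess - lapse) * plogis(b0 + b1 * x)
}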

My question is whether I can express this in brms other than through the mechanism for specifying a “generic” nonlinear function, and (if not) whether that comes with a big performance penalty. Of course I could write the Stan code by hand, but I’d really like to be able to take advantage of brms’s ability to concisely specify random effects as well.

To be concrete, here’s an example dataset and three brms models: the first is vanilla logistic regression, the second is a manual nonlinear specification of logistic regression (which on my system is about 50% slower to fit, which I can bear), and the third is my first attempt at a lapsing logistic regression.

Simulated data

library(tidyverse)
library(brms)

options(mc.cores = parallel::detectCores())

dat <- cross_df(list(x = seq(0, 10),
                     rep = seq(1, 5),
                     subj = seq(1, 10))) %>%
  # lower asymptote (guess) = 0.02, upper asymptote = 0.97, i.e. lapse = 0.03
  mutate(theta = 0.02 + 0.95*plogis(1 * (x-5)),
         y = as.integer(rbernoulli(n=length(theta), p=theta)))

Vanilla logistic regression

fit1 = brm(y ~ 1 + x, data=dat, family=bernoulli())

Nonlinear formula logistic regression

fit2 = brm(bf(y ~ 1 / (1+exp(-(eta))),
              eta ~ 1+x,
              family=bernoulli(link="identity"),
              nl=TRUE),
           data=dat,
           prior = set_prior("student_t(3,0,10)", class="b", nlpar="eta"))

Lapsing logistic regression

fit3 = brm(bf(y ~ lapselow + (1-lapsehigh-lapselow) * 1 / (1+exp(-(eta))),
              eta ~ 1+x,
              lapselow ~ 1,
              lapsehigh ~ 1, 
              family=bernoulli(link="identity"),
              nl=TRUE),
           data=dat,
           inits="0",
           control=list(adapt_delta=0.99),
           prior = c(set_prior("student_t(3,0,10)", class="b", nlpar="eta"),
                     set_prior("uniform(0,0.1)", class="b", nlpar="lapselow"),
                     set_prior("uniform(0,0.1)", class="b", nlpar="lapsehigh")))

As my attempt to increase adapt_delta might suggest, I’m getting a ton of divergent transitions with the last model (nearly 25%). So I’m afraid it’s going to be hard to make this work without digging into the generated Stan code (or doing some kind of hack to mimic Stan’s transformation of bounded parameters).

References

Wichmann, F. A., & Hill, N. J. (2001). The psychometric function: I. Fitting, sampling, and goodness of fit. Perception & Psychophysics, 63(8), 1293–1313.


#2

With a few small changes, your third model runs fast (under 8 seconds on my computer), with no divergent transitions and good parameter recovery:

  • Center x:
set.seed(999)
guess <- .02
lapse <- .03
dat <- cross_df(list(x = seq(-5, 5),
                     rep = seq(1, 5),
                     subj = seq(1, 10))) %>%
  mutate(theta = guess + (1-guess-lapse)*plogis(1 * (x)),
         y = as.integer(rbernoulli(n=length(theta), p=theta)))
  • Assign priors with explicit bounds, as described here. (I think there’s also a typo in your model: the guess and lapse parameters are reversed relative to how Wichmann and Hill use them.) It would probably be better not to use flat priors with bounds, but that seems to work for this example.
BF <- bf(
  y ~ guess + (1 - guess - lapse) * inv_logit(eta),
  eta ~ 1 + x,
  guess ~ 1,
  lapse ~ 1,
  family = bernoulli(link = "identity"),
  nl = TRUE
)
p2 <- c(
  prior(student_t(7, 0, 10), class = "b", nlpar = "eta"),
  # explicit bounds keep the asymptote parameters in [0, 0.1]
  prior(beta(1, 1), nlpar = "lapse", lb = 0, ub = .1),
  prior(beta(1, 1), nlpar = "guess", lb = 0, ub = .1)
)
fit_2 <- brm(
  BF,
  data = dat,
  inits = 0,
  control = list(adapt_delta = 0.99),
  prior = p2
)
# ~8 seconds....
summary(fit_2)
 Family: bernoulli 
  Links: mu = identity 
Formula: y ~ guess + (1 - guess - lapse) * inv_logit(eta) 
         eta ~ 1 + x
         guess ~ 1
         lapse ~ 1
   Data: dat (Number of observations: 550) 
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Population-Level Effects: 
                Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
eta_Intercept       0.03      0.21    -0.40     0.47       1972 1.00
eta_x               1.12      0.18     0.84     1.54       1312 1.01
guess_Intercept     0.03      0.02     0.00     0.08       1487 1.00
lapse_Intercept     0.04      0.02     0.00     0.08       1585 1.00

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).

I also tried this out with a probit link (you can use Phi(eta)) and it seems to recover the guess and lapse parameters a little more accurately.
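
For reference, here’s a minimal sketch of the probit variant (only the mean function changes; BF_probit and fit_probit are just illustrative names, and the priors are reused from above):

BF_probit <- bf(
  y ~ guess + (1 - guess - lapse) * Phi(eta),  # Phi() is the standard normal CDF
  eta ~ 1 + x,
  guess ~ 1,
  lapse ~ 1,
  family = bernoulli(link = "identity"),
  nl = TRUE
)
fit_probit <- brm(BF_probit, data = dat, inits = 0,
                  control = list(adapt_delta = 0.99), prior = p2)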

Edit: I suppose the answer to your question is “No, you have to use the nonlinear syntax, since other approaches (such as the link functions in psyphy::logit.2asym()) don’t work with brms. But the performance penalty is not bad.” Personally, I think the nonlinear syntax is great and don’t really see the problem. Maybe with huge datasets the run time becomes too long?


#3

Thanks for the help! I should have read the docs on the priors more closely, but that looks like it does exactly what I wanted.

I don’t think it should be a problem, and if it is I’ll report back. I think the nonlinear formula interface is wonderful too! It eliminates one of the major pain points of Stan for me: the tradeoff between flexibility in specifying the model and ease of generalizing to mixed-effects designs.


#4

FWIW, I think most of the work is being done by setting the priors appropriately. Even with x uncentered, it runs fast and without divergent transitions.


#5

Would it not be possible to write this as a mixture of Bernoullis?

Something like this:

mix <- mixture(bernoulli("probit"), bernoulli("probit"))
# mu1: on-task component (depends on x); mu2: constant lapse component
fit_mix <- brm(bf(y ~ 1, mu1 ~ x, mu2 ~ 1),
               data = dat, 
               family = mix)

I haven’t really worked this out; it’s purely based on intuition.
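
If it does work, something along these lines might keep the second component interpretable as a (rare) lapse process. This is an untested sketch: I’m assuming brms accepts a dirichlet prior on the mixing proportions via class "theta", and prior_mix/fit_mix2 are just illustrative names (mix and dat are reused from above).

prior_mix <- c(
  prior(normal(0, 5), class = "b", dpar = "mu1"),
  # put most prior mass on small theta2, i.e. lapses should be rare
  prior(dirichlet(9, 1), class = "theta")
)
fit_mix2 <- brm(bf(y ~ 1, mu1 ~ x, mu2 ~ 1),
                data = dat, family = mix, prior = prior_mix)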


#6

Oh that’s really clever! I didn’t realize brms supported mixture models, but yes, that’s essentially what this is. I take it mu1 and mu2 are automatically mapped to the means of the two mixture components?


#7

Yes, and theta1 and theta2 are the mixing proportions.
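
For example (assuming the fit above), their posterior summaries can be pulled out with something like:

# mixing proportions appear in the posterior as theta1 and theta2
posterior_summary(fit_mix, pars = "^theta")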


#8

So if this is correct, then matching the asymptotes of the two parameterizations, theta2 should equal guess + lapse (the total probability of an off-task response), with the second component’s success probability equal to guess/(guess + lapse).
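
A quick numerical check of that mapping, using the true values from the simulation above:

# lapsing form:  psi = guess + (1 - guess - lapse) * F(eta)
# mixture form:  psi = (1 - theta2) * F(eta) + theta2 * p2
guess <- 0.02; lapse <- 0.03
theta2 <- guess + lapse        # 0.05: mixing proportion of the lapse component
p2 <- guess / (guess + lapse)  # 0.40: success probability during a lapse
theta2 * p2                    # 0.02 = guess (lower asymptote)
1 - theta2 * (1 - p2)          # 0.97 = 1 - lapse (upper asymptote)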


#9

🙌 Very cool

(please pardon my thumb-typing)


#10

Really nice! Do you have a worked-out example?