Chance-corrected beta binomial model for signal detection theory analysis of replicated n-AFC discrimination test

I would like to analyze data from a replicated n-alternative forced choice (n-AFC) discrimination test with brms in R.
There are frequentist ways of doing this signal detection theory / sensometrics analysis (e.g., in the sensR package), but I would like to try the Bayesian way.

Sample data

For N participants (pid = 1, …, N) each doing m n-AFC trials (0 ≤ correct_responses ≤ m), the data would look like this:

pid,correct_responses,covariate
  1,                1,       58
  2,                0,       70
  3,                m,       15
...
  N,                5,       38

This type of data tends to be overdispersed.

Overdispersion with the beta binomial distribution

In general, overdispersion can be modelled using a beta binomial distribution, where the binomial probability (per participant) is assumed to be beta distributed in the interval (0, 1).
As the beta binomial distribution is implemented in Stan, the model in brms would probably look something like this:

fit <- brm(
  correct_responses | trials(m) ~ 1 + covariate + (1 | pid),
  family = beta_binomial(),
  ...
)
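
For concreteness, data of this kind could be simulated roughly as follows before passing it to the call above (all numbers, and the Beta parameters, are arbitrary choices for illustration):

# Illustrative simulation: per-participant success probabilities drawn from a
# Beta distribution yield binomial counts that are overdispersed relative to a
# single shared probability. All values below are arbitrary.
set.seed(1)
N <- 60                                # participants
m <- 6                                 # n-AFC trials per participant
dat <- data.frame(
  pid       = seq_len(N),
  m         = m,
  covariate = round(runif(N, 10, 80))
)
p <- rbeta(N, shape1 = 2, shape2 = 3)  # per-participant probability in (0, 1)
dat$correct_responses <- rbinom(N, size = m, prob = p)
# dat can then be supplied via data = dat in the brm() call above.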

The problem

Some authors (e.g., Brockhoff, 2003; Meyners, 2007; Morrison, 1978) argue that the beta binomial model should be adapted to replicated n-AFC tests by correcting for the guessing probability p_0 = \tfrac{1}{n} because the binomial probability is distributed in the interval (p_0, 1) (see Bi, 2015, p. 283 and Chapter 10 in general).
The chance-corrected beta binomial model has the following pdf (Bi, 2015, p. 288, Equation 10.2.24):

P\left(x|m,a,b,p_0\right)=\frac{\left(1-p_0\right)^m}{\textrm{B}\left(a,b\right)}\binom{m}{x}\sum\limits_{i=0}^{x}\binom{x}{i}\left(\frac{p_0}{1-p_0}\right)^{x-i}\textrm{B}\left(a+i,m+b-x\right)

where \textrm{B}\left(\cdot{},\cdot{}\right) is the Beta function, m is the number of n-AFC trials, x is the number of correct responses, a,b>0 are the parameters of the distribution, and p_0 is the guessing probability.
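
To sanity-check the formula (and any later Stan implementation against it), here is a minimal, untested sketch of the pmf in plain R; the function and argument names are my own, not from sensR:

# Untested sketch of Equation 10.2.24 in base R, computed on the log scale.
dccbb <- function(x, m, a, b, p0, log = FALSE) {
  i <- 0:x
  # log of each summand: log C(x, i) + (x - i) log(p0 / (1 - p0)) + log B(a + i, m + b - x)
  log_terms <- lchoose(x, i) + (x - i) * (log(p0) - log1p(-p0)) +
    lbeta(a + i, m + b - x)
  # log-sum-exp over the summands
  mx <- max(log_terms)
  log_sum <- mx + log(sum(exp(log_terms - mx)))
  log_p <- m * log1p(-p0) - lbeta(a, b) + lchoose(m, x) + log_sum
  if (log) log_p else exp(log_p)
}

# Quick check: the pmf should sum to (numerically) 1 over x = 0, ..., m
sum(sapply(0:6, dccbb, m = 6, a = 2, b = 3, p0 = 1/3))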


My question

Can I use the existing beta_binomial() family from brms/Stan and correct for p_0 somewhere else (e.g., in the formula of brm()) to model my data, or do I need a custom family representing the chance-corrected beta binomial?


References

Bi, J. (2015). Sensory discrimination tests and measurements: Sensometrics in sensory evaluation (2nd ed.). Wiley Blackwell.
Brockhoff, P. B. (2003). The statistical power of replications in difference tests. Food Quality and Preference, 14(5–6), 405–417.
Meyners, M. (2007). Proper and improper use and interpretation of Beta-binomial models in the analysis of replicated difference and preference tests. Food Quality and Preference, 18(5), 741–750.
Morrison, D. G. (1978). A probability model for forced binary choices. The American Statistician, 32(1), 23–25.


Is there a reason why you prefer to use brms?

If I understand correctly, you have the full likelihood right there: the probability of the discrete observation x under m trials, with Beta function parameters a, b and correction probability p_0. I'd assume that from there it would be straightforward to implement it in Stan. I don't know whether it's possible to implement the correction in brms, but even if it is, the direct Stan route may be straightforward enough to solve your problem.
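
For illustration, here is a rough, untested sketch of how that likelihood could be written as a Stan function (the function name and argument order are arbitrary); keeping it in an R string is just one way to store it until it is pasted into the functions block of a Stan program:

# Untested sketch: Equation 10.2.24 as a Stan function, kept as an R string.
stan_funs <- "
  real chance_corrected_beta_binomial_lpmf(int x, int m, real a, real b, real p0) {
    vector[x + 1] lt;
    for (i in 0:x) {
      lt[i + 1] = lchoose(x, i) + (x - i) * (log(p0) - log1m(p0))
                  + lbeta(a + i, m + b - x);
    }
    return m * log1m(p0) - lbeta(a, b) + lchoose(m, x) + log_sum_exp(lt);
  }
"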


Hi caesoma, thank you for your reply.
I have worked with R, but not with Stan. Before I use Stan directly or implement something myself, I want to make sure I really need to.
If I cannot use existing functionality, I'll give your suggestion of implementing it directly in Stan a go.

That's fair. I'm not a brms user myself, so I cannot be as helpful with that. But there are many users of brms, and maybe they can give you more specific input. brms is a great tool, and it can be more convenient than writing the Stan language directly, but it also lets you skip some details that can be helpful for understanding the full model, and it may not allow some tweaks (or not as easily).

My personal view is that the Stan language itself is normally not much of a barrier for anyone with some programming skills, especially if you are fairly comfortable with R (C is another language it resembles in some respects). The bigger hurdle is specifying the model components from the ground up, the likelihood usually being the most complicated piece. In addition to the forum, Stan now has a language server and editor plugins, and the widespread LLM assistants will probably also be able to help with writing and debugging a Stan program.

Again, that's not to say that one or the other is inherently better, but I think users of more “friendly” interfaces shouldn't be discouraged from using the Stan language directly.