brms custom formula for logistic regression

Hello,

I am new to brms and, to a large extent, to Bayesian data analysis and would appreciate some guidance on the following topic.

I wish to analyse data collected in a psychophysical experiment in which people listen to speech sounds that vary along a certain acoustic dimension (x) and categorize each sound into one of two categories, c1 or c2. The standard approach to analyzing data of this kind is to perform a logistic regression, with a formula such as `resp ~ 1 + x + (1 + x | participant)`, where resp is the participant’s binary response to each sound and x is that sound’s value on the acoustic dimension.

I would like to define a custom formula in which the slope and intercept are expressed as functions of the means and standard deviation of two Gaussian distributions, N(x; m1, sd) and N(x; m2, sd), associated with the two categories (with the assumption that both distributions have the same standard deviation sd). Specifically, taking the log-odds of c1 over c2, the slope would be defined as (m1 - m2) / sd^2. Likewise, the intercept would be defined as (m2^2 - m1^2) / (2 * sd^2).
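As a quick sanity check on these expressions, here is a minimal base R sketch (not brms code) that compares the log-odds obtained directly from the two Gaussian densities with the linear form. The parameter values are hypothetical, and the sketch assumes equal prior probability for the two categories and log-odds taken for c1 over c2; under that convention the intercept comes out as (m2^2 - m1^2) / (2 * sd^2).

```r
# Hypothetical parameter values, chosen only for illustration.
m1 <- 2.0   # mean of category c1 along x
m2 <- -1.0  # mean of category c2 along x
s  <- 1.5   # shared standard deviation

# Slope and intercept of the log-odds of c1 versus c2,
# assuming equal prior probability for the two categories.
slope     <- (m1 - m2) / s^2
intercept <- (m2^2 - m1^2) / (2 * s^2)

x <- seq(-4, 4, by = 0.5)

# Log-odds computed directly from the two Gaussian densities.
log_odds_direct <- dnorm(x, m1, s, log = TRUE) - dnorm(x, m2, s, log = TRUE)

# Log-odds computed from the linear predictor.
log_odds_linear <- intercept + slope * x

# The two should agree up to floating-point error.
stopifnot(isTRUE(all.equal(log_odds_direct, log_odds_linear)))

# Probability of a c1 response at each x (logistic function of the log-odds).
p_c1 <- plogis(log_odds_linear)
```

The category boundary (where `p_c1` is 0.5) falls at `-intercept / slope`, which is the midpoint `(m1 + m2) / 2`.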

In addition to this, I want to set priors for the means m1, m2 and the variance sd^2, in the form of a normal inverse chi-squared distribution, and to get samples from the posterior distributions of these parameters.
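To make the prior concrete, here is a minimal base R sketch of drawing from a normal inverse chi-squared distribution for one category. It is not brms syntax, and the hyperparameter names and values (`mu0`, `kappa0`, `nu0`, `sigma0`) are assumptions chosen only for illustration. The variance is drawn from a scaled inverse chi-squared distribution, and the mean is then drawn conditionally on that variance.

```r
set.seed(1)

# Hypothetical hyperparameters, chosen only for illustration.
mu0    <- 0    # prior guess for the category mean
kappa0 <- 1    # prior pseudo-count attached to the mean
nu0    <- 5    # prior degrees of freedom for the variance
sigma0 <- 1.5  # prior guess for the standard deviation

n_draws <- 4000

# sigma^2 ~ scaled-inverse-chi-squared(nu0, sigma0^2)
sigma2 <- nu0 * sigma0^2 / rchisq(n_draws, df = nu0)

# mu | sigma^2 ~ Normal(mu0, sigma^2 / kappa0)
mu <- rnorm(n_draws, mean = mu0, sd = sqrt(sigma2 / kappa0))

# Prior summaries for the mean and the standard deviation.
quantile(mu, c(0.025, 0.5, 0.975))
quantile(sqrt(sigma2), c(0.025, 0.5, 0.975))
```

Note that the mean and variance are dependent under this prior: larger draws of `sigma2` widen the conditional distribution of `mu`.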

(In essence, I am trying to implement in brms the Bayesian model of speech perception developed by Kleinschmidt & Jaeger, 2015.)

I’m a speech scientist, not a statistician, and any help on how to do this in brms would be greatly appreciated.

What is the likelihood supposed to be in this model? Is it entirely driven by fitting the normal distributions to the covariate data `x`, such that the slopes and intercepts are just generated quantities? Or do we compute the slope and intercept, then use the regular logistic regression likelihood for the data?

Relatedly, where do you want the random effects by participant to enter your model specification? Do we place random effects by participant on the parameters of the normal distributions and then compute the participant-specific intercepts and slopes (and if so, which parameters get a random effect: just the means, or the sd as well)? Or do we compute the mean intercept and slope via the normal distributions and then place a random effect around those?

Yes, the likelihood is given by the normal distributions along x. As to your second question, I think option 2 would be best. Thanks.

Option 2 on the second question isn’t possible or meaningful if the likelihood is given only by normal distributions along x: fitting random effects in that way requires a model that looks like a regression.

To fit those normal distributions in `brms`, you’d just do

```r
brm(x ~ category, ...)
```

You could optionally add random effects of the first flavor with

```r
brm(x ~ category + (1 + category | participant), ...)
```

or even

```r
brm(
  bf(
    x ~ category + (1 + category | participant),
    sigma ~ 1 + (1 | participant)
  ), ...)
```

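or, for a slightly more flexible model, with the covariance modeled between the random effects on the mean and on the sd (the shared `|ID1|` label tells brms to estimate the participant-level effects from both formulas as correlated)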

```r
brm(
  bf(
    x ~ category + (1 + category |ID1| participant),
    sigma ~ 1 + (1 |ID1| participant)
  ), ...)
```
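To illustrate how the slope and intercept then fall out as derived quantities, here is a minimal sketch for the simplest model, `x ~ category`. It uses `lm()` as a least-squares stand-in for the brms fit so that it runs quickly without Stan. The data are simulated under hypothetical values, and the sign convention assumed is log-odds of c1 over c2 with equal prior probability for the two categories.

```r
set.seed(1)

# Simulated data under hypothetical values (not from the original post).
n  <- 500
m1 <- 2.0; m2 <- -1.0; s <- 1.5
d <- data.frame(
  category = factor(rep(c("c1", "c2"), each = n)),
  x        = c(rnorm(n, m1, s), rnorm(n, m2, s))
)

# Least-squares stand-in for brm(x ~ category, ...):
# two category means with one shared residual sd.
fit <- lm(x ~ category, data = d)

m1_hat <- unname(coef(fit)["(Intercept)"])           # mean of c1 (reference level)
m2_hat <- m1_hat + unname(coef(fit)["categoryc2"])   # mean of c2
s_hat  <- sigma(fit)                                  # shared residual sd

# Derived slope and intercept of the log-odds of c1 versus c2.
slope_hat     <- (m1_hat - m2_hat) / s_hat^2
intercept_hat <- (m2_hat^2 - m1_hat^2) / (2 * s_hat^2)
```

With a brms fit, the same arithmetic could be applied to each posterior draw instead of to point estimates, for example using the `b_Intercept`, `b_categoryc2`, and `sigma` columns of `as_draws_df(fit)`. The exact coefficient name depends on the factor levels, so treat those column names as an assumption about this simulated data set.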