Using BRMS when the dichotomous outcome is measured with error

I’m a veterinary researcher and an enthusiastic (though beginner) user of Bayesian GLMMs with the brms package. In my clinical context I need to deal with a binary outcome that is measured with an imperfect test. I work on cattle pneumonia, and my test (chest ultrasound) is neither 100% sensitive nor 100% specific (the Se and Sp could be given beta priors).
I have traditionally used OpenBUGS models, which are sometimes a pain to adapt compared with the simple brms syntax.
For this reason I would like to ask whether this kind of imperfect outcome can be implemented in brms.

Do you mean a beta-binomial model? An explanation of how to fit this model in brms can be found at https://cran.r-project.org/web/packages/brms/vignettes/brms_customfamilies.html

Yes, the model I want is for a binomial event.
However, the event I measure is a proxy for my condition of interest, and it is not 100% accurate.
I study a disease (for instance pneumonia in calves) that is difficult to define per se, so we use an imperfect test (e.g. chest ultrasound) to assess the calves’ status.
So in my dataset the observed Y is the chest ultrasound result (1 or 0), and I want to estimate regression coefficients for the true disease status while accounting for the imperfection of Y.
I know that a positive chest ultrasound result is a mix of true and false positives. In OpenBUGS I would write my model like this:

model {
  for (i in 1:N) {
    y[i] ~ dbern(q[i])                            # test result of individual i
    q[i] <- pi[i] * Se + (1 - pi[i]) * (1 - Sp)   # probability of a positive test
    logit(pi[i]) <- model                         # "model" stands for the linear predictor
  }
  # priors for the test accuracy parameters Se and Sp
  Se ~ dbeta(a, b)
  Sp ~ dbeta(c, d)
}

So in a first step we define how the observed event y (the test result) is linked to the condition of interest through the test accuracy. We put informative priors on the test accuracy and then fit the model that links the true disease status to the regressors, given the test result.

The framework was initially described in McInturff et al., Modelling risk when binary outcomes are subject to error. Stat Med 2004;23:1095–1109.
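
To make the relation between the true disease probability and the probability of a positive test concrete, here is a small base R sketch. The values of Se, Sp and p are hypothetical and chosen only for illustration; they are not estimates from this thread.

```r
# Hypothetical test accuracy, for illustration only
Se <- 0.80   # sensitivity
Sp <- 0.90   # specificity

# Hypothetical true disease probabilities
p <- c(0.05, 0.20, 0.50)

# Probability of a positive test: true positives plus false positives
q <- p * Se + (1 - p) * (1 - Sp)
print(q)

# Inverting the same relation recovers the true probabilities
p_back <- (q + Sp - 1) / (Se + Sp - 1)
print(p_back)
```

With an imperfect test, the probability of testing positive is pulled away from the true disease probability, which is why regressing directly on the test result would bias the coefficients.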

I think the following could work, but I haven’t tested it myself yet.

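bform <- bf(
  y ~ q,
  nlf(q ~ p * Se + (1 - p) * (1 - Sp)),
  nlf(p ~ inv_logit(model)),  # `model` stands for the linear predictor
  Se ~ 1,
  Sp ~ 1,
  nl = TRUE,
  # identity link, because q is already a probability
  family = bernoulli(link = "identity")
)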
  
bprior <- set_prior("beta(a, b)", nlpar = "Se", lb = 0, ub = 1) +
  set_prior("beta(c, d)", nlpar = "Sp", lb = 0, ub = 1)
  
fit <- brm(bform, data = data, prior = bprior, ...)

Sorry for the delay between your answer and my reply. Thank you very much for your valuable help.
I tried to implement your formula, but I still have some problems when running the model.
My dataset looks like this:
Cluster Temp X_1 X_2 X_3 X_4 X_5 Test
1 39.1 0 0 0 0 0 0
1 38.7 1 0 0 0 0 0
1 38.9 0 0 0 0 0 0
1 38.1 0 0 0 0 0 1
1 38.9 0 0 0 0 0 0
1 39.1 0 0 0 0 0 0
1 38.3 0 0 1 0 0 0
1 38 0 0 0 0 0 0
1 39 1 0 0 0 0 0
1 38.4 0 1 0 0 0 0
1 38.8 0 0 0 0 0 0
1 38.5 0 0 0 0 0 0

Sorry for sending this incomplete message.
The variable Cluster is an ID for the farm the animals come from, X_1 to X_5 are binary variables (0/1), and Test is the imperfect binary test (0/1) used to assess the probability of being sick (p).
Following your suggestion, I then tried to implement the model below for the first two covariates, X_1 and X_2:
bform <- bf(
  #Consol ~ y,
  Test ~ y,
  nlf(y ~ p * Se + (1 - p) * (1 - Sp)),
  nlf(p ~ inv_logit(Temp + X_1 + X_2 + (1|Cluster))),
  nlf(logit(p) ~ Temp + X_1 + X_2 + (1|farm)),
  Se ~ 1,
  Sp ~ 1,
  nl = TRUE,
  family = bernoulli())
#then adding priors based on literature findings

bprior <- set_prior("beta(27.02, 7.92)", nlpar = "Se", lb = 0, ub = 1) +
  set_prior("beta(80.58, 6.08)", nlpar = "Sp", lb = 0, ub = 1) #priors from published papers

#then running the final model:
fit <- brm(bform, data = data, prior = bprior)

I obtain the following error message:
SYNTAX ERROR, MESSAGE(S) FROM PARSER:
error in ‘model24681a526818_file24685a48377’ at line 39, column 60

37:   for (n in 1:N) {
38:     // compute non-linear predictor values
39:     nlp_p[n] = inv_logit(C_p_1[n] + C_p_2[n] + C_p_3[n] + (1|Cluster));
                                                               ^
40:   }

PARSER EXPECTED: “)”
Error in stanc(model_code = paste(program, collapse = “\n”), model_name = model_cppname, :
failed to parse Stan model ‘file24685a48377’ due to the above error.
In addition: Warning message:
Rows containing NAs were excluded from the model.
SYNTAX ERROR, MESSAGE(S) FROM PARSER:
error in ‘model246867a22943_file24685a48377’ at line 39, column 60

37:   for (n in 1:N) {
38:     // compute non-linear predictor values
39:     nlp_p[n] = inv_logit(C_p_1[n] + C_p_2[n] + C_p_3[n] + (1|Cluster));
                                                               ^
40:   }

PARSER EXPECTED: “)”
Error in stanc(model_code = paste(program, collapse = “\n”), model_name = model_cppname, :
failed to parse Stan model ‘file24685a48377’ due to the above error.

I’m not able to work out what is wrong from this message.

Thanks in advance for your advice!

Sorry for the late reply. In the formulas you need to separate what should be taken literally (this goes into a non-linear formula) from what should be preprocessed by brms (e.g., (1 | Cluster); this goes into a standard formula).
Using (1 | Cluster) inside a non-linear formula therefore cannot yield a reasonable model. For more details on the specification of non-linear models, see vignette("brms_nonlinear"), as well as https://arxiv.org/abs/1905.09501 for examples.
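
A minimal, untested sketch of how that separation could look for the model in this thread. Several things here are assumptions rather than anything confirmed above: the name `eta` for the logit-scale predictor, the data frame name `dat`, the identity link (chosen so that the right-hand side is used directly as a probability), and the placeholder normal(0, 2) prior on the coefficients of `eta`.

```r
library(brms)

bform <- bf(
  # literal (non-linear) part: probability of a positive test
  Test ~ Se * inv_logit(eta) + (1 - Sp) * (1 - inv_logit(eta)),
  # standard formulas, preprocessed by brms (group-level terms go here)
  eta ~ Temp + X_1 + X_2 + (1 | Cluster),
  Se ~ 1,
  Sp ~ 1,
  nl = TRUE,
  # assumption: identity link, since the formula already returns a probability
  family = bernoulli(link = "identity")
)

# Se and Sp priors as given in the thread; the prior on eta is a placeholder
bprior <- set_prior("beta(27.02, 7.92)", nlpar = "Se", lb = 0, ub = 1) +
  set_prior("beta(80.58, 6.08)", nlpar = "Sp", lb = 0, ub = 1) +
  set_prior("normal(0, 2)", nlpar = "eta")

# `dat` is a hypothetical data frame with columns Test, Temp, X_1, X_2, Cluster
# fit <- brm(bform, data = dat, prior = bprior)
```

Because the intercept of `eta` is not centred in a non-linear formula, centring Temp before fitting would probably make a prior like the placeholder above more sensible.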