Logistic regression when the dependent variable is measured with error

Dear braintrust,

A couple of years ago I posted about how to identify risk factors associated with a latent disease status that is assessed by a test with imperfect sensitivity and specificity (see original post).

So basically, this is a logistic regression where the outcome (Y) is measured by an imperfect test. In my dataset, the test result is “TUS”, which has imperfect sensitivity (Se) and specificity (Sp).

My predictors (X) are T_rectal, Ear, Eye, Cough, Nasal. My latent dependent variable is the true disease status.
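
Written out, the model described above can be sketched as follows. Here p_i is the probability that animal i is truly diseased and q_i is the probability that its TUS result is positive, matching `p` and `q` in the code; the subscript i and the coefficients β_0 to β_5 are notation introduced here for illustration only.

```latex
\begin{aligned}
\operatorname{logit}(p_i) &= \beta_0 + \beta_1\,\mathrm{T\_rectal}_i + \beta_2\,\mathrm{Ear}_i
  + \beta_3\,\mathrm{Eye}_i + \beta_4\,\mathrm{Cough}_i + \beta_5\,\mathrm{Nasal}_i \\
q_i &= \Pr(\mathrm{TUS}_i = 1) = p_i\,Se + (1 - p_i)\,(1 - Sp) \\
\mathrm{TUS}_i &\sim \mathrm{Bernoulli}(q_i)
\end{aligned}
```

The second line adds the true positives (diseased and detected) to the false positives (not diseased but test-positive), which is why q_i always lies between 1 - Sp and Se.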

My commented code looks like the following:

library(brms)

# First, write the non-linear formula
bform <- bf(
  TUS ~ q, # test result of the individual, which is a Bernoulli event
  nlf(p ~ inv_logit(T_rectal+Ear+Eye+Cough+Nasal)), # logistic model written directly
  nlf(q ~ p * Se + (1 - p) * (1 - Sp)), # q depends on the TP (true positive) and FP (false positive) probabilities
  Se + Sp ~ 1, # Se and Sp are parameters with no data informing them directly (informed only by their priors)
  nl = TRUE, # we use a non-linear formula
  family = bernoulli("identity")
)

bprior <- set_prior("beta(4.62,0.86)", nlpar = "Se", lb = 0, ub = 1) + # prior for Se
  set_prior("beta(77.55,4.4)", nlpar = "Sp", lb = 0, ub = 1) # prior for Sp

summary(brm(bform, data = bd, # dataset is named bd
  prior = bprior, # uses default priors except for Se and Sp
  warmup = 500,
  iter = 3000,
  chains = 4,
  init = "0",
  cores = 2,
  seed = 123))

What I don’t understand is that, when I look at the model output, I only obtain posteriors for two parameters (Se and Sp).

I don’t obtain the regression coefficients for my predictors X (T_rectal, Ear, Eye, Cough, Nasal):

Family: bernoulli
Links: mu = identity
Formula: TUS ~ q
         p ~ inv_logit(T_rectal + Ear + Eye + Cough + Nasal)
         q ~ p * Se + (1 - p) * (1 - Sp)
         Se ~ 1
         Sp ~ 1
Data: bd (Number of observations: 482)
Draws: 4 chains, each with iter = 3000; warmup = 500; thin = 1;
       total post-warmup draws = 10000

Population-Level Effects:
             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Se_Intercept     0.40      0.03     0.33     0.47 1.00     5935     6306
Sp_Intercept     0.96      0.02     0.93     0.99 1.00     5775     4804

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

prior            class coef      group resp dpar nlpar lb ub source
beta(4.62,0.86)  b                               Se    0  1  user
beta(4.62,0.86)  b     Intercept                 Se    0  1  (vectorized)
beta(77.55,4.4)  b                               Sp    0  1  user
beta(77.55,4.4)  b     Intercept                 Sp    0  1  (vectorized)

I’ve struggled for months to find what is wrong with my model, and would appreciate any ideas on how to obtain the coefficients for X and the other characteristics of the logistic part of the model.

Many many thanks in advance

I think the problem is that you don’t specify coefficients in your p part, i.e. it should be something like:

nlf(p ~ inv_logit(Intercept+b_T_rectal*T_rectal+b_Ear*Ear+b_Eye*Eye+b_Cough*Cough+b_Nasal*Nasal))

Thanks for this suggestion.
I thought the b coefficients were included by default in my model.
I tried changing the line as suggested, but the code still does not run and gives this error message:

Error: The following variables can neither be found in ‘data’ nor in ‘data2’:
‘Intercept’, ‘b_T_rectal’, ‘b_Ear’, ‘b_Eye’, ‘b_Cough’, ‘b_Nasal’

I’ve tried putting priors on these coefficients with set_prior(), but it still does not run.

I realized something while reading another nlf() example. The coefficients also have to be declared as parameters, like this:

bform <- bf(
  TUS ~ q, # test result of the individual which is a Bernoulli event
  nlf(p ~ inv_logit(Intercept+b_T_rectal*T_rectal+b_Ear*Ear+b_Eye*Eye+b_Cough*Cough+b_Nasal*Nasal)), # logistic model written directly
  nlf(q ~ p * Se + (1 - p) * (1 - Sp)), # q depends on the TP (true positive) and FP (false positive) probabilities
  Se + Sp ~ 1, # put the Se and Sp as value to add in the model but with no data from them (only derived from the prior)
  Intercept + b_T_rectal + b_Ear + b_Eye + b_Cough + b_Nasal ~ 1,
  nl = TRUE, # we use a non linear formula
  family = bernoulli("identity")
)
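
An alternative that avoids writing one parameter per predictor is to give the whole linear predictor a single non-linear parameter with its own formula. The block below is only a sketch of that idea, not the fix used in this thread: the names `eta`, `bform_eta`, `bprior_eta` and `fit_tus` are hypothetical, and the `normal(0, 2)` prior on the logit-scale coefficients is an assumption chosen for illustration. To my recollection brms rejects non-linear parameter names containing dots or underscores, which is worth checking if names such as `b_T_rectal` raise an error; `eta` sidesteps that.

```r
library(brms)

# Same structure as the formula above, but the logistic part is a single
# non-linear parameter `eta` that gets an ordinary linear formula.
bform_eta <- bf(
  TUS ~ q,                                     # observed (imperfect) test result
  nlf(p ~ inv_logit(eta)),                     # latent probability of true disease
  nlf(q ~ p * Se + (1 - p) * (1 - Sp)),        # probability of a positive test
  eta ~ T_rectal + Ear + Eye + Cough + Nasal,  # coefficients are estimated here
  Se + Sp ~ 1,                                 # informed only by their priors
  nl = TRUE,
  family = bernoulli("identity")
)

bprior_eta <-
  set_prior("beta(4.62,0.86)", nlpar = "Se", lb = 0, ub = 1) + # prior for Se (from the thread)
  set_prior("beta(77.55,4.4)", nlpar = "Sp", lb = 0, ub = 1) + # prior for Sp (from the thread)
  set_prior("normal(0, 2)", nlpar = "eta")                     # assumed prior, illustration only

# Hypothetical helper wrapping the sampler settings used in the original post.
fit_tus <- function(data) {
  brm(bform_eta, data = data, prior = bprior_eta,
      warmup = 500, iter = 3000, chains = 4,
      init = "0", cores = 2, seed = 123)
}
```

With this specification, `fit <- fit_tus(bd)` followed by `fixef(fit)` should list rows such as `eta_Intercept` and `eta_T_rectal` alongside `Se_Intercept` and `Sp_Intercept`; exponentiating the `eta_` rows gives odds ratios for the true disease status.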

Many thanks,
I now better understand the nlf() syntax.
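
For anyone wanting to check that a specification like the one above recovers the regression coefficients, one option is to fit it to simulated data with known values first. The sketch below uses base R only; every number in it (the coefficients, Se, Sp, and the predictor distributions) is an assumption made up for illustration and not taken from the real dataset, and `bd_sim` is a hypothetical stand-in for `bd`.

```r
set.seed(123)
n <- 482  # same number of observations as reported in the summary

inv_logit <- function(x) 1 / (1 + exp(-x))

# Assumed "true" values, chosen only for illustration
beta <- c(Intercept = -1.0, T_rectal = 0.8, Ear = 0.5,
          Eye = 0.3, Cough = 1.0, Nasal = 0.6)
Se <- 0.80
Sp <- 0.95

# Hypothetical predictors: centred rectal temperature and four binary signs
bd_sim <- data.frame(
  T_rectal = rnorm(n, mean = 0, sd = 0.5),
  Ear      = rbinom(n, 1, 0.3),
  Eye      = rbinom(n, 1, 0.2),
  Cough    = rbinom(n, 1, 0.25),
  Nasal    = rbinom(n, 1, 0.4)
)

# Latent true disease status from the logistic part of the model
X <- cbind(Intercept = 1, as.matrix(bd_sim))
p <- inv_logit(drop(X %*% beta[colnames(X)]))
D <- rbinom(n, 1, p)

# Probability of a positive test, as in the q formula
q <- p * Se + (1 - p) * (1 - Sp)

# Observed imperfect test: positive with probability Se if diseased,
# and with probability 1 - Sp if not diseased
bd_sim$TUS <- rbinom(n, 1, ifelse(D == 1, Se, 1 - Sp))

# Compare true and apparent prevalence in the simulated sample
print(c(true_prevalence = mean(D), apparent_prevalence = mean(bd_sim$TUS)))
```

Passing `bd_sim` to `brm()` in place of `bd` then shows whether the posterior for each coefficient covers the value set in `beta`.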