Priors for sensitivity and specificity in logistic regression model for diagnostic test

I am trying to find the sensitivity and specificity of a diagnostic test compared to a gold standard, using logistic regression modeling as described here: The logistic modeling of sensitivity, specificity, and predictive value of a diagnostic test - ScienceDirect
I wanted a Bayesian modeling solution because it will allow me to obtain uncertainty intervals of the estimate, set priors, and potentially include other variables in the model. (The article is old, so if there are some flaws in this method, or a better way, please let me know!)
In the code below, I set up a small example and test that I recover the sensitivity and specificity from the model. For the uncertainty interval, I use the shortest posterior density interval, using code taken directly from the @andrewgelman and @Bob_Carpenter paper: diagnostic-testing/rstan-santa-clara.R at master · bob-carpenter/diagnostic-testing · GitHub

#This code calculates the sensitivity and specificity comparing a test to a gold standard using logistic regression.
#The formulas were taken from https://www.sciencedirect.com/science/article/pii/089543569290180U?via%3Dihub

library(brms)

#set up data and get sensitivity and specificity. 
# 0 = negative test and 1 = positive test. test = diagnostic test. gs = gold standard
test <- c(0,0,0,1,0,0,1,0,1,1,0,0,0,1,0,0,0,0,0,0)
gs <-   c(1,0,0,1,0,0,1,0,0,1,0,0,0,1,0,0,0,0,0,1)
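#confusion-matrix counts (check with table(gs, test)): 4 true positives, 2 false negatives, 13 true negatives, 1 false positive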
sens <- 4/(4+2)
spec <- 13/(13+1)
data <- data.frame(test,gs)

#logistic model. The test is the outcome variable and the gold standard is the predictor
m1 <- brm(test ~ gs, data=data, family=bernoulli)

#extract posterior samples
samp1 <- posterior_samples(m1)
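#note: posterior_samples() is deprecated in more recent brms versions; as_draws_df(m1) is the current equivalent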

#define function for shortest posterior interval
spin <- function(x, lower=NULL, upper=NULL, conf=0.95){
  x <- sort(as.vector(x))
  if (!is.null(lower)) {
    if (lower > min(x)) stop("lower bound is not lower than all the data")
    else x <- c(lower, x)
  }
  if (!is.null(upper)) {
    if (upper < max(x)) stop("upper bound is not higher than all the data")
    else x <- c(x, upper)
  }
  n <- length(x)
  gap <- round(conf*n)
  width <- x[(gap+1):n] - x[1:(n-gap)]
  index <- min(which(width==min(width)))
  x[c(index, index + gap)]
}

#calculate sensitivity from the model and compare to the raw calculation of sensitivity
#equation 2 from the linked paper
m1_sens <- 1/(1 + exp(-(samp1$b_Intercept + 1*samp1$b_gs)))
m1_sens_mean <- mean(m1_sens)
m1_sens_mean
print(spin(m1_sens, lower=0, upper=1, conf=0.95))
hist(m1_sens,breaks=100)
sens

#calculate specificity from the model and compare to the raw calculation of specificity
#equation 3 from the linked paper
m1_spec <- 1 - (1/(1 + exp(-(samp1$b_Intercept + 0*samp1$b_gs))))
m1_spec_mean <- mean(m1_spec)
m1_spec_mean
print(spin(m1_spec, lower=0, upper=1, conf=0.95))
hist(m1_spec,breaks=100)
spec

The model recovers values for sensitivity and specificity that are close to the raw calculation.

I would like to set priors for sensitivity and specificity in this model. If (!) I did the algebra correctly, then for this simple model, where the outcome variable is the binary diagnostic test and the predictor is the binary (0/1) gold standard, the paper I linked first (equations 2 and 3) implies that the intercept in the model is related to specificity in this way:

alpha = -log( (1/(1-specificity)) - 1 )

and the coefficient for the predictor is related to specificity and sensitivity in this way:

beta = log( (1/(1-specificity)) - 1) - log( (1/sensitivity) - 1 )

It seems like I should be able to set priors for sensitivity and specificity by setting priors for alpha and beta: I just plug plausible values of sensitivity and specificity into the above formulas and come up with plausible values for alpha and beta.
However, when I plug my model-calculated (or the raw) values for sensitivity and specificity into those formulas, I don’t get my alpha and beta back…

-log( (1/(1-m1_spec_mean)) - 1 )
log( (1/(1-m1_spec_mean)) - 1) - log( (1/m1_sens_mean) - 1 )

-log( (1/(1-spec)) - 1 )
log( (1/(1-spec)) - 1) - log( (1/sens) - 1 )

I feel like I am missing something obvious… maybe something to do with the log transformations in the algebra? If this isn’t the right way, how can I go about setting priors for sensitivity and specificity in this particular simple logistic regression model?
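To be concrete, the kind of thing I have in mind is below (the sensitivity and specificity values and the prior sds are made up purely for illustration, and it assumes brms and the data from above are already loaded):

#plausible central values for specificity and sensitivity (made up for illustration)
spec_guess <- 0.90
sens_guess <- 0.70

#translate them into central values for alpha (Intercept) and beta (gs coefficient)
#qlogis() is the logit, so alpha = logit(1 - spec) and beta = logit(sens) - alpha
alpha_guess <- qlogis(1 - spec_guess)
beta_guess  <- qlogis(sens_guess) - alpha_guess

#normal priors centered at those values (sd = 1 chosen arbitrarily)
priors <- c(
  set_prior(sprintf("normal(%.2f, 1)", alpha_guess), class = "Intercept"),
  set_prior(sprintf("normal(%.2f, 1)", beta_guess), class = "b", coef = "gs")
)

m2 <- brm(test ~ gs, data=data, family=bernoulli, prior=priors)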

Thanks!


Hello there,

I don’t have an exact answer to your question, but I will point out some things I’ve faced in my research (I’m still starting in this area).

These articles discuss the problem of estimating sensitivity and specificity: https://pubmed.ncbi.nlm.nih.gov/28474394/, https://pubmed.ncbi.nlm.nih.gov/17098577/, and https://malariajournal.biomedcentral.com/articles/10.1186/s12936-015-0966-y.

The first two papers deal with estimation in the meta-analysis context, but I think the models could fit your case. The third paper is very similar to what you are doing. They may be good references to start with.

Your algebra is correct. It can be simplified to

spec = 1 - \operatorname{expit}(\alpha) \implies \alpha = \operatorname{logit}(1-spec)

and

sens = \operatorname{expit}(\alpha + \beta) \implies \alpha + \beta = \operatorname{logit}(sens).

The problem with the logit transformation is that it does not map the mean in log-odds space to the mean in probability space. I think a good reference is this website for the logit-normal distribution. The problem gets harder because the moments of that distribution do not have a closed form. I’m saying this because, if you place normal priors on the coefficients, there is no closed-form transformation to the probability space, as far as I know. A good thing to do is prior predictive checks, that is, simulate from different prior specifications and check the implied specificity and sensitivity.
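For example, a quick prior predictive check in base R (just drawing from candidate normal priors and pushing the draws through the inverse logit; the prior means and sds below are made up for illustration) could look like this:

set.seed(1)
n_draws <- 1e5

#candidate priors for alpha (Intercept) and beta (gs coefficient)
alpha <- rnorm(n_draws, mean = -2.2, sd = 1)
beta  <- rnorm(n_draws, mean = 3.0, sd = 1)

#implied priors on specificity and sensitivity (plogis() is the expit)
prior_spec <- 1 - plogis(alpha)
prior_sens <- plogis(alpha + beta)

hist(prior_spec, breaks=100)
hist(prior_sens, breaks=100)
quantile(prior_spec, c(0.025, 0.5, 0.975))
quantile(prior_sens, c(0.025, 0.5, 0.975))

If the implied intervals look implausible for your test, adjust the means and sds and repeat.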

Sorry for not solving your problem, but I hope something can help you out!


Thanks for the response! I was wondering if it might be something along the lines of your last paragraph… I often do prior predictive checks, but usually I have some principled reasons for proposing a prior and then I check it. I guess I can get in the ballpark with the relationships between sensitivity and specificity and alpha and beta, and then use prior predictive checks…
