Prevalence estimation with an imperfect test

Hi,

I have a binary test outcome in a large sample. I know the test is not perfect and wish to estimate the true prevalence. I use a logistic regression approach, where I marginalize over the true disease status.
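
In other words, the probability of each observed test result, after marginalizing over the unknown true status, is

P(test positive) = pie*sensitivity + (1-pie)*(1-specificity)
P(test negative) = pie*(1-sensitivity) + (1-pie)*specificity

where pie is the true prevalence (this is exactly what the model below computes).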

I have pretty good knowledge about the sensitivity/specificity of the test from earlier research and specify this as a beta prior. I am, however, puzzled by the fact that my sensitivity/specificity posterior is different from my prior. As far as I can see, the data hold no information about the sensitivity/specificity, and still both get updated; in particular, specificity moves all the way up to the boundary of the prior. So I feel my resulting prevalence is based on an overly optimistic specificity.

Can someone explain to me what’s happening here?

data{
  int status[2];
  int Freq[2];
}
     
parameters {
  vector<lower=.5, upper=1>[2] sensspec;
  real<lower=0, upper=1> pie;
}
   
transformed parameters {
  real loglik[2];
    
  for (i in 1:2){
    if (status[i] == 2){          
      loglik[i] = Freq[i]*log(pie*sensspec[1] + (1-pie)*(1-sensspec[2]));
    } else if (status[i] == 1){
      loglik[i] = Freq[i]*log(pie*(1-sensspec[1]) + (1-pie)*sensspec[2]);
    }
  }
}


model {
  pie ~ beta(2,8);
  for (i in 1:2){
     sensspec[i] ~ beta(100, 10);
   }
  
  target += sum(loglik);
}


Hi,
I’ve just run the model and it didn’t appear to behave in an unreasonable way; in particular, the marginal posterior of sensspec seemed to be almost identical to the prior distribution. Could you share the exact input data you use and the output you are getting?

Thanks for looking into that, Martin! It seems to affect specificity more than sensitivity, probably because of the low prevalence and the larger number of non-cases. See the comparison of prior and posterior:

library(rstan)
library(shinystan)
options(mc.cores = parallel::detectCores())
rstan_options(auto_write = TRUE)

data = list(status = 1:2, Freq = c(20000, 2000))

model_code = '
data{
  int status[2];
  int Freq[2];
}

parameters {
  vector<lower=.5, upper=1>[2] sensspec;
  real<lower=0, upper=1> pie;
}

transformed parameters {
  real loglik[2];

  for (i in 1:2){
    if (status[i] == 2){
      loglik[i] = Freq[i]*log(pie*sensspec[1] + (1-pie)*(1-sensspec[2]));
    } else if (status[i] == 1){
      loglik[i] = Freq[i]*log(pie*(1-sensspec[1]) + (1-pie)*sensspec[2]);
    }
  }
}

model {
  pie ~ beta(2,8);
  for (i in 1:2){
    sensspec[i] ~ beta(100, 10);
  }

  target += sum(loglik);
}
'
model = stan_model(model_code = model_code)
samples = sampling(model, data = data)
prior = data.frame(parameter = c("sensspec[1]", "sensspec[2]"), value = rbeta(1000, 100, 10))

stan_hist(samples, "sensspec") + geom_histogram(data = prior, aes(x = value), alpha = .5)

I see - there are two things:

  1. You’ve put a lower bound of 0.5 on your values, so the histogram above is not drawn from the prior you actually use in the model. Using rejection sampling to sample from a beta with a lower bound:
x <- rbeta(1e5, 100, 10); x <- x[x > 0.5]; quantile(x, probs = c(0.025, 0.25, 0.50, 0.75, 0.975))
#      2.5%       25%       50%       75%     97.5% 
# 0.8492021 0.8920933 0.9113891 0.9284150 0.9549340 

This is very close to what I get when running the model with your data:

                mean se_mean    sd     2.5%      25%      50%      75%    97.5% n_eff Rhat
sensspec[1]     0.91    0.00  0.03     0.84     0.89     0.91     0.93     0.96  1705 1.00
sensspec[2]     0.94    0.00  0.01     0.91     0.93     0.94     0.95     0.96   924 1.01
pie             0.03    0.00  0.02     0.01     0.02     0.03     0.04     0.06   912 1.01
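
For reference, a summary like this can be printed with rstan’s print method; the exact call below is my guess, and other rows such as lp__ are omitted:

print(samples, pars = c("sensspec", "pie"))
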
  2. You have a data-prior conflict. Let’s simulate the proportion of positives you would expect given your priors:
N <- 1e5
sensspec <- matrix(rbeta(2 * N, 100, 10), nrow = N, ncol = 2)
pie <- rbeta(N, 2 , 8)
theta <- pie*sensspec[,1] + (1-pie)*(1-sensspec[,2])

quantile(theta, probs = c(0.025, 0.25, 0.50, 0.75, 0.975))
#     2.5%       25%       50%       75%     97.5% 
# 0.1070964 0.1823829 0.2434531 0.3207248 0.4977559 

So the observed positive frequency of roughly 0.09 is quite unlikely a priori. Something has to give, so the model does the “rational” thing: it updates its beliefs about all the parameters. That is, if you believe a disease is reasonably common (around 1/5 of the population), you test with sensitivity and specificity around 0.9 (which should give you a ton of false positives), and you get unexpectedly few positives, you need to start doubting not only that the disease is common but also that your test is working as expected (or, more generally, that your model is correct).
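
To put a rough number on that conflict, continuing from the simulation above:

# prior predictive probability of a positive rate as low as the observed ~0.091
mean(theta < 2000 / (20000 + 2000))

Since the 2.5% quantile of theta above is about 0.107, this comes out well below 2.5%.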

In that sense, the model behaves as expected. The correct reaction is not necessarily to update your beliefs about sensitivity and specificity, but to question whether those beliefs, your prior on prevalence, or other aspects of the data collection are correctly expressed in the model.

Does that make sense?

Finally, the model as you wrote it is not very idiomatic Stan and it took me a while to understand. For inspiration, here’s how I would write the same model, taking advantage of Stan’s built-in functions and a slightly simpler form of input data (there are many ways to write this; I don’t claim this is the best or definitive version):

data{
  int N_total;
  int N_positive;
}

parameters {
  vector<lower=.5, upper=1>[2] sensspec;
  real<lower=0, upper=1> pie;
}

transformed parameters {
  real theta = pie*sensspec[1] + (1-pie)*(1-sensspec[2]);
}

model {
  pie ~ beta(2,8);
  sensspec ~ beta(100, 10); //This gives all elements of sensspec the same prior, no need for a loop
  
  N_positive ~ binomial(N_total, theta);
}
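
With your data, the input for this version would be something like the following (assuming I read your coding correctly, i.e. status == 2 with Freq == 2000 are the positives, and that model_code now holds this program):

data = list(N_total = 20000 + 2000, N_positive = 2000)
samples = sampling(stan_model(model_code = model_code), data = data)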

That’s very instructive, thanks for the clarification!

Glad to have helped.

I just realized there is an even simpler way of putting it: if your test has a specificity of roughly 0.9, you would expect around 2000 positives and 20000 negatives even if the true prevalence were exactly 0. So given your data, the specificity basically has to be higher than about 0.9.
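
As a quick back-of-the-envelope check with your counts:

# observed positive rate
2000 / (20000 + 2000)   # about 0.091
# if the true prevalence were exactly 0, all of these would be false positives,
# implying a specificity of about 1 - 0.091 = 0.909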


That’s a great explanation for the observed behaviour - there is actually some information (a lower bound) on the specificity of the test in the data. Thanks for pointing this out!