Gibbs Sampling in Stan

maxbiostat · June 16, 2020, 10:41pm

Using the notation from Bob and Andrew’s paper, let \gamma_i and \delta_i be the specificity and sensitivity of test i, respectively, and let \pi be the prevalence. Then the probability of a positive result on test i,

p_i = (1-\gamma_i)(1-\pi) + \delta_i\pi,

is straightforward to implement in Stan. This assumes the results of the tests are conditionally independent given the parameters.
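As a rough numerical sketch of that formula (in Python rather than Stan, so the arithmetic is easy to check by hand), the snippet below computes p_i for two tests and sums independent binomial log-likelihoods. The prevalence, accuracies, and counts are made-up illustration values, and the function names are hypothetical.

```python
import math

def positive_prob(prevalence, specificity, sensitivity):
    """P(test is positive): false positives among the disease-free
    plus true positives among the diseased."""
    return (1.0 - specificity) * (1.0 - prevalence) + sensitivity * prevalence

def binomial_log_lik(y, n, p):
    """Log binomial pmf of y positives out of n tests with probability p."""
    log_choose = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    return log_choose + y * math.log(p) + (n - y) * math.log1p(-p)

# Hypothetical values, for illustration only.
prevalence = 0.10
specificity = [0.95, 0.99]   # gamma_i
sensitivity = [0.80, 0.70]   # delta_i
positives = [14, 8]          # positives observed for each test
totals = [100, 100]          # number of times each test was run

p = [positive_prob(prevalence, g, d) for g, d in zip(specificity, sensitivity)]

# Summing over tests relies on the conditional independence assumption.
log_lik = sum(binomial_log_lik(y, n, p_i)
              for y, n, p_i in zip(positives, totals, p))
```

In a Stan program the same expression for p_i would sit in the model or transformed parameters block, with priors on \pi, \gamma_i, and \delta_i.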

Bob_Carpenter · June 29, 2020, 9:48pm

That’s not my model! And trying to fit it in BUGS/JAGS is why I gave up on discrete sampling and embraced marginalization. What used to take 24 hours in BUGS takes about 10 minutes in Stan (and that was years ago; Stan has improved a lot since then).

Right. That won’t work. You need something with parameters for each of the tests and associated likelihoods for those observations (plus priors).

You can, but that’s inefficient compared to dealing with them independently because the multinomial requires you to compute all those probabilities.

It’s conditionally independent given the true disease status, which is either “missing data” or a “parameter” depending on your philosophical bent.
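To make the marginalization concrete, here is a minimal sketch (Python, not Stan) that sums the binary disease status out of the likelihood for one person's test results, working on the log scale. It assumes the tests are conditionally independent given the true status; the function names and numbers are hypothetical.

```python
import math

def log_sum_exp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def marginal_log_lik(results, prevalence, specificity, sensitivity):
    """Return (log P(results), P(diseased | results)) with the true
    disease status summed out rather than sampled."""
    lp_diseased = math.log(prevalence)      # log P(z = 1)
    lp_free = math.log1p(-prevalence)       # log P(z = 0)
    for y, g, d in zip(results, specificity, sensitivity):
        # Given z = 1, a positive occurs with probability sensitivity d.
        lp_diseased += math.log(d if y == 1 else (1.0 - d))
        # Given z = 0, a positive occurs with probability 1 - specificity g.
        lp_free += math.log((1.0 - g) if y == 1 else g)
    total = log_sum_exp(lp_diseased, lp_free)
    return total, math.exp(lp_diseased - total)

# Hypothetical values: both tests came back positive.
log_lik, post = marginal_log_lik((1, 1), 0.10, [0.95, 0.99], [0.80, 0.70])
```

The second return value shows that nothing is lost by marginalizing: the posterior probability of the discrete status can still be recovered from the two log terms.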

But it doesn’t have to be and often independence is a bad modeling assumption. Instead, similar tests will produce similar results. For instance, two PCR tests might have correlated results in the same way that an X-ray and MRI produce correlated results.

Assuming independence is also wrong in that some cases are more difficult than others. If someone is heavily infected, all tests are more likely to return positives. This goes beyond the 0/1 state because the 0/1 state is based on some arbitrary threshold. You see this in cancer diagnosis: the bigger the tumor, the easier it is to get a positive result on a test (image or biopsy or physical exam).
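A toy calculation shows how shared severity induces correlation. This is only an illustrative sketch, not the difficulty model from the case study: it assumes that, among diseased people, a hypothetical severity value is -1 or +1 with equal probability and shifts the logit sensitivity of both tests by the same amount.

```python
import math

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical setup: baseline logit sensitivities for two tests and a
# two-point severity distribution among the diseased.
base_logit = [1.0, 0.5]
severities = [(-1.0, 0.5), (1.0, 0.5)]   # (severity value, probability)

def joint_and_marginals(base_logit, severities):
    """P(both tests positive | diseased) and each test's marginal
    sensitivity, averaging over the latent severity."""
    marg = [0.0, 0.0]
    joint = 0.0
    for s, w in severities:
        d1 = inv_logit(base_logit[0] + s)
        d2 = inv_logit(base_logit[1] + s)
        marg[0] += w * d1
        marg[1] += w * d2
        joint += w * d1 * d2   # tests independent only given severity
    return joint, marg

joint, marg = joint_and_marginals(base_logit, severities)

# Positive excess means the tests agree more often than independence
# given disease status alone would predict.
excess = joint - marg[0] * marg[1]
```

Here the excess is positive because both sensitivities increase with severity, so a model that conditions only on the 0/1 status would understate how often the tests agree.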

I have a partly finished case study where I add a difficulty parameter to the Dawid and Skene model. But this is well known in the epidemiology literature if not in the ML literature. For instance, check out the classic Albert and Dodd paper, which has a nice survey of some of the approaches:

PhDemetri · June 30, 2020, 11:48pm

You can, but that’s inefficient compared to dealing with them independently because the multinomial requires you to compute all those probabilities.

Generally, yes. With two tests, I think I can get away with it.
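For reference, a sketch of what the multinomial version involves with two tests (Python, with made-up numbers and hypothetical function names): every one of the 2^K result patterns needs its own cell probability, which is only four cells for K = 2 but doubles with each added test. Conditional independence given disease status is assumed.

```python
import math
from itertools import product

def cell_probs(prevalence, specificity, sensitivity):
    """Probability of each pattern of test results with disease status
    summed out. There are 2**K cells for K tests."""
    probs = {}
    for pattern in product((0, 1), repeat=len(sensitivity)):
        p_diseased = prevalence
        p_free = 1.0 - prevalence
        for y, g, d in zip(pattern, specificity, sensitivity):
            p_diseased *= d if y == 1 else (1.0 - d)
            p_free *= (1.0 - g) if y == 1 else g
        probs[pattern] = p_diseased + p_free
    return probs

def multinomial_log_lik(counts, probs):
    """Log multinomial pmf of the observed pattern counts."""
    n = sum(counts.values())
    ll = math.lgamma(n + 1)
    for pattern, c in counts.items():
        ll += c * math.log(probs[pattern]) - math.lgamma(c + 1)
    return ll

# Hypothetical parameter values and cross-tabulated counts for two tests.
probs = cell_probs(0.10, [0.95, 0.99], [0.80, 0.70])
counts = {(0, 0): 86, (0, 1): 2, (1, 0): 6, (1, 1): 6}
log_lik = multinomial_log_lik(counts, probs)
```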

But it doesn’t have to be and often independence is a bad modeling assumption. Instead, similar tests will produce similar results. For instance, two PCR tests might have correlated results in the same way that an X-ray and MRI produce correlated results.

Excellent point. This will likely depend on the nature of the test. I’ll have to think more about it.
