I have written before on this question here and an presenting an update to that question after incorporating some feedback.

# Background

4157 subjects have been administered two diagnostic tests for the same outcome. A gold standard is currently not available. The results of the tests are as follows:

- 164 are positive on both tests
- 415 are negative for the first but positive for the second
- 268 are positive for the first but negative for the second
- 3310 are negative for both

The goal is to estimate the prevalence of the disease given these results and prior information on the sensitivity and specificity of each test.

Let the sensitivity and specificity for test i be denoted by \delta_i and \gamma_i respectively. My model is as follows:

\delta_1 \sim \mbox{Trunc-Normal}(\mu_{\delta_1}, \sigma_{\delta_1})

\delta_2 \sim \mbox{Trunc-Normal}(\mu_{\delta_2}, \sigma_{\delta_2})

\gamma_1 \sim \mbox{Trunc-Normal}(\mu_{\gamma_1}, \sigma_{\gamma_1})

\gamma_2 \sim \mbox{Trunc-Normal}(\mu_{\gamma_2}, \sigma_{\gamma_2})

\pi \sim \mbox{Beta}(1,1)

y\vert \pi\,, \delta_1 \,, \delta_2 \,, \gamma_1 \,, \gamma_2 \sim \mbox{Multinomial}(p_{s}; N)

Here, y is a vector with the counts above, and p_{s} is a vector of the multinomial probabilities. Here is my stan model

```
data{
int y[4]; // Data containing the table counts. Entries are (PP, NP, PN, NN)
int N; // Total number of obserbations. Equal to sum(y)
real p_mu; // Prior mean for prevalence. We place a beta prior on the prevalence.
real p_kappa; // Prior precision parameter
real mu_se_1; //prior mean for admin senitivity
real sd_se_1; //prior standard deviation admin sensitivity
real mu_se_2; //prior mean for survey sensitivity
real sd_se_2; //prior standard deviation for survey sensitivity
real mu_sp_1; //admin specificity
real sd_sp_1;
real mu_sp_2; //survey specificity
real sd_sp_2;
}
parameters{
real<lower=0, upper=1> p; //population prevalence
real<lower=0, upper=1> se_1; //admin sensitivity
real<lower=0, upper=1> se_2; //survey sensitivity
real<lower=0, upper=1> sp_1; //admin specifitiy
real<lower=0, upper=1> sp_2; //survey specificity
}
transformed parameters{
// parameter for proportion of sample falling in each cell of the 2x2 table
// The two by two table is flattened so the cells are c(PP, NP, PN, NN).
// Here, the first letter indicates the result of the admin test and the second the result of the survey.
// The algebra to determine how the sensitivities and specificities are combined for the multinomial likelihood
// is as follows. Using the result NP as an example...
//
// P(NP) = P(NP|D+)P(D+) + P(NP|D-)P(D-)
//
// Note that P(D+) = pi is the prevalence. Assuming the test results are independent
//
// = P(N|D+)P(P|D+)P(D+) + P(N|D-)P(P|D-)P(D-) = (1-se_1) x se_2 x pi + sp_1 x (1-sp_2) x (1-pi)
//
// Similar computations can be performed for the remaining three cells.
//
// The code is vectorized in order to do all the computations simultanously.
// Thus, the mulitnomial probability for the second cell is the SE1[2]*SE2[2]*pi + SP1[2]*SP2[2]*(1-pi)
vector[4] SE1 = to_vector({se_1, 1-se_1, se_1, 1-se_1});
vector[4] SE2 = to_vector({se_2, se_2, 1-se_2, 1-se_2});
vector[4] SP1 = to_vector({1-sp_1, sp_1, 1-sp_1, sp_1});
vector[4] SP2 = to_vector({1-sp_2, 1-sp_2, sp_2, sp_2});
simplex[4] p_sample = p * SE1 .* SE2 + (1-p) * SP1 .* SP2;
}
model{
// Model likelihood
y ~ multinomial(p_sample);
// Model priors
p ~ beta_proportion(p_mu,p_kappa);
se_1 ~ normal(mu_se_1, sd_se_1);
se_2 ~ normal(mu_se_2, sd_se_2);
sp_1 ~ normal(mu_sp_1, sd_sp_1);
sp_2 ~ normal(mu_sp_2, sd_sp_2);
}
generated quantities{
// Posterior predictive distribution
// Take the learned multinomial sample probabilities
// Draw from the multinomial distribution
// 2x2 tables which are probable under the model.
int y_ppc[4] = multinomial_rng(p_sample, N);
real log_lik = multinomial_lpmf(y|p_sample);
}
```

# Problem

I’ve been simulating some fake data and fitting my model on the fake data to try and recover the prevalence. When I do this, my credible intervals do not contain the true prevalence nearly as frequently as they should (80% credible intervals contain the prevalence about 45% of the time). You can run my experiments by running this script.

# Question

The failure to recover parameters may point to a potential problem in my model. I’m fairly confident that the `p_sample`

in my Stan code is correct because I’ve followed this paper in deriving the probabilities, so I’m wondering if my approach to modelling is somehow flawed. Am I missing something, or might there be another explanation for why my model can’t recover the true prevalence?

You can run my model with this script.