Warning messages when running a simple model to estimate adjusted prevalence

Hi,

I have encountered some problems when running a simple model to estimate adjusted prevalence. I was wondering whether the experts in this forum could advise me on how to resolve the warning messages, or make other suggestions.

The simple model is set as follows:

  1. the number of patients with a positive test result (n) follows a binomial distribution with probability theta among the N patients in total
  2. the adjusted prevalence (p) is related to theta by the formula “theta = p*sensitivity + (1 - p)*(1 - specificity)”
  3. p has a Beta(1, 1) prior distribution
  4. sensitivity has a Beta prior whose shape parameters are set so that its mean equals the value derived from the research data and its SD is small (around 0.05)
  5. specificity has a Beta prior whose shape parameters are set the same way, with an SD of 0.005 (a moment-matching sketch for deriving these shape parameters follows this list)

Given a dataset with N = (xxx, xxx) and n = (xxx), I implemented the “rstan” code provided at the end. However, I got the warning messages shown below.

Do you have any advice on how to resolve these warning messages? Should I try other prior distributions for sensitivity and specificity (such as a truncated normal distribution) instead?
I look forward to your replies; thank you very much.

<< Warning messages: >>

Warning: There were 5676 divergent transitions after warmup. 
See https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup to find out why this is a problem and how to eliminate them. 
Warning: Examine the pairs() plot to diagnose sampling problems 
Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable. Running the chains for more iterations may help. 
See https://mc-stan.org/misc/warnings.html#bulk-ess 
Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable. Running the chains for more iterations may help. 
See https://mc-stan.org/misc/warnings.html#tail-ess

<< rstan code: >>

prev <- data.frame(Total = 1xx,xxx, Case = 2xx)
model <- 'data {
                  int N;       
                  int n; }
            parameters{
                       real <lower=0, upper=1>  p;       
                       real <lower=0, upper=1> sens;     
                       real <lower=0, upper=1> spec; }
            transformed parameters {
                                    real <lower=0, upper=1> theta;
                                    theta = p * sens + (1 - p) * (1 - spec); }
            model {
                  sens ~ beta(xx, 51);
                  spec ~ beta(xx, 0.1);                  
                  target += uniform_lpdf(p | 0, 1); 
                  target += binomial_lpmf(n | N, theta); }'
Data <- list(
                       N = prev$Total,
                       n = prev$Case)
fit <- stan(model_code = model,
             data = Data,
             chains = 4,
             warmup = 3000,
             iter = 8000,
             cores = 1,
             refresh = 0,
             seed = 12345)
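
For reference, the diagnostics that the warnings point to can be inspected directly on the fit object (a sketch using rstan’s built-in helpers):

# Summarize divergences, treedepth, and E-BFMI, then draw the pairs plot
# that the warning message recommends examining
check_hmc_diagnostics(fit)
pairs(fit, pars = c("p", "sens", "spec", "theta"))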

I’d strongly recommend putting your Stan code in its own file so that you get error messages with line numbers and syntax highlighting. Here’s a cleanly indented and slightly rewritten version:

data {
  int<lower=0> N;
  int<lower=0, upper=N> n;
}
parameters {
  real<lower=0, upper=1> prev, sens, spec;
}
transformed parameters {
  real<lower=0, upper=1> theta
    = prev * sens + (1 - prev) * (1 - spec);
}
model {
  sens ~ beta(20, 2);
  spec ~ beta(10, 2);
  prev ~ beta(1, 10);
  n ~ binomial(N, theta);
}
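
If the model is saved to its own file (say foo.stan, the name used in the run below), the rstan call can simply point at it. A sketch with the same toy data as below:

library(rstan)
# Compiling from a file gives error messages with line numbers
fit <- stan(file = "foo.stan",
            data = list(N = 1000, n = 100),
            chains = 4, iter = 2000, seed = 12345)
print(fit, pars = c("prev", "sens", "spec", "theta"))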

There are a couple divergent transitions, but I wouldn’t worry about that too much. The effective sample size looks good. You’ll see that you can’t identify the parameters well with this kind of data—a lot is being driven by the prior here.

The model is technically non-identifiable: replacing p with 1 - p, sensitivity with 1 - specificity, and specificity with 1 - sensitivity leaves theta, and hence the likelihood, unchanged.
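
You can check that reflection numerically (plain R; the values are arbitrary):

# (p, sens, spec) -> (1 - p, 1 - spec, 1 - sens) leaves theta unchanged
theta <- function(p, sens, spec) p * sens + (1 - p) * (1 - spec)
theta(0.1, 0.90, 0.95)  # 0.135
theta(0.9, 0.05, 0.10)  # 0.135 as well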

>>> import cmdstanpy as csp
>>> m = csp.CmdStanModel(stan_file='foo.stan')
>>> f = m.sample(data={'N':1000, 'n': 100})
...
	Chain 4 had 2 divergent transitions (0.2%)
...

>>> f.summary(sig_figs=2)
          Mean     MCSE  StdDev        5%      50%      95%   N_Eff  N_Eff/s  R_hat
lp__  -340.000  0.03900  1.3000 -350.0000 -340.000 -340.000  1100.0   7900.0    1.0
prev     0.037  0.00076  0.0260    0.0029    0.033    0.087  1200.0   8700.0    1.0
sens     0.910  0.00140  0.0600    0.7900    0.920    0.980  1700.0  12000.0    1.0
spec     0.930  0.00070  0.0240    0.8900    0.930    0.970  1200.0   8500.0    1.0
theta    0.100  0.00015  0.0095    0.0870    0.100    0.120  4100.0  30000.0    1.0

As the data size goes up, you wind up with more divergent transitions because the problem gets stiffer. This is for N = 10,000, n = 1000:

14:36:35 - cmdstanpy - WARNING - Some chains may have failed to converge.
	Chain 3 had 13 divergent transitions (1.3%)
	Chain 4 had 25 divergent transitions (2.5%)

You might be able to reparameterize the probabilities on the log or log odds scale, but it’s a challenge given the way they are combined.
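
On the log-odds scale the same quantity looks like this (plain R; qlogis and plogis are R’s logit and inverse logit, and the numeric values are arbitrary):

# theta written in terms of unconstrained log-odds parameters; a
# reparameterized Stan model would declare these as unbounded reals
theta_from_logits <- function(lp, lsens, lspec) {
  p <- plogis(lp); sens <- plogis(lsens); spec <- plogis(lspec)
  p * sens + (1 - p) * (1 - spec)
}
theta_from_logits(qlogis(0.05), qlogis(0.90), qlogis(0.995))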

P.S. @andrewgelman and I fit exactly that link function in this paper, which includes our Stan code:

Gelman, A. and Carpenter, B. (2020). Bayesian analysis of tests with unknown specificity and sensitivity. Journal of the Royal Statistical Society: Series C (Applied Statistics), 69(5), 1269–1283.