Using truncated negative binom. - large time increase ~100x

I had already tried and for so much data I get divergences even without truncation (dunno why).

data{
   int<lower=0> X;
   int x[X];
 }
 
 parameters {
   real mu;
   real<lower=0> sigma;
   vector<offset=mu,multiplier=sigma>[X] log_lambda;
 }
 
 model {
   sigma ~ student_t(7, 0, 2);
   mu ~ student_t(7, 0, 1);
   log_lambda ~ normal(mu, sigma);
   x ~ poisson_log( log_lambda);
 }
Warning messages:
1: There were 2000 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup 
2: Examine the pairs() plot to diagnose sampling problems

I though I might share my final solution. Considering

  • the overhead of calculating truncated negative binomial, and
  • the specific application to my algorithm (truncating the data point on 0.95 CI of the posterior check, and rerunning the model)

I did a little test to see if I could compensate with an approximation. It seems that (almost) irrespectively of the parameters mu and sigma of neg_binomial_2_log the inferred sigma is smaller than the nominal one by ~0.6 folds; that I will add to the generated quantities for posterior check as compensation.

That is, I deleted the data outside 0.95 CI and run a naive NB optimisation, and see how much smaller the overdispersion (1/exp(sigma)) was compared to the nominal one.

The code


library(tidyverse)
library(foreach)
library(doParallel)
library(rstan)
doParallel::registerDoParallel()



mo <- stan_model(model_code = 'data{int X; int x[X];} parameters {real mu; real sigma;} model {x ~ neg_binomial_2_log(mu, 1/exp(sigma)); mu ~ normal(0,3); sigma ~ normal(0,3); }' )

shrinkage =
	foreach(mu = seq(10, 1000, 10), .combine = bind_rows) %dopar%
	{
		foreach(sigma = seq(1, 100, 1), .combine = bind_rows) %do%	{

			x =
				rnbinom(1000, mu = mu, size = sigma) %>%
				enframe %>%
				filter(value <= qnbinom(0.975, mu=mu, size=sigma)) %>%
				filter(value >= qnbinom(0.025, mu=mu, size=sigma)) %>%
				pull(value)

			X = x %>% length

			tibble(
				mu = !!mu,
				sigma = !!sigma,
				shrink = optimizing(mo, data=list(X = X, x = x), hessian = TRUE) %>% { sigma / (1 / exp((.)$par[2]))	}
			)

		}
	}

shrinkage %>% ggplot(aes(mu, sigma)) + geom_raster(aes(fill = shrink))