I am fitting a simple mixture model to interval-censored data. To do so I created a user-defined pmf based on a normal distribution, which maps the continuous normal distribution onto the discrete scale of my data.
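For context, the mapping from a continuous value to its discrete interval index can be sketched in Python (the break values and function name here are just illustrative, not part of my actual pipeline):

```python
import numpy as np

# Hypothetical interval breaks; y is the number of breaks at or below the
# value, so y ranges over 0..len(breaks): 0 means "below the first break",
# len(breaks) means "above the last break".
breaks = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

def to_interval_index(x, breaks):
    """Map a continuous observation x to its discrete interval index."""
    return int(np.searchsorted(breaks, x, side="right"))

print(to_interval_index(0.5, breaks))  # 0: below the first break
print(to_interval_index(2.5, breaks))  # 2: between breaks 2 and 3 (1-indexed)
print(to_interval_index(7.0, breaks))  # 5: above the last break
```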
```stan
functions {
  // Log probability mass function for an interval-censored normal.
  // Defined for integers y between 0 and rows(breaks):
  //   0 < y < rows(breaks): normal probability between breaks[y] and breaks[y+1]
  //   y == 0:               normal probability between -inf and breaks[1]
  //   y == rows(breaks):    normal probability between the last break and +inf
  real intervalCensoredNormal_lpmf(int y, real mu, real sigma, vector breaks) {
    real p;
    if (y == 0) { // from -inf to the first break
      p = normal_lcdf(breaks[1] | mu, sigma);
    } else if (y == rows(breaks)) { // from the last break to +inf
      p = normal_lccdf(breaks[rows(breaks)] | mu, sigma);
    } else { // between two breaks
      p = log_diff_exp(normal_lcdf(breaks[y + 1] | mu, sigma),
                       normal_lcdf(breaks[y] | mu, sigma));
    }
    return p;
  }
}
data {
  int<lower = 0> N;
  int y[N];                 // an array with N elements
  int<lower = 0> K;
  vector[K] intervalBreaks; // a vector with K elements
}
parameters {
  ordered[2] mu; // ordered so the chains do not label-switch between the two components
  real<lower = 0> sigma[2];
  real<lower = 0, upper = 1> theta;
}
model {
  mu[1] ~ normal(4, 4);
  mu[2] ~ normal(8, 4);
  sigma ~ normal(0, 5);
  theta ~ beta(1.5, 1.5);
  for (n in 1:N)
    target += log_mix(theta,
                      intervalCensoredNormal_lpmf(y[n] | mu[1], sigma[1], intervalBreaks),
                      intervalCensoredNormal_lpmf(y[n] | mu[2], sigma[2], intervalBreaks));
}
```
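As a sanity check on the pmf (not part of the model itself), here is a Python mirror of the Stan function; summed over y = 0..K it should give exactly 1, since the interval probabilities telescope. The break values and parameters below are arbitrary:

```python
import numpy as np
from scipy.stats import norm

def interval_censored_normal_logpmf(y, mu, sigma, breaks):
    """Python mirror of the Stan intervalCensoredNormal_lpmf (breaks 0-indexed here)."""
    K = len(breaks)
    if y == 0:                    # mass below the first break
        return norm.logcdf(breaks[0], mu, sigma)
    elif y == K:                  # mass above the last break
        return norm.logsf(breaks[-1], mu, sigma)
    else:                         # mass between breaks y-1 and y (0-indexed)
        return np.log(norm.cdf(breaks[y], mu, sigma)
                      - norm.cdf(breaks[y - 1], mu, sigma))

breaks = np.linspace(1, 10, 10)   # arbitrary breaks for the check
total = sum(np.exp(interval_censored_normal_logpmf(y, 4.0, 2.0, breaks))
            for y in range(len(breaks) + 1))
print(total)  # ~1.0
```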
A figure of my data is attached. Note that the first and last bars of the histogram comprise values anywhere between negative infinity and the first break (and between the last break and positive infinity, respectively), which the pmf accounts for; that is why the first bar is so high.
Now the problem:
Everything runs smoothly, but I get over 100 divergences even with `adapt_delta` set to 0.9 or 0.99. Initially I wasn't worried, since the divergences are fairly evenly spread when plotted with `pairs` from rstan (figure attached), so I didn't expect them to introduce major bias. However, in the same kind of plot in shinystan (attached), all divergences cluster around the smaller mode, and I am afraid this mode ends up under-represented because of them.
- Any ideas why these divergences arise and whether I can get rid of them?
- If not, should I be worried that the fit is biased, given the divergences shown in shinystan?
Thanks for your advice.
density.pdf (38.2 KB)
pairs.pdf (156.1 KB)
shinstan-bivariate(1).pdf (257.5 KB)