Hello Stanimals,
Short summary: I have a model that predicts for some subset of my data (assumed to be log-normally distributed) that the expectation is practically zero. Those data are left-censored, and I’m modelling this censoring process. However, during prior predictive checks, Stan reports divergent iterations when the expectation is too close to zero, even when no data or censoring-related sampling statements are included in the code yet.
Elaboration: I am developing a non-linear model for data that I assume have log-normal measurement error. The model is supposed to generate an S-shaped regression line for which I’m using Stan’s Phi_approx()
as an approximation to the normal CDF. I use this because the mean and sd parameters have a specific biological meaning in the study context. Further, a subset of the data is left-censored, which I intend to explicitly model using either (1) the cumulative log normal distribution function or (2) through data augmentation by defining an additional parameter with an upper bound for every censored observation - yet to decide.
Now the issue I’m struggling with: while performing prior predictive checks, I ran into diverging iterations being reported (up to half of all samples!!!). I traced the culprit to a line of code where I calculate the natural logarithm of the output of Phi_approx()
, which will later on (after finishing prior predictive checks) be used as the mean parameter for the log-normal distribution. It seems that some trajectories result in log(...)
to underflow at some point, which is totally expected. I’m not too concerned about this however, as the underflow result of log(Phi_approx(...))
will be associated with left-censored data points (in which case the posterior density will be 1.0).
What I did not expect however, is Stan to report these underflows as diverging iterations, even when I’m not yet actually using the underflowed result of log(Phi_approx(...))
for anything. I checked that divergences weren’t caused by anything else; e.g. if I comment out the log()
statement and only sample from the priors and calculate Phi_approx(...)
, Stan purrs like a kitten. However, with log(Phi_approx(...))
in action, divergence alarm bells go off while the results returned by Stan are exactly the same (means, BCIs, Rhat = 1.000). This has me a little concerned that if I continue to use this model and ignore the warnings about divergences, I may miss out on more serious divergences as I build up the model to its full level of complexity (mixed-effects non-linear model).
Of course, I could hack my way out of this by doing log(Phi_approx(...) + c)
where c
is a small constant, but that doesn’t seem right. Or is this (or another type of hack) generally accepted for these situations?
Attached is a minimal working example with a fixed effects version of the model and prior predictive checks that reproduces the divergencies, along with R code to run it with rstan.
Best,
Luc
P.s.: I did already read and enjoy an earlier extensive post on divergencies Getting the location + gradients of divergences (not of iteration starting points) and the viz paper [1709.01449] Visualization in Bayesian workflow, but couldn’t find an answer there.
functions {
vector phi_approx(vector x,
real mu,
real sigma) {
/**
* @param x = random variates
* @param mu = mean of normal distribution
* @param sigma = standard deviation of normal distribution
* @return An approximation to the normal CDF
*/
return(Phi_approx((x - mu) / sigma));
}
}
data {
// Data
int<lower=0> N;
vector[N] time;
// Prior parameters
real mu_time_mu;
real<lower=0> mu_time_sd;
real<lower=0> sigma_time_mu;
real<lower=0> sigma_time_sd;
}
transformed data{
}
parameters {
// Mid-point of S-shaped curve (mean of approximate normal CDF)
real mu_time;
// Curvature of S-shaped curve (sd of approximate normal CDF)
real<lower=0> sigma_time;
}
transformed parameters {
}
model {
vector[N] prpc; // prior predictive check
vector[N] log_prpc; // log of prior predictive check
prpc = phi_approx(time, mu_time, sigma_time);
log_prpc = log(prpc); // Cause of divergent iterations being reported
// Prior distributions
mu_time ~ normal(mu_time_mu, mu_time_sd);
sigma_time ~ normal(sigma_time_mu, sigma_time_sd);
}
generated quantities {
vector[N] prpc;
vector[N] log_prpc;
prpc = phi_approx(time, mu_time, sigma_time);
log_prpc = log(prpc); // does not cause divergencies to be reported
}
toy_reproduce_divergencies.stan (1.4 KB) toy_reproduce_divergencies.R (1.5 KB)