Divergences during a prior predictive check: why and how can the transformed parameters affect them?

When running my model for a prior predictive check, I ended up with a substantial number of divergences (about 40%). However, when I removed the ‘transformed parameters’ section, the divergences disappeared. It seems that the way the parameters are linked in the expression in the ‘transformed parameters’ section induces the divergences. This is surprising to me because I naively thought that the sole purpose of the ‘transformed parameters’ section was to transform the parameters so that they can be used in the likelihood function. Since there is no likelihood here (it is a prior predictive check), I thought the section played no role.

Note that the model is a minimal reproducible example that contains expressions which make little sense mathematically (e.g., the “inv_logit(logit())” part).


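// load data objects
data {
  int N_x; // N_data/N_age
  array[N_x] real x;
}

// parameters sampled from their priors
parameters {
  real<lower=0, upper=1> nu;
  real<lower=0> delta;
}

transformed parameters {
  array[N_x] real theta;
  for (i in 1:N_x) {
    // logit() is only defined for arguments in (0, 1)
    theta[i] = inv_logit(logit(nu * x[i]^delta));
  }
}

// priors only: no likelihood, since this is a prior predictive check
model {
  nu ~ uniform(0, 1);
  delta ~ exponential(1);
}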

R script

x.RData (19.2 KB)

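library(cmdstanr)

wd <- getwd()
# wd <- paste0(getwd(), "/debugging/20240126_ppc_divergence/")
# save(x, file = paste0(wd, "x.RData"))
load("x.RData")  # provides x (assumes the attached file is in the working directory)
data_list <- list(x = x, N_x = length(x))
# mod2 is the compiled CmdStanModel for the Stan program above (compilation step not shown)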
fit <- mod2$sample(data = data_list,
                   chains = 4,
                   parallel_chains = 4,
                   seed = 1:4,
                   iter_warmup = 500,
                   iter_sampling = 300,
                   refresh = 40)

Here is a bivariate plot of the two parameters, with divergences in red.

That’s correct, but the section still runs even if its results aren’t used in the model block. And if the transform fails, the sampler rejects the sample, so you get a divergence. Your inv_logit(logit(...)) transform requires its input to be between 0 and 1. That holds whenever x < 1 (for positive x, nu * x^delta then stays below 1), but judging from the plot, the largest x in your data set is about 1.23.
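To make the point above concrete, here is a minimal sketch in base R. It does not use the attached data: x_max = 1.23 is just the approximate largest value read off the plot, and logit(), inv_logit(), frac_invalid and frac_exact are hypothetical names defined here for illustration. It shows that the round trip returns NaN once the argument exceeds 1, and estimates how much of the prior puts nu * x_max^delta above 1.

```r
# Hand-written versions of the two transforms (illustrative, base R only).
logit     <- function(p) log(p / (1 - p))
inv_logit <- function(u) 1 / (1 + exp(-u))

# The round trip is only defined for arguments strictly inside (0, 1).
ok  <- inv_logit(logit(0.9))                    # recovers 0.9
bad <- suppressWarnings(inv_logit(logit(1.1)))  # NaN: log of a negative number

# Draw from the priors used in the model block.
set.seed(1)
n_draws <- 1e5
nu    <- runif(n_draws, 0, 1)  # nu ~ uniform(0, 1)
delta <- rexp(n_draws, 1)      # delta ~ exponential(1)

# ASSUMPTION: 1.23 is the approximate largest x, taken from the reply above.
x_max <- 1.23

# Share of prior draws for which the argument of logit() exceeds 1.
frac_invalid <- mean(nu * x_max^delta > 1)

# Closed form under the same priors: log(x_max) / (1 + log(x_max)).
frac_exact <- log(x_max) / (1 + log(x_max))

print(c(simulated = frac_invalid, exact = frac_exact))
```

This fraction is the share of prior mass where the transform is undefined for the largest x. It should not be expected to match the reported 40% of divergent transitions exactly, since the sampler's behaviour near that boundary depends on more than the prior mass alone.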