In my data, each subject i comes from one of two distributions.

With probability p_normal[i], the likelihood of subject i's value Y_obs[i] is calculated from a normal distribution (the expected value comes from the regression Y = constant + coeff * covariate, and the standard deviation is fitted).

With probability 1 - p_normal[i], the likelihood is calculated from a geometric distribution (most of these individuals have Y_obs = 0, and the p of the geometric distribution is fitted).

p_normal is known for each subject.
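To make the mixture concrete, here is a small Python sketch (numpy assumed; the function name and signature are mine, not from the model) of the per-observation likelihood described above, before any discretization:

```python
import numpy as np

def subject_log_lik(y, p_normal_i, mu_i, sigma, p_geo):
    """Log-likelihood of one observation under the two-component mixture:
    log( p_normal_i * Normal(y | mu_i, sigma) + (1 - p_normal_i) * Geom(y | p_geo) ),
    computed in log space with logaddexp for stability."""
    # continuous normal log-density (discretization is handled later in the post)
    log_norm = -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((y - mu_i) / sigma) ** 2
    # geometric pmf on {0, 1, 2, ...}: p * (1 - p)^y
    log_geom = y * np.log1p(-p_geo) + np.log(p_geo)
    return np.logaddexp(np.log(p_normal_i) + log_norm,
                        np.log1p(-p_normal_i) + log_geom)
```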

I simulated latent Y (Y_latent) with the parameters below; its histogram is shown below.

```
n_subj <- 200
n_obs <- 200                     # one observation per subject for now
n_cov <- 4                       # number of covariates
coeff <- c(1.2, 1.5, -0.9, 0.8)  # coefficients
constant <- 4
```

Then, to simulate observed Y, errors are applied (sd of the normal distribution = 0.4, p of the geometric distribution = 0.97), the values are discretized with floor(Y_latent), and the result is truncated at 0. The histogram of Y_obs is shown below.
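The data-generating process above can be sketched in Python as follows (the original simulation was presumably in R; the covariate and p_normal draws here are my assumptions, since the post does not specify them):

```python
import numpy as np

rng = np.random.default_rng(1)

n_subj, n_cov = 200, 4
coeff = np.array([1.2, 1.5, -0.9, 0.8])
constant = 4.0
sigma, p_geo = 0.4, 0.97

# covariates and mixture weights: standard normal / uniform draws (assumptions)
X = rng.normal(size=(n_subj, n_cov))
p_normal = rng.uniform(size=n_subj)            # known weight, one per subject
is_normal = rng.uniform(size=n_subj) < p_normal

Y_latent = np.where(
    is_normal,
    constant + X @ coeff + rng.normal(0.0, sigma, n_subj),  # regression + normal error
    rng.geometric(p_geo, n_subj) - 1,  # numpy counts trials; subtract 1 for support {0,1,...}
)

# discretize and truncate at zero, as described in the post
Y_obs = np.maximum(np.floor(Y_latent), 0).astype(int)
```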

To calculate the likelihood from the discretized normal distribution, I use log_diff_exp(normal_lcdf(Y_obs + 1 | Y_expected, sigma), normal_lcdf(Y_obs | Y_expected, sigma)).

Also, to extend the usable range of (y - mu)/sigma from 8.25 out to 37.5, I use log1m(normal_cdf(-Y_obs | -Y_expected, sigma)) when (Y_obs - Y_expected) is positive.
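As a numerical sanity check of this trick outside Stan (scipy assumed; helper names are mine), the direct and mirrored forms of the interval probability agree where both are stable, and the mirrored form only ever evaluates the log-CDF at negative z, which is why it survives far into the upper tail:

```python
import numpy as np
from scipy.stats import norm

def log_diff_exp(a, b):
    """log(exp(a) - exp(b)) for a > b, as in Stan's log_diff_exp."""
    return a + np.log(-np.expm1(b - a))

def log_p_interval_direct(y, mu, sigma):
    # log( Phi(y+1 | mu, sigma) - Phi(y | mu, sigma) ) straight from the lower-tail log-CDF
    return log_diff_exp(norm.logcdf(y + 1, mu, sigma), norm.logcdf(y, mu, sigma))

def log_p_interval_mirrored(y, mu, sigma):
    # same interval probability via the mirrored (upper) tail:
    # Phi(y+1 | mu) - Phi(y | mu) = Phi(-y | -mu) - Phi(-y-1 | -mu)
    return log_diff_exp(norm.logcdf(-y, -mu, sigma), norm.logcdf(-y - 1, -mu, sigma))
```

In Stan, normal_lcdf saturates to 0 above z ≈ 8.25, so the direct form degenerates there; the mirrored form keeps its arguments in the lower tail, which works down to z ≈ -37.5.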

```stan
functions {
  vector geometric_lp(int n, int[] x, real p) {
    vector[n] lp;
    for (i in 1:n) {
      lp[i] = x[i] * log1m(p) + log(p);
    }
    return lp;
  }

  real log_normal_truncated(int Y_obs, real Y_reg, real sigma) {
    real lp;
    if (Y_obs - Y_reg > 0) {
      // complementary form log(1 - Phi(-Y_obs | -Y_reg, sigma)) avoids
      // normal_lcdf saturating at 0 in the upper tail
      lp = log_diff_exp(0, normal_lcdf(-Y_obs | -Y_reg, sigma));
    } else {
      lp = normal_lcdf(Y_obs | Y_reg, sigma);
    }
    return lp;
  }

  real log_normal_discretized(int Y_obs, real Y_reg, real sigma) {
    real lp;
    if (Y_obs - Y_reg > 1) {
      // mirror the interval into the lower tail for numerical stability
      lp = log_diff_exp(normal_lcdf(-Y_obs | -Y_reg, sigma),
                        normal_lcdf(-Y_obs - 1 | -Y_reg, sigma));
    } else {
      lp = log_diff_exp(normal_lcdf(Y_obs + 1 | Y_reg, sigma),
                        normal_lcdf(Y_obs | Y_reg, sigma));
    }
    return lp;
  }

  vector discretized_normal_lp(int n, int[] Y_obs, vector Y_reg, real sigma) {
    vector[n] lp;
    for (i in 1:n) {
      if (Y_obs[i] == 0) {
        // floor() plus truncation at 0 means all latent values below 1 land here
        lp[i] = log_normal_truncated(1, Y_reg[i], sigma);
      } else {
        lp[i] = log_normal_discretized(Y_obs[i], Y_reg[i], sigma);
      }
    }
    return lp;
  }
}
data {
  int n_subj;
  int n_obs;
  int subj_obs[n_obs];        // indices mapping observations to subjects
  int Y_obs[n_obs];           // observations
  int subj_start_ind[n_subj]; // first observation index of each subject
  int subj_end_ind[n_subj];   // last observation index of each subject
  int n_cov;                  // number of covariates
  matrix[n_subj, n_cov] cov;  // covariates
  vector[n_subj] p_normal;    // probability that a subject is from the normal component
}
parameters {
  vector[n_cov] coeff;
  real intercept;
  real<lower=0> sigma;
  real<lower=0.8, upper=1> p_geo;
}
transformed parameters {
  vector[n_obs] Y_reg = rep_vector(intercept, n_obs) + cov[subj_obs] * coeff;
}
model {
  vector[n_obs] lp_zero;
  vector[n_obs] lp_reg;

  intercept ~ normal(0, 5);
  coeff ~ normal(0, 5);
  sigma ~ normal(0, 5);
  p_geo ~ uniform(0.8, 1.0);

  lp_zero = geometric_lp(n_obs, Y_obs, p_geo);
  lp_reg = discretized_normal_lp(n_obs, Y_obs, Y_reg, sigma);

  for (i in 1:n_subj) {
    target += log_mix(p_normal[i],
                      sum(lp_reg[subj_start_ind[i]:subj_end_ind[i]]),
                      sum(lp_zero[subj_start_ind[i]:subj_end_ind[i]]));
  }
}
```
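The per-subject marginalization in the model block corresponds to the following numpy sketch (indices are 0-based and end-exclusive here, unlike Stan's 1-based inclusive slices; the values are hypothetical):

```python
import numpy as np

def log_mix(theta, lp1, lp2):
    """Stan's log_mix: log( theta * exp(lp1) + (1 - theta) * exp(lp2) )."""
    return np.logaddexp(np.log(theta) + lp1, np.log1p(-theta) + lp2)

# hypothetical per-observation log-likelihoods for two subjects (3 and 2 observations)
lp_reg = np.array([-1.0, -2.0, -0.5, -3.0, -1.5])   # discretized-normal component
lp_zero = np.array([-0.1, -4.0, -2.5, -0.2, -0.3])  # geometric component
start = [0, 3]                  # 0-based first index per subject
end = [3, 5]                    # one past the last index per subject
p_normal = np.array([0.9, 0.4])

# each subject's observations share one component, so sum within subject
# before mixing -- exactly what the loop over n_subj does in the model block
target = sum(
    log_mix(p_normal[i], lp_reg[start[i]:end[i]].sum(), lp_zero[start[i]:end[i]].sum())
    for i in range(2)
)
```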

After warm-up, I get 87 divergent transitions. I believe these are false positives caused by the numerical limits of normal_lcdf.

I checked this many times by varying the range of (y - mu)/sigma.

Even though my actual data rarely produces divergent transitions (because it has a larger standard deviation), I was going to implement my own normal_lcdf using Boost's erf for stability, and then found that Stan already uses it.

First, I would like to know why the Stan developers had to restrict the range to between -37.5 and 8.25.

I would also appreciate any other suggestions (besides increasing adapt_delta).