I want to remove the following issues from the following model ( I give a minimal code in which the following issue will appear).

the divergent transitions ,

get convergence chains for any seed. ( namely in some seed, the nonconvergent chains appear or in very long chains, the sub chain is not converged in the sense of the R hat criterion)
I welcome any questions. Because I made a very brief explanation without any important theory, I think the following explanation is difficult to read. Also my English is poor. I am the author of the package BayesianFROC in which the following model is implemented.
I made a model to compare imaging modaity (MRI, CT, PET,etc) in Radiological context.
For example, this model tells us that the radiologist can find more many lesions in images taken by MRI than CT.
Trial to get data consisting of TPs and FPs.

Suppose that there are N_I radiographs (images), in each of which there are several shadows caused by lesions or noises. We assume that there are N_L lesions, one image may contain no image and one image may contain one more lesions. Each images are taken by MRI, CT, or PET, etc. Now assume that there are M modalities.

Suppose that there are R readers. The readers try to find lesions in each image.
For each image, if a reader thinks there are lesions, then the reader localizes his suspicious locations with his confidence levels. For example, if reader localizes 3 positions in a single image, then reader also assigns to each suspicious location a confidence level which is an integer such as 5= â€śdefinitely presentâ€ť, 4=â€śprobably presentâ€ť, 3=â€¦, 1=â€śquestionableâ€ť. 
After readerâ€™s lesion finding task, some data analyst counts the number of true positive (hits) and the number of false positives (false alarms) .
So, we will get the following dataset
Data
Two readers and two modalities and three kind of confidence levels.
Confidence Level  Modality ID  Reader ID  Number of Hits  Number of False alarms 

3 = definitely present  1  1  H_{3,1,1}  F_{3,1,1} 
2 = equivocal  1  1  H_{2,1,1}  F_{2,1,1} 
1 = questionable  1  1  H_{1,1,1}  F_{1,1,1} 
3 = definitely present  1  2  H_{3,1,2}  F_{3,1,2} 
2 = equivocal  1  2  H_{2,1,2}  F_{2,1,2} 
1 = questionable  1  2  H_{1,1,2}  F_{1,1,2} 
3 = definitely present  2  1  H_{3,2,1}  F_{3,2,1} 
2 = equivocal  2  1  H_{2,2,1}  F_{2,2,1} 
1 = questionable  2  1  H_{1,2,1}  F_{1,2,1} 
3 = definitely present  2  2  H_{3,2,2}  F_{3,2,2} 
2 = equivocal  2  2  H_{2,2,2}  F_{2,2,2} 
1 = questionable  2  2  H_{1,2,2}  F_{1,2,2} 
To give the probability law of the two kinds of random variables, I made a model as follows.
Model
Where the model parameter is \theta = (\theta_c,\mu_{m,r},\sigma_{m,r},A_m) and 1\leq c \leq C, 1\leq m \leq M, 1\leq r \leq R, and \Phi denote the cumulative distribution functions of the canonical Gaussian. Note that \theta_{C+1} = \infty and \mathcal{H}_{c,m,r} := \{ H_{c+1,m,r},H_{c+2,m,r},\cdots,H_{C,m,r}\} and
Set d\theta_c = \theta_{c+1}\theta_{c}.
We use the following priors:
Implementation
# Make a dataset
# modality ID
m <c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,3,3,3,3
,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5
,5,5,5,5,5,5,5,5,5,5,5,5)
# reader ID
q <c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,1,1,1,1
,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,1,1,1,1,1,2,2,2
,2,2,3,3,3,3,3,4,4,4,4,4)
# confidence level
c<c(5,4,3,2,1,5,4,3,2,1,5,4,3,2,1,5,4,3,2,1,5,4,3,2,1,5,4,3,2,1,5,4,3,2,1,5,4,3,2,1,5,4,3,2
,1,5,4,3,2,1,5,4,3,2,1,5,4,3,2,1,5,4,3,2,1,5,4,3,2,1,5,4,3,2,1,5,4,3,2,1,5,4,3,2,1,5,4,3
,2,1,5,4,3,2,1,5,4,3,2,1)
# FP ( false alarm)
f<c(
0,4,20,29,21,0,0,6,15,22,1,15,18,31,19,1,2,4,16,17,1,1,21,24,23,1,1,5,30
,40,2,19,31,56,42,2,0,2,30,32,1,7,13,28,19,0,1,7,7,31,7,15,28,41,9,0,2,5
,24,31,1,4,18,21,23,1,1,0,11,35,6,14,37,36,18,0,2,4,18,25,0,2,19,23,18,0,2
,6,10,30,2,25,40,29,24,1,1,4,24,32
)
# TP (hit)
h<c(
50,30,11,5,1,15,29,29,1,0,39,31,8,10,3,10,8,25,45,14,52,25,13,4,1,27,28,29,1
,0,53,29,13,2,4,9,16,22,43,14,43,29,11,6,0,18,29,21,0,0,43,29,6,7,1,10,14,19
,32,23,61,19,12,9,3,16,29,34,1,0,52,29,10,4,3,10,16,23,43,15,35,29,18,9,0,17,27
,24,0,0,34,33,7,13,2,12,16,21,35,15
)
C<5 # the number of confidence levels
M<5 # the number of modalities
Q<4 # the number of readers
NI<199 # the number of images
NL<142 # the number of lesions
# the length of the dataset
N <C*M*Q
# make an array format hits data
ff < numeric(N) #Initialization of Cumulative False alarm
harray<array(0,dim=c(C,M,Q));
for(md in 1:M) {
for(cd in 1:C) {
for(qd in 1 : Q){
for(n in 1:cd){
ff[cd+(md1)*C*Q+(qd1)*C]<ff[cd+(md1)*C*Q+(qd1)*C]+f[n+(md1)*C*Q+(qd1)*C]
}
harray[cd,md,qd] < h[cd+(md1)*C*Q+(qd1)*C]
}}}
# make a data to be passed to sampling()
data < list(N=N,Q=Q, M=M,m=m ,C=C , NL=NL,NI=NI
,c=c,q=q,
h=h, f=f,
ff=ff,
harray=harray
)
# Make a Stan model
Stan.model < rstan::stan_model( model_code="
data{
int <lower=0>N;
int <lower=0>M;
int <lower=0>C;
int <lower=0>Q;
int <lower=0>h[N];
int <lower=0>f[N];
int <lower=0>q[N];
int <lower=0>c[N];
int <lower=0>m[N];
int <lower=0>NL;
int <lower=0>NI;
int <lower=0>ff[N];
int <lower=0>harray[C,M,Q];
}
transformed data {
int <lower=0> NX;
NX = NI;
}
parameters{
real w;
real <lower =0 > dz[C1];
real mu[M,Q];
real <lower=0> v[M,Q];
real <lower=0> hyper_v[Q];
real <lower=0,upper=1>A[M];
}
transformed parameters {
real <lower =0> dl[C];
real <lower=0,upper=1> ppp[C,M,Q];
real <lower =0> l[C];
real z[C];
real aa[M,Q];
real <lower =0> bb[M,Q];
real <lower=0,upper=1> AA[M,Q];
real deno[C1,M,Q];
real hit_rate[C,M,Q];
z[1]=w;
for(md in 1 : M) {
for(qd in 1 : Q) {
aa[md,qd]=mu[md,qd]/v[md,qd];
bb[md,qd]=1/v[md,qd];
for(cd in 1 : C1) z[cd+1] = z[cd] + dz[cd];
ppp[C,md,qd] = 1 Phi((z[C] mu[md,qd])/v[md,qd]);
for(cd in 1 : C1) ppp[cd,md,qd] = Phi((z[cd+1] mu[md,qd])/v[md,qd])  Phi((z[cd ] mu[md,qd])/v[md,qd]);
for(cd in 1 : C) l[cd] = (1)*log(Phi(z[cd]));
dl[C] = fabs(l[C]0);
for(cd in 1:C1) dl[cd]= fabs(l[cd]l[cd+1]);
}
}
for(md in 1 : M) {
for(qd in 1 : Q) {
AA[md,qd]=Phi( (mu[md,qd]/v[md,qd])/sqrt((1/v[md,qd])^2+1) );//Measures of modality performance
}}
for(md in 1 : M) {
for(qd in 1 : Q) {
deno[C1,md,qd]=1ppp[C,md,qd];
for(cd in 3:C){ deno[c[cd],md,qd]=deno[c[cd1],md,qd]ppp[c[cd1],md,qd]; }
}}
for(md in 1 : M) {
for(qd in 1 : Q) {
for(cd in 1:C1){
hit_rate[cd,md,qd]=ppp[cd,md,qd]/deno[cd,md,qd];
}
hit_rate[C,md,qd]=ppp[C,md,qd];
}}
}
model{
int s=0;
for(qd in 1 : Q) {
for(md in 1 : M) {
target += normal_lpdf( AA[md,qd]A[md],hyper_v[qd]);
} }
for(n in 1:N) {
target += poisson_lpmf(ff[n]l[c[n]]*NX);
}
for(qd in 1 : Q) {
for(md in 1 : M) {
s=0;
for(cd in 1 : C){
target += binomial_lpmf(harray[cd,md,qd]  NLs, hit_rate[c[cd],md,qd] );
s = s + harray[cd,md,qd]; }
}}
w ~ uniform(3,3);
for(cd in 1:C1) dz[cd] ~ uniform(0.001,7);
for(md in 1 : M) { for(qd in 1 : Q) {
mu[md,qd] ~ uniform(11,11);
v[md,qd] ~ uniform(0.01,11);
}}
}
")
# Fit a model to data
fit < rstan::sampling(
object= Stan.model, data=data, verbose=F,
seed=1234567, chains=1, warmup=111, iter=1111
, control = list(adapt_delta = 0.9999999,
max_treedepth = 15)
# ,init = initial
)
rstan::traceplot(fit,pars=c("w"))
rstan::check_hmc_diagnostics(fit)
# MCMC fails if the seed is changed.
fit < rstan::sampling(
object= Stan.model, data=data, verbose=F,
seed=1, chains=1, warmup=111, iter=122
, control = list(adapt_delta = 0.9999999,
max_treedepth = 15)
# ,init = initial
)
rstan::traceplot(fit,pars=c("w"))
rstan::check_hmc_diagnostics(fit)