High Pareto-k values for a CDM model on a real dataset

I am a student using the nimble package to estimate a cognitive diagnostic model (CDM). Each item measures only one attribute. The model I use is a summing (additive) cognitive diagnosis model, but since each item measures a single attribute, it is theoretically equivalent to DINA/DINO.
I am conducting a real-data study using the PISA 2015 science test dataset, which contains 450 subjects from three countries. I account for between-country differences by allowing the probability of each mastery pattern to differ across countries.
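
For reference, the item response function implemented in the model below is

$$\operatorname{logit}\, P(U_{ij} = 1 \mid \boldsymbol{\alpha}_i) = \lambda_{0j} + \sum_{k=1}^{K} \lambda_{jk}\, q_{jk}\, \alpha_{ik},$$

where $q_{jk}$ is the Q-matrix entry and $\alpha_{ik}$ the mastery indicator. Because each item measures exactly one attribute, at most one main-effect term is nonzero per item, which is why the model coincides with DINA/DINO in this design.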
After model estimation finished, I used loo to check model fit, but many of the Pareto k values are high.
I'm very confused.

  • Why are the Pareto k values so high?
  • How should I deal with this problem?

This is my loo estimation result:
```
Computed from 10000 by 7200 log-likelihood matrix.

         Estimate   SE
elpd_loo  -4219.6 41.1
p_loo       566.7 11.8
looic      8439.3 82.3
------
MCSE of elpd_loo is NA.
MCSE and ESS estimates assume independent draws (r_eff=1).

Pareto k diagnostic values:
                         Count Pct.    Min. ESS
(-Inf, 0.7]   (good)     5266  73.1%   2505
   (0.7, 1]   (bad)       390   5.4%   <NA>
   (1, Inf)   (very bad) 1544  21.4%   <NA>
See help('pareto-k-diagnostic') for details.
```

This is my model:

```r
# Define the model code
code <- nimbleCode({
  # Examinees' responses
  for (i in 1:I) {
    for (j in 1:J) {
      for (k in 1:K) { att[i, j, k] <- alpha[i, k] * Q[j, k] }
      gamma_me[i, j] <- lambda1[j] * att[i, j, 1] +
                        lambda2[j] * att[i, j, 2] +
                        lambda3[j] * att[i, j, 3]
      logit(response[i, j]) <- lambda0[j] + gamma_me[i, j]
      # With a single trial, Bernoulli and binomial give the same results
      U_ij[i, j] ~ dbern(response[i, j])
      # Generate posterior predictive draws
      y_rep[i, j] ~ dbern(response[i, j])
      # Pointwise Bernoulli log-likelihood, stored in (i, then j) order
      loglikelihood[(i - 1) * J + j] <- U_ij[i, j] * log(response[i, j]) +
                                        (1 - U_ij[i, j]) * log(1 - response[i, j])
    }
    # Latent class membership, with country-specific class probabilities
    C_per[i] ~ dcat(pai[1:C, cnt[i]])
    for (k in 1:K) {
      alpha[i, k] <- ks[C_per[i], k]
    }
  }

  # Calculate deviance
  deviance <- -2 * sum(loglikelihood[1:(I * J)])
  # Calculate the total log-likelihood
  # totalLogLik <- sum(loglikelihood[1:(I * J)])

  # Priors: one Dirichlet over the class proportions per country
  for (guo in 1:3) {
    pai[1:C, guo] ~ ddirch(ks_del[1:C, guo])
  }

  # Item parameters
  for (j in 1:J) {
    lambda0[j]  ~ dnorm(-1.8, sd = 0.5)
    xlambda1[j] ~ T(dnorm(3.5 / K_star[j], sd = 0.5), 0, )
    xlambda2[j] ~ T(dnorm(3.5 / K_star[j], sd = 0.5), 0, )
    xlambda3[j] ~ T(dnorm(3.5 / K_star[j], sd = 0.5), 0, )
    # The Q-matrix zeroes out main effects for unmeasured attributes
    lambda1[j] <- Q[j, 1] * xlambda1[j]
    lambda2[j] <- Q[j, 2] * xlambda2[j]
    lambda3[j] <- Q[j, 3] * xlambda3[j]
  }
})

# Define the data
data <- list(
  U_ij   = U_ij,
  K_star = K_star,
  ks     = ks,
  ks_del = ks_del
)

# The modified constants removed L_theta
constants <- list(
  I   = I,
  J   = J,
  K   = K,
  Q   = Q,
  C   = C,
  cnt = cnt
)

# Define the initial values
inits1 <- list(
  C_per = sample(C, I, replace = TRUE),
  pai   = matrix(rep(1 / C, C * 3), C, 3),
  alpha = matrix(sample(c(0, 1), I * K, replace = TRUE), I, K),
  lambda0  = lambda0,
  lambda1  = lambda1,
  lambda2  = lambda2,
  lambda3  = lambda3,
  xlambda1 = xlambda1,
  xlambda2 = xlambda2,
  xlambda3 = xlambda3
)

model <- nimbleModel(
  code      = code,
  data      = data,
  constants = constants,
  inits     = inits1
)
```

This is the code I use to compute loo:

```r
library(MCMCvis)
library(loo)

# Extract the MCMC samples of the loglikelihood nodes
log_lik_samples <- MCMCchains(samples$samples, params = "loglikelihood")
# Convert to a matrix: each row is an iteration, each column a data point (I*J columns)
log_lik_matrix <- as.matrix(log_lik_samples)

# Suppose the data dimensions are I = number of subjects and J = number of items
log_lik_array <- array(NA, dim = c(nrow(log_lik_matrix), I, J))
for (i in 1:I) {
  for (j in 1:J) {
    log_lik_array[, i, j] <- log_lik_matrix[, (i - 1) * J + j]
  }
}

# Flatten the three-dimensional array back to a matrix (iterations x data points).
# Note: matrix() flattens column-major, so the subject index varies fastest here
# and the column order differs from log_lik_matrix.
log_lik_flat <- matrix(log_lik_array, nrow = dim(log_lik_array)[1], ncol = I * J)

# Compute LOO
loo_result <- loo(log_lik_flat)
# Print the result
print(loo_result)
```
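
The printout also says the MCSE and ESS estimates assume independent draws (r_eff = 1). Here is a sketch of supplying relative efficiencies instead, assuming 4 equal-length chains stacked row-wise (which is how MCMCchains() combines an mcmc.list):

```r
# Supply relative efficiencies so loo's ESS/MCSE reflect MCMC autocorrelation.
# chain_id assumes 4 equal-length chains stacked row-wise (an assumption).
n_per_chain <- nrow(log_lik_flat) / 4
chain_id    <- rep(1:4, each = n_per_chain)
r_eff       <- loo::relative_eff(exp(log_lik_flat), chain_id = chain_id)
loo_result  <- loo(log_lik_flat, r_eff = r_eff)
print(loo_result)
```

This mainly corrects the ESS/MCSE reporting; I don't expect it to resolve the high Pareto k values themselves.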