Hi, I have a censored data and I am trying to fit a hierarchical model with latent variable(Z). Here is the model

Z_{ij} \sim Bernoulli( P_z ).

\tilde{Y}_{ij} \sim Normal(\theta_i, \sigma).

Y_{ij} = M if Z_{ij} = 1 or \tilde{Y}_{ij} > M.

Y_{ij} = \tilde{Y}_{ij} if Z = 0 or \tilde{Y}_{ij} \leq M.

The model consider two types of censoring. Any observation has certain chance (P_z) to be censored. The other type of censoring is when the value gets too large.

When fitting the model I put a beta prior on P_z; a Gamma prior on \sigma and a normal prior on \theta_i's.

Here is the code where I used to generate the data.

```
i_size = 3
j_num = 500
sample_size = i_size * j_num
p_z = 0.15
M = 40
sigma = 1
np.random.seed(1)
z_list = np.random.binomial(n=1, p=p_z, size=sample_size)
theta_list = np.random.uniform(10, 40, size=i_size)
y_ct_list = np.random.normal(loc=np.repeat(theta_list, j_num), scale=sigma, size=sample_size)
y_list = np.repeat(np.nan, sample_size)
observed_boolean_list = (z_list == 0) & (y_ct_list <= M)
y_list[observed_boolean_list] = y_ct_list[observed_boolean_list]
x_array = np.repeat(np.arange(i_size), j_num)
x_mat = np.zeros((x_array.size, x_array.max()+1))
x_mat[np.arange(x_array.size), x_array] = 1
```

When I tried to fit the model I described above I tried to fit the equivalent mixture model instead.

```
data {
int<lower=0> N_cens;
int<lower=0> N_obs;
int<lower=1> i_size;
real M;
vector<lower=0>[N_obs] y_obs;
matrix[N_obs, i_size] x_obs;
matrix[N_cens, i_size] x_cens;
}
parameters {
real<lower=0, upper=1> p_z_cens;
real<lower=0> sigma;
vector<lower=0>[i_size] theta;
}
model {
sigma ~ gamma(0.2, 0.1);
theta ~ normal(20, 7);
p_z_cens ~ beta(2, 5);
for (i in 1:N_obs) {
target += log(1 - p_z_cens) + normal_lpdf(y_obs[i] | x_obs[i] * theta, sigma);
}
for (i in 1:N_cens) {
target += log_mix(p_z_cens , 0, normal_lccdf(M | row(x_cens, i) * theta, sigma));
//target += log_sum_exp(log(p_z_cens),log(1 - p_z_cens) + normal_lccdf(censored_ct | x_cens[i] * theta, sigma));
}
}
```

To explain what happens in the model block, when the data is not censored the likelihood f(y_{ij}) is

(1 - P_z) * normal\_pdf(y_{ij}). When the data is censored the likelihood is P(Y_{ij} = M) = P_z * 1 + (1 - P_z) * (1 - Normal\_cdf(M)).

Here is how I code the model_data.

```
model_data = {"N_cens": int(sum(np.isnan(y_list))), "N_obs": int(sum(~np.isnan(y_list))), "i_size": i_size,
"M": M, "y_obs": y_list[~np.isnan(y_list)],
"x_obs": x_mat[~np.isnan(y_list), :],
"x_cens": x_mat[np.isnan(y_list), :]}
```

There are several problems I encounter.

- The warning message says â€śThe parameter p_z_cens has 2 priors.â€ť which I donâ€™t see why it is the case.
- The trace plots of \sigma and P_z donâ€™t appear to be random when the number of replicates for each I(j_size) gets large. For example, one trace plot I got is

- The posterior distributions are not concentrating more and more around the true value as the sample size gets large. For \sigma, itâ€™s completely off.

If anyone can tell me what I did wrong (ie. I didnâ€™t use the correct likelihood or error in Stan code), I really appreciate it. Thanks!