Hi,
I have some problems with specifying a model and the models listed in the user guide weren’t quite applicable or at least I could not see that they were. Here is my problem:

I observe discrete timepoints of death of some test organisms. I assume that the observed survival times (or. time of observed death  ToD) follow a Gamma distribution.

Additionally I observe that the distribution has three modes, which I can explain by three predictors which produce the different effects.

Furthermore it is censored after the end of the experiment, although some individuals are still alive.
So far I have been working with a finite mixturemodel of Gamma distributions. Which I adapted mainly from the finite mixtures section in the user’s guide and the censoring section.
data {
int<lower=1> K; // number of mixture components
int<lower=1> N; // number of data points
real y[N]; // observations (death counts at day)
vector[N] UL;
matrix[N,K] weights; // matrix of K predictor weights (N Ksimplexes)
}
parameters {
vector<lower=0>[K] mu; // locations of mixture components
vector<lower=0>[K] sigma; // scales of mixture components
}
transformed parameters {
vector[K] shape;
vector[K] rate;
shape = square(mu) ./ square(sigma);
rate = mu ./ square(sigma);
}
model {
matrix[N,K] log_weight; // cache logweights calculation
mu ~ normal(0, 10);
sigma ~ normal(0, 10);
log_weight = log(weights);
for (n in 1:N) {
// weighing
row_vector[K] lps = log_weight[n];
for (k in 1:K)
// censoring
if(y[n] < UL[n]){
lps[k] += gamma_lpdf(y[n]  shape[k], rate[k]);
} else {
lps[k] += gamma_lccdf(UL[n]  shape[k], rate[k]);
}
target += log_sum_exp(lps);
}
}
This has been working satisfactory until I realized that that each individual time of death is better modeled if it would be caused by one single predictor and not a weighed mixture.
After some more reading in the forum I realized that this is a general issue that discrete parameters can’t (yet) be sampled in Stan and are instead better marginalized out, which is more efficient. So far so good. What I don’t understand at this point is the following:
5.2 Latent Discrete Parameterization
…
This model is not directly supported by Stan because it involves discrete parameters zn, but Stan can sample μ and σ by summing out the z parameter as described in the next section.
.
5.3 Summing out the Responsibility Parameter
To implement the normal mixture model outlined in the previous section in Stan, the discrete parameters can be summed out of the model. If Y is a mixture of K normal distributions with locations μk and scales σk with mixing proportions λ in the unit Ksimplex, then
p_Y(y~ ~ \lambda, \mu, \sigma) = \sum_{k=1}^K \lambda_k~ normal(y~~\mu_k, \sigma_k)
The way this is coded in stan appears to model each observation y_n as the sum of all K weighed probabilities. Is this an approximation, which works well for mixtures? I can see that all Y observations can be represented as the quoted formula but I seem to be missing how this also applies to any single observation. Could someone explain this to me? I’m sure I missed something.
At the moment I am trying to implement this model with latent discrete paramters like in the change point model (7.2  Stan User’s Guide). Is this generally recommendable for my problem or is there a better alternative?
Thank you, Florian