Hi all!

I am trying to use a hierarchical mixture model for a classification task. After fitting this model to some training data, I would like to evaluate it on validation or test data without further fitting of the hyperparameters. How would I do this in Stan?

There are two reasons why I would like to avoid the validation datapoints influencing the hyperparameters:

- Saving computation: fitting the model to the relatively large training set takes a while, and in the intended application the model should predict classifications for a single new group of data. It seems wasteful to repeat the full fit for each new classification, and keeping all the data in the fit would not be representative of that application.
- Fair comparison: the validation data could be useful for learning a better model, and to make a fair comparison against models which cannot take advantage of this, I would like to prevent it from happening.

Concretely, I have the following setup:

The data come from K subjects with N_k observations each, which are to be classified into two classes based on a feature f. The feature contains a random offset that is shared among all observations from the same subject. For simplicity's sake, assume there is only one feature dimension, which I model as f = \mu_C + \mu_k + N(0, \sigma_C). In Stan notation this is:

```
data {
  int<lower=1> N;                    // number of datapoints
  int<lower=1> K;                    // number of subjects
  real latents[N];                   // feature
  int<lower=1, upper=2> targets[N];  // targets, i.e. class labels (index mu and sigma)
  int<lower=1, upper=K> subject[N];  // which subject goes with which observation
}
parameters {
  real mu_k[K];            // subject shifts
  real mu[2];              // class means
  real<lower=0> sigma[2];  // standard deviation per class
  real<lower=0> sigma_pop; // standard deviation of mu_k
}
model {
  mu_k ~ normal(0, sigma_pop); // vectorized over subjects
  for (n in 1:N) {
    latents[n] ~ normal(mu[targets[n]] + mu_k[subject[n]], sigma[targets[n]]);
  }
}
```

This is shortened and simplified from the actual program. Please excuse typos etc.

After sampling this model's parameters (the class means and variances and sigma_pop), I get a new subject with N_k new observations f_val. I would like to use Stan to sample its subject mean mu_k and the class memberships of the new observations, without updating any of the estimates derived from the training data.

So far I have found two approaches, neither of which achieves my aims:

- passing f_val to the original model, which makes the fit depend on the validation data
- using transformed parameters or a new model, with which I did not manage to sample the mu_k for the new validation subject
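To make this concrete, here is a rough sketch of what I imagine the second (prediction) program could look like. I am assuming the hyperparameters are passed in as data (e.g. posterior means from the training fit) and the discrete class labels are marginalized out, since Stan cannot sample discrete parameters; the names M and f_val are placeholders, and a uniform class prior is assumed. I have not gotten this to work:

```
data {
  int<lower=1> M;           // number of observations for the new subject
  real f_val[M];            // features for the new subject
  real mu[2];               // class means, fixed from the training fit
  real<lower=0> sigma[2];   // class standard deviations, fixed from the training fit
  real<lower=0> sigma_pop;  // standard deviation of subject shifts, fixed from the training fit
}
parameters {
  real mu_k_new; // shift of the new subject (the only free parameter)
}
model {
  mu_k_new ~ normal(0, sigma_pop);
  for (m in 1:M) {
    // marginalize over the unknown class label with a uniform prior
    target += log_mix(0.5,
                      normal_lpdf(f_val[m] | mu[1] + mu_k_new, sigma[1]),
                      normal_lpdf(f_val[m] | mu[2] + mu_k_new, sigma[2]));
  }
}
generated quantities {
  // posterior probability of class 1 for each observation, given mu_k_new
  vector[M] p_class1;
  for (m in 1:M) {
    real lp1 = log(0.5) + normal_lpdf(f_val[m] | mu[1] + mu_k_new, sigma[1]);
    real lp2 = log(0.5) + normal_lpdf(f_val[m] | mu[2] + mu_k_new, sigma[2]);
    p_class1[m] = exp(lp1 - log_sum_exp(lp1, lp2));
  }
}
```

Even if something along these lines is the right direction, I am unsure whether fixing the hyperparameters at their posterior means, rather than somehow integrating over the posterior draws from the training fit, is acceptable.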

Any ideas how this could or should be done?