(Note: edited to remove reference to hierarchical models specifically)

So the manual shows how models with repeated binary outcomes can have the likelihood part of computation accelerated through use of sufficient stats, and we’ve worked out the same acceleration for repeated Gaussian outcomes.

These strategies work great when repeated observations is ubiquitous at the level of the likelihood (ex. in a hierarchy of schools as groups and students as observations there’s multiple students in each school, in a hierarchy of human respondents on a survey and survey items as observations there’s multiple items for each human, etc.). However, it’s often the case that data contains mostly repeated observations per group, but some hierarchical groups have only one observation. I those cases, I’ve thusfar been reverting back to the non-accelerated “observation-level” computation of the likelihood, assuming that if I naively incremented the target using sufficient stats for those groups with repetitions then separately incremented the target using the observation-level likelihood, that this would not be appropriate (possibly reflecting giving higher weight to one or the other?).

So I guess the first question is: Is it indeed inappropriate to mix sufficient-stats increments and observation-level increments in a model? To be concrete, this would be the minimal binomial version:

```
data{
int n_obs_groups_with_reps ; // number of observation groups with repetitions
array[n_obs_groups_with_reps] int total_observations_per_group ;
array[n_obs_groups_with_reps] int total_ones_per_group ;
int n_obs_groups_without_reps ; // number of observation groups without repetitions
array[n_obs_groups_without_reps] int single_observation_per_group ;
}
parameters{
real<lower=0, upper=1> theta;
}
model{
// sufficient-stats increment for those with reps:
total_ones_per_group ~ binomial( total_observations_per_group , theta ) ;
// observation-level increment for those without reps:
single_observation_per_group ~ bernoulli( theta ) ;
}
```

And for the Gaussian version it would be:

```
data{
int n_obs_groups_with_reps ; // number of observation groups with repetitions
vector[n_obs_groups_with_reps] number_of_observations_per_group ;
vector[n_obs_groups_with_reps] mean_of_observations_per_group ;
vector[n_obs_groups_with_reps] variance_of_observations_per_group ;
int n_obs_groups_without_reps ; // number of observation groups without repetitions
vector[n_obs_groups_without_reps] single_observation_per_group ;
}
transformed data{
vector[n_obs_groups_with_reps] sqrt_n = sqrt(number_of_observations_per_group) ;
vector[n_obs_groups_with_reps] sd_gamma_shape = (number_of_observations_per_group - 1.0) / 2.0 ;
}
parameters{
real mu;
real<lower=0> sigma;
}
model{
mu ~ std_normal() ;
sigma ~ std_normal() ;
// sufficient-stats increment for those with reps:
mean_of_observations_per_group ~ normal( mu, sigma ./ sqrt_n ) ;
variance_of_observations_per_group ~ gamma(
sd_gamma_shape
, sd_gamma_shape ./ pow(sigma,2)
) ;
// observation-level increment for those without reps:
single_observation_per_group ~ normal( mu, sigma ) ;
}
```

If that naive approach is indeed incorrect, presumably there’s some correction factor that needs to be applied to account for the mixture of increment kinds; any idea what that correction might be? Maybe something like multiplying each increment of the sufficient stats kind by the number of observations that were collapsed to a single stat for that increment?