Many groups with a single observation (mix of longitudinal + cross-sectional)

Our data have 90% of people observed once and 10% observed at two or more time points. Our model includes person-level “random” effects modeled as normal(0, sigma). The dataset is huge, so even with only 10% of people observed repeatedly, we can estimate sigma. Are there modeling or computational recommendations for this scenario that would improve or speed up model fitting?

Some references:

When everything is normal, the person-specific intercepts can (and often should) be integrated out analytically, leaving a posterior over the remaining parameters, including the standard deviation \sigma of the intercepts across people. For outcomes that are not conditionally normal, but where the model still includes only one person-specific intercept, you can still integrate the intercepts out, but you have to do so numerically. These days that can be done in Stan code with integrate_1d, and in parallel across people with map_rect.
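To make the all-normal case concrete, here is a minimal sketch in plain Python (not Stan) of what "integrating out" a person's intercept means. It assumes a model that is not spelled out in the thread: y_j | a ~ normal(mu + a, sigma_y) with a ~ normal(0, sigma), where mu, sigma_y, and all function names are illustrative. The closed form is compared against brute-force integration over the intercept, which stands in for what a numerical integrator would do in the non-normal case.

```python
import math


def normal_logpdf(x, mean, sd):
    """Log density of normal(mean, sd) evaluated at x."""
    z = (x - mean) / sd
    return -0.5 * math.log(2.0 * math.pi) - math.log(sd) - 0.5 * z * z


def marginal_loglik(y, mu, sigma, sigma_y):
    """Log density of one person's observations with the intercept integrated out.

    Assumed model (hypothetical): y_j | a ~ normal(mu + a, sigma_y), a ~ normal(0, sigma).
    Marginally, y is multivariate normal with covariance sigma_y^2 * I + sigma^2 * 1 1^T,
    whose determinant and inverse have simple closed forms.
    """
    n = len(y)
    resid = [v - mu for v in y]
    sum_r = sum(resid)
    sum_r2 = sum(r * r for r in resid)
    v_a, v_y = sigma ** 2, sigma_y ** 2
    log_det = (n - 1) * math.log(v_y) + math.log(v_y + n * v_a)
    quad = (sum_r2 - v_a * sum_r ** 2 / (v_y + n * v_a)) / v_y
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)


def numeric_loglik(y, mu, sigma, sigma_y, n_grid=4001, width=8.0):
    """Same quantity by trapezoid integration over the intercept a."""
    lo, hi = -width * sigma, width * sigma
    h = (hi - lo) / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        a = lo + k * h
        log_f = normal_logpdf(a, 0.0, sigma)
        log_f += sum(normal_logpdf(v, mu + a, sigma_y) for v in y)
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        total += weight * math.exp(log_f)
    return math.log(total * h)


# Illustrative values only, not taken from any real dataset.
mu, sigma, sigma_y = 0.5, 1.5, 0.7
y_once = [1.2]              # a person observed once
y_repeat = [1.2, 0.4, 2.0]  # a person observed three times

print(marginal_loglik(y_once, mu, sigma, sigma_y))
print(marginal_loglik(y_repeat, mu, sigma, sigma_y))
print(numeric_loglik(y_repeat, mu, sigma, sigma_y))
```

Under these assumptions, a person observed once contributes simply normal(mu, sqrt(sigma^2 + sigma_y^2)), so the 90% of people with a single observation need no per-person parameter at all. Only the repeated-measures people carry the information that separates sigma from sigma_y.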
