We have binary longitudinal survey data, with some people measured repeatedly over time: y_{ij} \in \{0,1\} for person i and measurement j, taken at time t_{ij}.

We fit a model for E[y_{ij}\ |\ x_{ij}, t_{ij}], which we will then project to the target population distribution of p(x_{ij}) (for some time t_{ij} = t) as in Ghitza and Gelman (2020).

So in the language of Longitudinal Data Analysis (p.9), we want marginal (population-averaged) models E[y_{ij}\ |\ x_{ij}, t_{ij}], not subject-specific ("random effects"/"mixed") models E[y_{ij}\ |\ x_{ij}, t_{ij}, b_i] where b_i is person i's "random effect".

Two options:

Fit a "random effects" model and marginalize over the person effects b_i. We have to be careful about mis-specifying the distribution p(b_i); see Antonelli et al. (2016). Has anyone implemented this non-parametric person effects model in Stan?
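One crude step short of a fully nonparametric prior is a finite mixture of normals for b_i, which already relaxes the unimodal-Gaussian assumption. A minimal sketch of evaluating such a mixture density (the weights, locations, and scales below are hypothetical):

```python
import numpy as np
from scipy.special import logsumexp

def mixture_logpdf(b, weights, mus, sigmas):
    """Log-density of a finite Gaussian mixture, a flexible stand-in
    for a nonparametric person-effect distribution p(b_i)."""
    b = np.atleast_1d(b)
    # component log-densities, shape (len(b), n_components)
    comp = (-0.5 * ((b[:, None] - mus) / sigmas) ** 2
            - np.log(sigmas) - 0.5 * np.log(2 * np.pi))
    return logsumexp(comp + np.log(weights), axis=1)

# hypothetical bimodal person-effect distribution
w, mu, sd = np.array([0.5, 0.5]), np.array([-2.0, 2.0]), np.array([0.7, 0.7])
dens = np.exp(mixture_logpdf(np.array([-2.0, 0.0, 2.0]), w, mu, sd))
```

In Stan the same idea would be a `log_sum_exp` over component log-densities inside the prior on b_i, with the number of components fixed in advance.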

Fit the marginal model. But how do we model dependence among a person's measurements? What is a convenient joint distribution for multivariate binary data? One option uses latent logistic variables (see O'Brien and Dunson (2004)):
y_{ij} = I(L_{ij} \leq f(x_{ij}, t_{ij})), \qquad (L_{i1}, \ldots, L_{i n_i}) \sim \text{Multivariate Logistic}(0, R),
where marginally E[y_{ij}\ |\ x_{ij}, t_{ij}] = \text{logit}^{-1}(f(x_{ij},t_{ij})). Has anyone implemented this multivariate logistic in Stan?
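To make the marginal-logit property concrete, here is a simulation sketch using a Gaussian copula with logistic margins as a stand-in for the multivariate logistic of O'Brien and Dunson (the copula construction is my substitution, not their exact distribution; `f` and `R` are illustrative values):

```python
import numpy as np
from scipy.stats import norm, logistic

rng = np.random.default_rng(1)

def correlated_binary(f, R, size):
    """Simulate y_j = I(L_j <= f_j) where the latents L_j have logistic
    margins and dependence induced by a Gaussian copula with matrix R."""
    z = rng.multivariate_normal(np.zeros(len(f)), R, size=size)
    L = logistic.ppf(norm.cdf(z))   # correlated, logistic marginals
    return (L <= f).astype(int)

f = np.array([0.5, -0.5])               # linear predictors f(x_ij, t_ij)
R = np.array([[1.0, 0.6], [0.6, 1.0]])  # latent correlation
y = correlated_binary(f, R, size=200_000)
# marginally E[y_j] ~= inv_logit(f_j), while y_1 and y_2 remain dependent
```

The point of any such latent construction is exactly what the post asks for: the marginal means stay logit-linear while R induces within-person dependence.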

Does anyone have another idea for marginal models for longitudinal binary data in Stan ?

One possible way forward is to choose some candidate distribution for the random effect and then assess whether that choice looks okay via mixed predictive checking. For example, with a Gaussian random-effect distribution, you could choose as your discrepancy measure a Shapiro-Wilk test statistic computed over your fitted random effects, and then ask how this compares per-iteration to the Shapiro-Wilk statistic computed over a random draw from the hyperparameters.

I've done some of this with huge data sets, and what I find in general is that it is quite hard to reject the assumption of normality unless a substantial fraction of the parameters are informed by at least several Bernoulli trials. If a subset of the parameters is strongly informed and a lot of the parameters are weakly informed, it can be more powerful to compute the discrepancy measure just over the strongly informed parameters. So depending on your data this test might not be all that sensitive to misspecification; but on the other hand, if a test like this isn't sensitive to misspecification, then presumably a more flexible class of random-effect distributions will lead to inference that is weakly identified at best. If you are able to detect misspecification like this, then you could graphically examine the distributions of the random-effect parameters to get ideas for a better parametric distribution of the random effect.
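A minimal sketch of that per-iteration comparison, assuming you have posterior draws of the person effects and their scale (the arrays `b_draws` and `sigma_draws` below are hypothetical placeholders for your fitted quantities):

```python
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(0)

def mixed_predictive_shapiro(b_draws, sigma_draws):
    """For each posterior iteration, compare the Shapiro-Wilk W of the
    fitted random effects to the W of a fresh draw from the assumed
    N(0, sigma) distribution; return the fraction of iterations in
    which the fitted effects look *less* normal than the replicate."""
    n_iter, n_person = b_draws.shape
    worse = 0
    for s in range(n_iter):
        w_fit, _ = shapiro(b_draws[s])
        w_rep, _ = shapiro(rng.normal(0.0, sigma_draws[s], n_person))
        worse += (w_fit < w_rep)
    return worse / n_iter

# hypothetical posterior draws: a bimodal "fitted" effect distribution
n_iter, n_person = 50, 200
modes = rng.choice([-2.0, 2.0], size=n_person)
b_draws = modes + rng.normal(0.0, 0.3, size=(n_iter, n_person))
sigma_draws = np.full(n_iter, b_draws.std())
p_worse = mixed_predictive_shapiro(b_draws, sigma_draws)
# p_worse near 1 flags misspecification of the Gaussian assumption
```

Since Shapiro-Wilk's W shrinks toward 0 for non-normal samples, a posterior p-value (`p_worse`) near 1 is the misspecification signal described above.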

Finally, this is a bit off-the-wall, but I just figured I'd mention that if the outcome is zero-inflated (i.e. if the random effect distribution is negative-infinity-inflated), then this model is mathematically equivalent to what ecologists call an "occupancy model" and you might find some useful resources in that direction. To be clear, traditionally occupancy models include only a zero-inflation term and not an additional subject-level random effect, but there's no reason in principle that you couldn't then include an additional random effect in the linear predictor for the mean conditional on not being in the zero-inflation state.

The inner expectation is the subject-specific ("random effects"/"mixed") model, which as you say can be logistic, with closed form e.g.: E[y_{ij}\ |\ x_{ij}, t_{ij}, b_i] = \text{logit}^{-1}(\beta x_{ij} + \alpha_{t_{ij}} + b_i).

The outer expectation is over the "random effect" distribution of b_i\ |\ x_{ij}, t_{ij}. Taking this expectation doesn't in general have a closed form. More seriously, it is sensitive to the distribution of b_i\ |\ x_{ij}, t_{ij}; see Antonelli et al. (2016).
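For a Gaussian random effect, the outer expectation can at least be approximated numerically, e.g. with Gauss-Hermite quadrature. A sketch illustrating the familiar attenuation of the subject-specific curve (the N(0, \sigma^2) assumption here is exactly the part that the sensitivity concern is about):

```python
import numpy as np
from scipy.special import expit  # expit = inverse logit

def marginal_prob(eta, sigma, n_nodes=40):
    """E_b[inv_logit(eta + b)] for b ~ N(0, sigma^2),
    approximated by Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    # change of variables b = sqrt(2) * sigma * node for the N(0, sigma^2) weight
    vals = expit(eta + np.sqrt(2.0) * sigma * nodes)
    return np.sum(weights * vals) / np.sqrt(np.pi)

eta = 1.0
p0 = marginal_prob(eta, 0.0)  # no random effect: equals expit(1.0)
p2 = marginal_prob(eta, 2.0)  # attenuated toward 0.5
```

This also shows why the marginal and subject-specific coefficients differ: averaging the sigmoid over b pulls the marginal probability toward 0.5, by an amount that depends on the assumed p(b_i).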

Yes, we did some posterior predictive checks for quantities we care about, which alerted us to the possible misspecification of the "random effects".

And plotting the distribution of estimated "random effects" (as you suggest) reveals multi-modality.

Really interesting pointer to the "occupancy models"! Our outcome isn't zero-inflated, but perhaps the technique points to helpful ideas for more flexibly modeling "random effects" in general. Thanks so much!

This is a shot in the dark, but if you have relatively few time points then it may be feasible to structure the model as categorical sequences of 0s/1s rather than multivariate logistic. For example, if any given respondent has at most 3 observations across time, you'd be predicting membership into one of eight sequences. For respondents who are missing observations, you would then have to marginalize over all remaining possible sequences. Instead of assuming a distribution for the random effects, you could freely model the baseline probability of being in each trajectory. I assume this would address the issue you highlight regarding normally-distributed random effects.
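A sketch of that bookkeeping for T = 3 (the sequence probabilities `p_seq` are hypothetical; in the model they would come from, e.g., a softmax over free parameters):

```python
import itertools
import numpy as np

T = 3
sequences = list(itertools.product([0, 1], repeat=T))  # 2^3 = 8 trajectories

# hypothetical baseline probabilities over the 8 sequences
# (in the model these would be estimated freely)
rng = np.random.default_rng(3)
p_seq = rng.dirichlet(np.ones(len(sequences)))

def partial_likelihood(obs, p_seq):
    """Likelihood of a partially observed trajectory: sum p_seq over
    all full sequences consistent with the observed entries.
    obs: length-T list of 0, 1, or None (missing)."""
    return sum(p for s, p in zip(sequences, p_seq)
               if all(o is None or o == s_t for o, s_t in zip(obs, s)))

# respondent observed only at times 1 and 3: marginalize over time 2
lik = partial_likelihood([1, None, 0], p_seq)
```

Since nothing constrains the eight probabilities toward a unimodal shape, this side-steps the Gaussian random-effect assumption entirely, at the cost of 2^T growth in the number of trajectories.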

So, I experimented with an approximation approach to this problem that, after very limited testing, doesn't appear to fail terribly. At some point I might test it more thoroughly. I use an extended Kalman filter to update the latent states and transform them to the linear predictor, then a) compute the likelihood conditional on the predictor exactly, and b) update the latent states assuming the residuals are normally distributed, which they aren't. You can see an example with limited explanation at Binary data in state space models | Charles Driver
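A toy version of that kind of filter for a single latent state with a Bernoulli observation, linearizing the inverse-logit link at the predicted mean (the random-walk dynamics and variances below are hypothetical, not the linked implementation):

```python
import numpy as np
from scipy.special import expit

def ekf_bernoulli(ys, q=0.1, m0=0.0, P0=1.0):
    """Extended-Kalman-style filter for y_t ~ Bernoulli(inv_logit(x_t)),
    with random-walk dynamics x_t = x_{t-1} + N(0, q).  The update
    linearizes the link at the predicted mean and treats the Bernoulli
    residual as Gaussian with variance p(1-p) -- the (wrong but
    workable) normality assumption described in the post."""
    m, P = m0, P0
    means = []
    for y in ys:
        P = P + q                       # predict (random-walk state)
        p = expit(m)                    # predicted success probability
        H = p * (1.0 - p)               # Jacobian of inv_logit at m
        S = H * P * H + p * (1.0 - p)   # innovation variance (Gaussian approx)
        K = P * H / S                   # Kalman gain
        m = m + K * (y - p)             # update with the binary residual
        P = (1.0 - K * H) * P
        means.append(m)
    return np.array(means)

# a run of ones should pull the latent state (and hence p) upward
states = ekf_bernoulli(np.ones(50))
```

The exact-likelihood part of the approach is separate: conditional on the filtered predictor, the Bernoulli likelihood can still be evaluated exactly; only the state update uses the Gaussian residual approximation.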