Hi,

I have recently been running posterior predictive simulations for certain subgroups of the original data, but I have run into some coding inefficiency problems. Suppose we have a hierarchical logistic regression model for 5 districts, and we would like to do posterior predictive simulation for each district separately. I don’t know whether Stan has any data structure like a list in R (I am an R user), so my current strategy is to split the original data into five pieces with the corresponding predictors. For example, in total I have n data points and the predictor X is of length n; I split the data into n1, x1, …, n5, x5, pass them as separate inputs in the ‘data’ block, define five y_pred vectors of length n1, n2, n3, n4 and n5, and then do the posterior predictive simulation separately for each of the five y_pred vectors.
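
To make the setup concrete, here is a rough Python sketch of the splitting described above. It is not my actual Stan code: the toy data, the helper names (`split_by_district`, `simulate_y_pred`), and the intercept and slope values standing in for a single posterior draw are all made up for illustration.

```python
import math
import random


def inv_logit(z):
    """Inverse logit, mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def split_by_district(x, district, n_districts):
    """Split the predictor into one list per district (labels 1..n_districts)."""
    pieces = [[] for _ in range(n_districts)]
    for xi, d in zip(x, district):
        pieces[d - 1].append(xi)
    return pieces


def simulate_y_pred(x_piece, alpha, beta, rng):
    """Draw one predictive replicate (0/1 outcomes) for a single district."""
    return [1 if rng.random() < inv_logit(alpha + beta * xi) else 0
            for xi in x_piece]


rng = random.Random(1)
n_districts = 5
n = 20

# Toy data: n predictor values and a district label for each data point.
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
district = [(i % n_districts) + 1 for i in range(n)]

# Hypothetical stand-in for one posterior draw:
# a district-level intercept per district and a shared slope.
alpha = [-1.0, -0.5, 0.0, 0.5, 1.0]
beta = 0.8

# The manual splitting: x1, ..., x5 with sizes n1, ..., n5.
x_pieces = split_by_district(x, district, n_districts)
sizes = [len(piece) for piece in x_pieces]

# One y_pred per district, each simulated separately.
y_preds = [simulate_y_pred(x_pieces[k], alpha[k], beta, rng)
           for k in range(n_districts)]
```

In Stan, each of the five pieces and each of the five y_pred vectors currently has to be declared by hand, which is the part that does not scale.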

However, this approach becomes inefficient and messy as the number of subgroups grows. Does Stan have a more efficient way to accomplish this?

Thx!