I am analyzing a longitudinal data set where observations for time t (level 1) are nested within units i (level 2). The baseline model with random intercept and slope looks something like this:
y_ti = beta_0i + beta_1i * x1_ti + e_ti, with
beta_0i = gamma_00 + gamma_10 * a_i + u_0i
beta_1i = gamma_10 + gamma_11 * a_i + u_1i
and a bivariate normal structure for the second level error terms. In the original model, x1 corresponds to a unit-specific trend. In the second half of the observation period, some units were exposed to a treatment. All units were observed at the same points in time, however, the treatment occured at different points in time for different units (e.g., for unit 1 in period 15 and for unit 2 in period 25). In addition, some of the units were not exposed to the treatment at all. Also, there is another variable (x2) and I would like to find out if the associated random slope differs before and after the treatment occured. This means I am not only interested in the variation in slopes between units for beta_2, but also in the within-unit variation in slopes before and after the treatment occured for those units that were exposed to the treatment. In the following equation, I added the fixed treatment effect and the additional random slope:
y_ti = beta_0i + beta_1i * x1_ti + beta_2i * x2_ti + beta_3 * treat_ti + e_ti
So basically, I am interested in the interaction between the treatment variable (fixed effect) and the x2 variable (random effect). To solve this, my current approach is to model the outcome for time t (level 1) nested within units i (level 2), and units i cross-classified by status j, i.e., treatment and non-treatment (level 3). The cross-classified approach seems to be a possible solution, since the units are not nested in the higher-order treatment and non-treatment groups, but may belong to the one or the other at different points in time. My current approach is therefore:
y_tij = beta_0i + beta_1i * x1_ti + beta_2ij * x2_tij + beta_3 * treat_ti + e_tij
This gives me two coefficients (slopes) for beta_2 for each treated unit (before and after) and one coefficient (slope) for each non-treated unit. This information could be used to explore the effect of the treatment on the random slope associated with variable x2.
I attached a reproducible example using simulated data. The model runs fine (also on the original data) and recovers the parameters well. So my question is: is this a reasonable modeling approach, or are there better ways to model this interaction?
Any help is much appreciated!
Run file: run_file.R (6.1 KB)
Stan model: model.stan (2.3 KB)