Multilevel model for longitudinal data: fixed x random interaction (crossed random effects)

I am analyzing a longitudinal data set where observations for time t (level 1) are nested within units i (level 2). The baseline model with random intercept and slope looks something like this:

y_ti = beta_0i + beta_1i * x1_ti + e_ti, with

beta_0i = gamma_00 + gamma_10 * a_i + u_0i
beta_1i = gamma_10 + gamma_11 * a_i + u_1i

and a bivariate normal structure for the second level error terms. In the original model, x1 corresponds to a unit-specific trend. In the second half of the observation period, some units were exposed to a treatment. All units were observed at the same points in time, however, the treatment occured at different points in time for different units (e.g., for unit 1 in period 15 and for unit 2 in period 25). In addition, some of the units were not exposed to the treatment at all. Also, there is another variable (x2) and I would like to find out if the associated random slope differs before and after the treatment occured. This means I am not only interested in the variation in slopes between units for beta_2, but also in the within-unit variation in slopes before and after the treatment occured for those units that were exposed to the treatment. In the following equation, I added the fixed treatment effect and the additional random slope:

y_ti = beta_0i + beta_1i * x1_ti + beta_2i * x2_ti + beta_3 * treat_ti + e_ti

So basically, I am interested in the interaction between the treatment variable (fixed effect) and the x2 variable (random effect). To solve this, my current approach is to model the outcome for time t (level 1) nested within units i (level 2), and units i cross-classified by status j, i.e., treatment and non-treatment (level 3). The cross-classified approach seems to be a possible solution, since the units are not nested in the higher-order treatment and non-treatment groups, but may belong to the one or the other at different points in time. My current approach is therefore:

y_tij = beta_0i + beta_1i * x1_ti + beta_2ij * x2_tij + beta_3 * treat_ti + e_tij

This gives me two coefficients (slopes) for beta_2 for each treated unit (before and after) and one coefficient (slope) for each non-treated unit. This information could be used to explore the effect of the treatment on the random slope associated with variable x2.

I attached a reproducible example using simulated data. The model runs fine (also on the original data) and recovers the parameters well. So my question is: is this a reasonable modeling approach, or are there better ways to model this interaction?

Any help is much appreciated!

Run file: run_file.R (6.1 KB)
Stan model: model.stan (2.3 KB)

Hi,
at first glance this looks OK, but this is hard to tell without fully understanding your data, which I do not. Generally you can check that your model is appropriate with prior predictive checks and posterior predictive checks. The Visualisation paper has some nice thoughts on this. Also you can search this forum or the whole Internet for these terms and you should find a lot of useful thoughts.

Also note that you can use MathJax (some hints at https://math.meta.stackexchange.com/questions/5020/mathjax-basic-tutorial-and-quick-reference) to display your Math nicely in your posts.

2 Likes

And if it’s helpful all the code we used for the paper is available at