Cross-classified multiple membership models with brms

Controlling for the outcome at T1 when predicting the outcome at T2 by including T1 as a predictor in the model is flawed: it assumes that T1 is observed without measurement error, and (depending on the amount of measurement error) it will suffer from regression towards the mean.

The most correct way - that I know of - is to model both T1 and T2 as the response variable, with a predictor for T1 vs. T2 (i.e. time), the grouping variable(s), and an interaction between time and the grouping variable if you want to quantify potentially different responses over time between groups, plus a random intercept for individuals. An alternative would be to use T1 as a predictor but explicitly model the measurement error/variation in T1 (possible in brms when the measurement-error SD is known, and more generally with custom Stan code). However, you are then decoupling the estimation of measurement error in T1 from the estimation of measurement error in T2, which wastes statistical power. So, in my experience, the first approach described in this paragraph is the more parsimonious one.
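As an aside, here is a minimal sketch of that second approach using brms's me() terms, assuming hypothetical columns T1_obs (the observed T1 score) and T1_se (its known standard error); both names are just for illustration:

library(brms)

# T1 enters as a latent predictor measured with known error T1_se
fit_me <- brm(
  T2 ~ me(T1_obs, T1_se) + (1 | class),
  data = d,
  family = gaussian()
)

Note that you have to supply T1_se yourself, which is exactly the decoupling problem described above.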

See also my follow-up post to the one you are linking to: Different Intercept Terms in Frequentist and Bayesian Regression - #6 by LucC

Not exactly. Model T1 as an outcome, just like T2, and include a random intercept for individuals. For this to work, you need to organise the data in long format:

ID  class  teacher  time  response
1   A      Johnson  T1    4
1   A      Johnson  T2    5
2   A      Johnson  T1    2
2   A      Johnson  T2    4
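If your data start in wide format (one response column per timepoint), something along these lines reshapes them; the column names are just the ones from the toy table above:

library(tidyr)

# Hypothetical wide-format data: one column per timepoint
d_wide <- data.frame(
  ID = c(1, 2),
  class = c("A", "A"),
  teacher = c("Johnson", "Johnson"),
  T1 = c(4, 2),
  T2 = c(5, 4)
)

# One row per individual x timepoint
d_long <- pivot_longer(d_wide, cols = c(T1, T2),
                       names_to = "time", values_to = "response")

With the data in that shape, the basic model is: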

response ~ time + (1 | ID) + (1 | class) + (1 | teacher)
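In brms, a minimal call could look like this (Gaussian family and default priors assumed purely for illustration):

library(brms)

fit <- brm(
  response ~ time + (1 | ID) + (1 | class) + (1 | teacher),
  data = d_long,
  family = gaussian()
)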

But that will only tell you the average change in response over time (with a correct standard error for that average change). If you want to know, e.g., whether the changes depend on the teacher, you would model:

response ~ time * teacher + (1 | ID) + (1 | class)

Note that all of these models use the teacher and class information very simplistically, i.e. as “observed” at the moment of the response. If a person changed class or teacher one day before a response, the model will not distinguish that change from one that happened earlier in the year. To account for this, if possible, you could quantify e.g. the cumulative teacher-time experienced by an individual up to the point of each response, as in the multi-membership sketch below.
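If you can construct such exposure measures, brms's multi-membership terms are one way to use them. A sketch, continuing with d_long from above and assuming hypothetical columns teacher1/teacher2 (the teachers an individual was exposed to) and w1/w2 (proportions of time spent with each, summing to 1 per row), all of which you would have to derive yourself:

# Individuals belong to several teachers, weighted by exposure time
fit_mm <- brm(
  response ~ time + (1 | ID) + (1 | class) +
    (1 | mm(teacher1, teacher2, weights = cbind(w1, w2))),
  data = d_long,
  family = gaussian()
)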

You can add random slopes on top of all of this, if you like; see the sketch below for the simple case. It might get quite messy though if you are also modelling interactions. In that case I would probably recode the interactions into explicit dummy variables and build custom code to add a random slope effect to all of those interactions at once (i.e. as an uncentered, scaled random effect).
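For the simpler case without fixed interactions, a random-slope version could look like this (letting the effect of time vary across classes and teachers, again using d_long from above):

fit_rs <- brm(
  response ~ time + (1 | ID) + (1 + time | class) + (1 + time | teacher),
  data = d_long,
  family = gaussian()
)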