Proper specification of a multiple membership model including multiple covariates that vary over different levels of multi-membership grouping factors

I cannot share the data as there are unique identifiers of teachers and students still contained therein. Either way, my question is general enough that I don’t think it’s necessary to have specific data.

I have data with the following structure. My outcome variable varies at the student level (call it academic achievement for ease). Students are nested in one or two teachers. We have a few measurements on each teacher; let’s use years of teaching experience as an example.

Our interest is exploring teacher effects on students. The variance components model is easy enough to set up:

StudentOutcome ~ 1 + (1 | mm(Teacher_1, Teacher_2))

Adding in a level 2 (teacher) predictor makes things more complicated. From what I can tell, this would be the right way to specify a model where student outcomes are predicted by a random teacher effect and a fixed effect measured at the teacher level.

StudentOutcome~TeacherYearsTeaching_Avged +
 (1+mmc(TeacherYearsTeaching_Teach1, TeacherYearsTeaching_Teach2) | 
    mm(Teacher_1, Teacher_2))

Here TeacherYearsTeaching_Avged is the weighted average of years of teaching across the teachers a student had. Right now, I’m assuming equal weights for each teacher, and the data are structured such that if a student only had one teacher, the teacher identifiers in columns Teacher_1 and Teacher_2 are identical. At the group level, my output includes a random intercept for teacher, the teacher-level variation in the effect of years teaching (the random slope on the mmc() term), and the correlation between this random slope and the intercept.
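The data layout described above can be sketched in base R. This is only an illustration: the data frame name `dat` and all values are made up, and only the column names come from the formulas in this post. With equal weights, the averaged covariate is the mean of the two teacher columns, so a student with a single (repeated) teacher simply gets that teacher's value.

```r
# Toy data: hypothetical values; column names follow the post.
dat <- data.frame(
  StudentOutcome = c(0.3, -1.2, 0.8, 0.1),
  Teacher_1 = c("A", "A", "B", "C"),
  # Students 2 and 3 had one teacher, so the same ID appears in both columns.
  Teacher_2 = c("B", "A", "B", "A"),
  TeacherYearsTeaching_Teach1 = c(4, 4, 10, 7),
  TeacherYearsTeaching_Teach2 = c(10, 4, 10, 4)
)

# Equal weights (an assumption stated in the post): each teacher counts 1/2.
w <- c(0.5, 0.5)
dat$TeacherYearsTeaching_Avged <-
  w[1] * dat$TeacherYearsTeaching_Teach1 +
  w[2] * dat$TeacherYearsTeaching_Teach2

print(dat[, c("Teacher_1", "Teacher_2", "TeacherYearsTeaching_Avged")])
```

If the weights should differ across teachers, the same weights would presumably need to be used both for this average and for the multi-membership term itself; as far as I understand the brms documentation, `mm()` accepts a `weights` argument for that purpose, but check the help page before relying on it.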

Now, say I want to add in one more teacher level effect: say a measure of teacher warmth. I have not been able to find any example syntax for that. I think this is what I would do:

StudentOutcome ~ TeacherYearsTeaching_Avged + TeacherWarmth_Avged +
  (1 + mmc(TeacherYearsTeaching_Teach1, TeacherYearsTeaching_Teach2) +
      mmc(TeacherWarmth_Teach1, TeacherWarmth_Teach2)
    | mm(Teacher_1, Teacher_2))

I’m just not 100% confident that this is right. One side-effect of this model is that it adds as a parameter the correlation between the random slopes. I’m not terribly interested in that parameter as a matter of inference, though I imagine failing to account for it would be the same as assuming there is no correlation. Also, my model will eventually include more like 6 teacher level covariates. So, if I specify a random slope for each of those, I will end up with a bunch of new parameters indicating the extent to which these random slopes covary.
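To make the growth in correlation parameters concrete, here is a small base-R sketch. The helper `n_cor` is hypothetical and only counts pairwise correlations among the group-level terms (one intercept plus k random slopes). The second part writes the two-covariate formula with `||` instead of `|`, which in brms formula syntax requests uncorrelated group-level effects; I have not verified how this behaves in combination with `mm()`, so treat it as something to test rather than a confirmed recipe.

```r
# Hypothetical helper: number of correlation parameters among
# 1 random intercept + k_slopes random slopes.
n_cor <- function(k_slopes) {
  n_terms <- 1 + k_slopes
  choose(n_terms, 2)  # one correlation per pair of terms
}

n_cor(1)  # intercept + 1 slope
n_cor(2)  # intercept + 2 slopes
n_cor(6)  # intercept + 6 slopes

# Same model as above, but with `||` to drop the correlations.
# Building the formula needs only base R; fitting it would need brms.
f_uncor <- StudentOutcome ~ TeacherYearsTeaching_Avged + TeacherWarmth_Avged +
  (1 + mmc(TeacherYearsTeaching_Teach1, TeacherYearsTeaching_Teach2) +
     mmc(TeacherWarmth_Teach1, TeacherWarmth_Teach2) ||
     mm(Teacher_1, Teacher_2))
```

Dropping the correlations is itself a modelling assumption (they are fixed at zero), so it trades fewer parameters for a stronger prior statement about the teacher effects.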

Can someone walk me through this and indicate whether my specification of a model with multiple teacher level covariates is correct?

ADMIN EDIT: Added code formatting (```) for the formulas


Sorry for not responding to your question earlier; it is relevant!

Did you manage to resolve the issue in the meantime? To me the model seems roughly OK, but I’ve never built those models myself. You can always validate whether the model works for your data via prior and posterior predictive checks (e.g. as discussed in the workflow paper).

Best of luck with the model!

As far as I know, yes. I removed the random slope component so that I’m just estimating the random effects of nesting. The equation I fit looked like this:

StudentOutcome ~ TeacherYearsTeaching_Avged + TeacherWarmth_Avged +
          (1 | mm(Teacher_1, Teacher_2))

As a sanity check, I restricted my sample to just kids with a single teacher and ran the corresponding simple multilevel models. Results are roughly comparable, changing only in ways that would reflect a more restricted sample (more uncertainty, etc.). I’ll follow up with the workflow paper, though. Sounds useful. Thanks!
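For anyone wanting to repeat that sanity check, a minimal sketch of the subsetting step in base R follows. The toy data frame is made up; the only thing taken from the thread is the convention that single-teacher students have the same ID in Teacher_1 and Teacher_2. The comparison formula `f_single` is my guess at what "the corresponding simple multilevel model" looks like, not the exact model that was run.

```r
# Toy data: hypothetical values; single-teacher students repeat the teacher ID.
dat <- data.frame(
  StudentOutcome = c(0.3, -1.2, 0.8, 0.1),
  Teacher_1 = c("A", "A", "B", "C"),
  Teacher_2 = c("B", "A", "B", "A"),
  stringsAsFactors = FALSE
)

# Keep only students whose two teacher columns name the same teacher.
single <- subset(dat, Teacher_1 == Teacher_2)
print(single)

# In this subset the multi-membership term reduces to ordinary nesting,
# so a plain random-intercept formula is the natural comparison (assumed).
f_single <- StudentOutcome ~ TeacherYearsTeaching_Avged + TeacherWarmth_Avged +
  (1 | Teacher_1)
```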
