Multilevel regression for repeated measurements when some variables are measured only once

I hope someone can help me to do this correctly:

To start with the easier part, let’s say I have 2 variables:

y is my outcome and has been measured at 3 time points for each individual.
x is my predictor measured at the same three time points for each individual.

So, we have something like this:

id time y x
1 1 -0.0052001 -1.2837541
1 2 -0.7777503 1.2871669
1 3 -1.1164423 1.5863106
2 1 -0.0052001 0.4322705
2 2 -1.1802225 0.8316429
2 3 -0.2734144 -0.7329075
3 1 1.5340313 0.7874498
3 2 -1.1802225 -1.7150921
3 3 0.1480995 0.4722336

With data like this, I should be able to fit a multilevel model in which the relationship between y and x can vary between individuals and between time points. Thus:

y ~ x + (1 + x|id) + (1 + x|time)

But what if I now have another variable z that has been measured only once per individual? If I want to include this variable in the above regression model, I also have to put it into the dataframe shown above. So we get this:

id time y x z
1 1 -0.0052001 -1.2837541 1.7717719
1 2 -0.7777503 1.2871669 1.7717719
1 3 -1.1164423 1.5863106 1.7717719
2 1 -0.0052001 0.4322705 -1.4070905
2 2 -1.1802225 0.8316429 -1.4070905
2 3 -0.2734144 -0.7329075 -1.4070905
3 1 1.5340313 0.7874498 0.3481672
3 2 -1.1802225 -1.7150921 0.3481672
3 3 0.1480995 0.4722336 0.3481672
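
To make the structure explicit: the repeated z column is simply what you get when a one-row-per-id table is joined onto the long table. Here is a minimal sketch in plain Python using the values above; the names `long_rows` and `z_by_id` are made up for this illustration.

```python
# Long-format measurements: one tuple (id, time, y, x) per row.
long_rows = [
    (1, 1, -0.0052001, -1.2837541),
    (1, 2, -0.7777503, 1.2871669),
    (1, 3, -1.1164423, 1.5863106),
    (2, 1, -0.0052001, 0.4322705),
    (2, 2, -1.1802225, 0.8316429),
    (2, 3, -0.2734144, -0.7329075),
    (3, 1, 1.5340313, 0.7874498),
    (3, 2, -1.1802225, -1.7150921),
    (3, 3, 0.1480995, 0.4722336),
]

# Person-level table: z was measured once per id.
z_by_id = {1: 1.7717719, 2: -1.4070905, 3: 0.3481672}

# Join z onto every row of the matching individual.
merged = [
    {"id": i, "time": t, "y": y, "x": x, "z": z_by_id[i]}
    for (i, t, y, x) in long_rows
]

n_rows = len(merged)                                  # rows in the long table
n_distinct_z = len({(r["id"], r["z"]) for r in merged})  # actual z measurements

print(n_rows, n_distinct_z)
```

The long table has one row per (id, time) pair, but the number of distinct (id, z) pairs stays equal to the number of individuals.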

I could now fit this model:

y ~ x + z + (1 + x|id) + (1 + x|time)

But something seems off here: the dataframe now looks as if z had been measured 3 times per individual, with the same value each time. That is 3 times as many z values as were actually collected. So I suspect the regression model will estimate a much smaller posterior SD for the coefficient of z, simply because it assumes 3x as much data for this variable.

Is there any way to “tell” the model that this value has only been measured once? How exactly can I solve this problem?

Thank you!

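No need to worry here and no need to do anything different. The seemingly repeated values of z won't bias or otherwise distort inference, because each repeated z value is paired with a distinct, non-repeated observation of y, and the by-id terms in your model already account for rows from the same individual belonging together.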
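
One way to convince yourself is a small simulation. The sketch below is an illustration under simplifying assumptions, not a fit of the model in the question: it assumes a random-intercept-only data-generating process y = beta_z * z + u_id + e, made-up parameter values, a balanced design with 3 time points, and it uses only the Python standard library. It repeatedly estimates the slope of z from the long table (with z repeated 3 times per id) and compares the spread of those estimates with two formulas: a naive standard error that treats every row as independent, and one that treats the individual as the unit of information, which is the kind of uncertainty a per-id random intercept is meant to capture.

```python
import math
import random
import statistics


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx


def simulate(n_ids=30, n_times=3, beta_z=0.5, sd_u=1.0, sd_e=1.0,
             n_sims=2000, seed=1):
    # All parameter values here are arbitrary choices for illustration.
    rng = random.Random(seed)

    # z is measured once per individual, then repeated in the long table.
    z = [rng.gauss(0, 1) for _ in range(n_ids)]
    z_long = [zi for zi in z for _t in range(n_times)]
    z_mean = statistics.fmean(z)
    sxx_ids = sum((zi - z_mean) ** 2 for zi in z)  # one term per individual

    estimates = []
    for _sim in range(n_sims):
        y_long = []
        for zi in z:
            u = rng.gauss(0, sd_u)  # person-level intercept, shared over time
            for _t in range(n_times):
                y_long.append(beta_z * zi + u + rng.gauss(0, sd_e))
        estimates.append(ols_slope(z_long, y_long))

    return {
        # Actual spread of the slope estimate across simulated datasets.
        "empirical_sd": statistics.stdev(estimates),
        # Treats all n_ids * n_times rows as independent observations.
        "naive_se": math.sqrt((sd_u**2 + sd_e**2) / (n_times * sxx_ids)),
        # Treats the individual as the unit: variance of a person mean of y.
        "cluster_se": math.sqrt((sd_u**2 + sd_e**2 / n_times) / sxx_ids),
    }


result = simulate()
print(result)
```

Under these assumptions the empirical spread should sit close to `cluster_se` and clearly above `naive_se`. In other words, the information about the coefficient of z scales with the number of individuals, and the extra precision you were worried about only appears if the rows are wrongly treated as independent.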

Thank you @mike-lawrence for reassuring me!
