My question is how this would translate into brms, and whether it is currently possible. I don't have a comprehensive enough statistical background to understand the formulas in detail. What I find confusing is that the authors state at p. 208 that “The model as defined in Equations 1–3 forms level one of the multilevel model.”, which suggests to me that level 1 includes all observations. However, some sentences later they say that “…whereas the model defined at level one is identical to an n=1 autoregressive time series model”. So is level 1 defined as an n=1 model, or does it include all observations, or am I confusing something here?
Let’s say we have 3 time points, with x and y measured at each time point, and covariates that are constant across all time points. From what I understood, the model would be written in brms like this:
Edited Code Example, this addresses Paul’s first comment
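For readers without the original snippet, here is a minimal sketch of what a wide-format specification along these lines might look like. It assumes a data frame `wide` with one row per subject and hypothetical columns `x1` to `x3`, `y1` to `y3`, and a time-invariant covariate `c1`; these names are illustrative and this is not necessarily the code that was originally posted.

```r
library(brms)

# Hypothetical wide data: one row per subject, columns x1..x3, y1..y3, c1.
# Each wave-2 and wave-3 variable is regressed on the previous wave of
# both x and y (autoregressive and cross-lagged paths) plus the covariate.
f_x2 <- bf(x2 ~ x1 + y1 + c1)
f_y2 <- bf(y2 ~ y1 + x1 + c1)
f_x3 <- bf(x3 ~ x2 + y2 + c1)
f_y3 <- bf(y3 ~ y2 + x2 + c1)

# Combine into one multivariate formula and request residual correlations
# among all four responses.
clpm_wide <- f_x2 + f_y2 + f_x3 + f_y3 + set_rescor(TRUE)

# Fitting is left commented out because `wide` is not defined here:
# fit <- brm(clpm_wide, data = wide)
```

Note that `set_rescor(TRUE)` requests residual correlations among all responses at once, so this sketch does not let you pick individual pairs.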
True, this is what I always do in SEM/lavaan; no idea why I switched it in my notation above. Would you say that the above model represents a cross-lagged multilevel panel model? I’m not sure in which way the paper’s authors consider the n=1 autoregressive model… But maybe that’s already covered by using subject as a random intercept (and more complex nested structures would then be written as nested random intercepts…).
I do think it represents a cross-lagged panel model. From my understanding, the autoregressive part is just that y3 is regressed on y2 which is in turn regressed on y1 (same for x). But I haven’t read the paper thoroughly enough to be sure they mean exactly that.
I’m just running one of these models myself, trying to replicate lavaan findings in brms.
Typically in a CLPM you model the covariances among the outcomes and among the predictors. Setting set_rescor(TRUE) addresses the former, but do we need to do anything to capture the latter?
CLPMs assume equal spacing between waves, but that isn’t always the case and I suspect that it isn’t necessary in a multilevel setting. Is it possible to extend the syntax to account for unequal spacings between waves? Maybe by including time as a covariate?
If I’m not mistaken, this should take care of the correlations for the initial predictors. The rest of the predictors are outcomes too so should be accounted for.
I guess we could indeed “weight” the autoregressive predictors according to their distance, or let the model find out itself (using an interaction or something similar) how the autoregressive effects change as a function of the distance between time points.
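One way the interaction idea could be sketched, assuming long-format data with one row per subject per measurement occasion. The column names (`subject`, `time`, `x`, `y`) and the derived `gap` variable are hypothetical, and whether a linear interaction is an adequate description of how the effects change with spacing is an open modelling question rather than an established recipe.

```r
# Hypothetical long data with unequally spaced measurement times
long <- data.frame(
  subject = rep(1:2, each = 3),
  time    = c(0, 1, 3, 0, 2, 3),
  x       = c(1.0, 1.5, 2.5, 0.5, 1.0, 1.2),
  y       = c(2.0, 2.2, 3.0, 1.0, 1.8, 2.1)
)

# Shift a vector down by one position; the first entry has no predecessor
lag1 <- function(v) c(NA, v[-length(v)])

# Previous-wave values and the time elapsed since the previous wave,
# both computed separately within each subject
long$y_lag <- ave(long$y, long$subject, FUN = lag1)
long$x_lag <- ave(long$x, long$subject, FUN = lag1)
long$gap   <- ave(long$time, long$subject, FUN = function(t) c(NA, diff(t)))

# Let the autoregressive and cross-lagged effects vary with the gap
f_gap <- y ~ y_lag * gap + x_lag * gap + (1 | subject)

# Fitting is left commented out (requires brms and a Stan toolchain):
# fit <- brms::brm(f_gap, data = long)
```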
So normally, I think, one would fit a model like this using the “long” data in which one row represents one person at one time (so there are subjects * times rows in total).
You could generate the lagged variables before passing the data to brms, like this:
Is that right? Note that I add a random slope for the autoregressive parameter because that’s what that research group normally does, and it comes closest to having a time series model at the subject level while still applying shrinkage to those subject-level estimates.
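A minimal base-R sketch of that step, since the original snippet is not shown here. The data frame `long`, its columns, and the covariate `c1` are hypothetical; the formulas at the end are one plausible way to write the model with a random slope on the autoregressive term, not necessarily the exact code from the post.

```r
# Hypothetical long data: one row per subject per time point
long <- data.frame(
  subject = rep(1:3, each = 3),
  time    = rep(1:3, times = 3),
  x       = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
  y       = c(9, 8, 7, 6, 5, 4, 3, 2, 1),
  c1      = rep(c(0.5, -1, 2), each = 3)  # constant within subject
)

# Sort so that lagging within subject follows time order
long <- long[order(long$subject, long$time), ]

# Shift a vector down by one position; the first entry has no predecessor
lag1 <- function(v) c(NA, v[-length(v)])

# Apply the shift separately within each subject
long$x_lag <- ave(long$x, long$subject, FUN = lag1)
long$y_lag <- ave(long$y, long$subject, FUN = lag1)

# Autoregressive and cross-lagged effects, a time-invariant covariate,
# and a subject-level random slope on the autoregressive term
f_y <- y ~ y_lag + x_lag + c1 + (1 + y_lag | subject)
f_x <- x ~ x_lag + y_lag + c1 + (1 + x_lag | subject)

# Fitting is left commented out (requires brms and a Stan toolchain):
# fit <- brms::brm(
#   brms::bf(f_y) + brms::bf(f_x) + brms::set_rescor(TRUE),
#   data = long
# )
```

The first wave of each subject has no lagged value, so those rows contain `NA` in `x_lag` and `y_lag` and would be excluded from the fit.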
Yes, this looks reasonable to me. What to do with the covariate exactly depends on the research question I guess, but for a main effect of a covariate that does not vary across time within the same person, your approach looks good.
The problem here is that you then model residual correlations of y_lag by y, etc. which you might not want. To my knowledge there is no way to choose which residual correlations you want to model in brms.
I drew this in a state-of-the-art SEM plotter, MS Paint. The red arrow is the bit I think we would struggle with, but everything else is quite easily achievable.
Hopefully this is of some interest to you, gotta start paying back the help I get on here!
@jackbailey the problem with this approach when using normal response distributions is that we have two residual SDs battling for the same information, first the distributional parameter sigma and second the SD of obs. This will likely yield a badly sampling model.
Adding to what Paul said: I was suspicious that you could get this to converge, and alas, I tried my best but couldn’t get the model to sample decently.