# How to fit a "lagged regression"

Dear Stan community,
I am facing a challenging model and I am asking for your guidance on how to set it up.

I measured the value of a continuous response variable y(t) for several months, sampling it every 20 min. At each time t I also measured two continuous forcing variables x_1(t) and x_2(t).

When looking at the values, I noticed that the response variable “reacts” to the forcings with a delay, which I would like to infer from the measurements.
I have a couple of questions:

• is there an example of a “lagged regression” that I can use as a starting point for coding a model?
• I don’t know if it’s a good idea, but I could treat the delays T_1 and T_2 as quantities external to the model (i.e. explicitly stated in the data block) and implement a sort of grid search, using LOO-CV to compare the resulting models. This is a costly task (and I would have to think carefully about indexing the vectors involved), so I am seeking your help to understand whether it’s worth the effort.
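To make that grid idea concrete, here is a minimal sketch on simulated data (all names, lags, and numbers are hypothetical, not from my actual data set): for each candidate delay pair, fit an ordinary least-squares regression and compare the fits. In the real workflow each grid point would be a full Stan fit compared with LOO-CV rather than raw residual sum of squares.

```python
import random

def solve3(A, b):
    """Gaussian elimination with partial pivoting for a 3x3 system (toy)."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(3):
        p = max(range(i, 3), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, 3):
            f = M[r][i] / M[i][i]
            for c in range(i, 4):
                M[r][c] -= f * M[i][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (M[i][3] - sum(M[i][c] * x[c] for c in range(i + 1, 3))) / M[i][i]
    return x

def rss_at_lags(y, x1, x2, T1, T2, start):
    """OLS fit of y ~ alpha + b1*x1(t-T1) + b2*x2(t-T2); returns the RSS."""
    rows = [(1.0, x1[t - T1], x2[t - T2]) for t in range(start, start + len(y))]
    # normal equations X'X beta = X'y
    A = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    beta = solve3(A, b)
    return sum((yi - sum(bk * rk for bk, rk in zip(beta, r))) ** 2
               for yi, r in zip(y, rows))

random.seed(1)
T, start = 400, 25                      # start leaves room for the longest lag
x1 = [random.gauss(0, 1) for _ in range(T)]
x2 = [random.gauss(0, 1) for _ in range(T)]
# simulated truth: lags of 12 and 21 sampling steps
y = [1.0 + 2.0 * x1[t - 12] - 1.5 * x2[t - 21] + random.gauss(0, 0.2)
     for t in range(start, T)]

grid = [(a, b) for a in range(9, 16) for b in range(18, 25)]  # candidate lags
best = min(grid, key=lambda p: rss_at_lags(y, x1, x2, p[0], p[1], start))
```

On this simulated series the grid search recovers the true lag pair, which is the kind of check I would run before committing to the expensive version.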

Thanks in advance for any assistance and useful information!

Do you believe that it is reacting at some specific particular unknown lag, or that it is reacting to the forcing variables over some window? If the latter, what are you willing to assume about the size and shape of the window?

Thanks for the quick response. I would hope for two exact lags; most likely the lag of y with respect to x_1 is between 3 and 5 hours, and the lag with respect to x_2 between 6 and 8 hours. Moreover, x_2 takes only non-negative values, many of which are 0.

If the lags are exact, then I think you’ll need to marginalize over the possible values of the lag, either explicitly inside a model or perhaps using your LOO approach. Fortunately, with 2-hour windows of plausible values, you’ll have only about 6 possible values for each variable, so 36 terms in the marginalization, which is highly tractable.
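A numerical sketch of that marginalization (simulated data; all names and lag values are illustrative): each term is a log prior weight plus the conditional log-likelihood at one candidate lag pair, and the marginal log-likelihood is their log-sum-exp, exactly what `target +=` would receive in Stan.

```python
import math
import random

def normal_lpdf(y, mu, sigma):
    return (-0.5 * math.log(2 * math.pi) - math.log(sigma)
            - 0.5 * ((y - mu) / sigma) ** 2)

def log_sum_exp(v):
    m = max(v)
    return m + math.log(sum(math.exp(t - m) for t in v))

random.seed(0)
T, sigma = 300, 0.3
alpha, b1, b2 = 0.5, 1.0, -0.7
x1 = [random.gauss(0, 1) for _ in range(T)]
x2 = [max(0.0, random.gauss(0, 1)) for _ in range(T)]  # non-negative forcing
true1, true2 = 12, 21                  # true lags in 20-min steps
lags1, lags2 = range(9, 16), range(18, 25)
start = max(max(lags1), max(lags2))    # drop rows without full history
y = [alpha + b1 * x1[t - true1] + b2 * x2[t - true2] + random.gauss(0, sigma)
     for t in range(start, T)]

pairs = [(l1, l2) for l1 in lags1 for l2 in lags2]
log_prior = -math.log(len(pairs))      # uniform simplex weights
terms = []
for l1, l2 in pairs:
    ll = sum(normal_lpdf(yi,
                         alpha + b1 * x1[start + i - l1]
                         + b2 * x2[start + i - l2],
                         sigma)
             for i, yi in enumerate(y))
    terms.append(log_prior + ll)

marginal = log_sum_exp(terms)                   # the marginal log-likelihood
post = [math.exp(t - marginal) for t in terms]  # posterior over lag pairs
best = pairs[max(range(len(pairs)), key=lambda k: post[k])]
```

Exponentiating each term minus the marginal recovers the posterior over lag pairs, which is how you would report which lags the data favor after fitting.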

Note, however, that if instead you were willing to assume that the response depends on a weighted average of the variable over some window, and the weighting function has tails that smoothly decay to zero, then you might be able to fit this without marginalizing. For example, if the weighting function is a normal PDF whose scale you either know a priori or estimate from data, and whose mean you estimate from data, then you’d have continuous derivatives without marginalizing. However, I think you would need the scale to be on the order of 20 minutes or greater to avoid problems with highly variable curvature and possible multimodality.
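A minimal sketch of that kernel-weighted predictor (function name and defaults are hypothetical): the forcing series is replaced by a moving average with normal-PDF weights centered `mu` steps in the past. Since the output varies smoothly with `mu` and `sigma`, both can be continuous parameters in the model, with no discrete lag to marginalize over.

```python
import math

def kernel_lagged(x, mu, sigma, halfwidth=None):
    """Weighted average of past values of x, with normal-PDF weights
    centered mu steps back and renormalized over the available window."""
    if halfwidth is None:
        halfwidth = int(math.ceil(mu + 4 * sigma))  # cover the kernel's mass
    out = []
    for t in range(len(x)):
        ws, xs = 0.0, 0.0
        for k in range(min(halfwidth, t) + 1):      # lags 0..halfwidth
            w = math.exp(-0.5 * ((k - mu) / sigma) ** 2)
            ws += w
            xs += w * x[t - k]
        out.append(xs / ws)
    return out

# sanity check: a constant series is unchanged by the normalized weighting
z = kernel_lagged([2.0] * 60, mu=12.0, sigma=3.0)
```

In Stan the same construction would live in the model block (or a function), with the kernel weights rebuilt each iteration from the current `mu` and `sigma`.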

If the forcing variables are substantially autocorrelated, this might be effectively the same thing as identifying the single best lag.

Given that the true best lag might not fall at a precise integer multiple of 20 minutes in the past, I would think that you’re already at least somewhat committed to using snapshots of the forcing variables to represent their values in a time window on the order of 20 minutes long.


Dear @jsocolar, it took me a while to understand your advice about marginalizing in depth. I will try to implement different approaches, starting in a principled way. I will let you (and the whole community) know how it goes, with more detailed questions and snippets of code. In the meantime… happy modeling to you!


Sorry @jsocolar, can you say what you mean by marginalize here?


The OP wants a regression of the form y(t) = \alpha + \beta_1 x_1(t - t_1) + \beta_2 x_2(t - t_2), where the lags t_1 and t_2 are discrete unknowns. Thus the need to marginalize over the possible values of the pair (t_1, t_2). The OP has sufficiently strong prior information (or “eyeball the data” information) to restrict both t_1 and t_2 to about 6 possible values each, yielding 36 terms in the marginalization.

The marginalization will have the form

```
target += log_sum_exp(
  log_p[1] + response_lpdf(y | alpha + beta[1] * x1[1] + beta[2] * x2[1]),
  ...
  log_p[N] + response_lpdf(y | alpha + beta[1] * x1[N] + beta[2] * x2[N])
);
```

where log_p is the logarithm of a simplex parameter and each index 1, …, N enumerates the candidate (t_1, t_2) lag pairs (here N = 36).

edit: to write it in this compact form, x1 and x2 are arrays of pre-computed vectors giving the predictors at specific candidate lags.
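To illustrate that pre-computation (in Python for concreteness; the lag grids and toy series are hypothetical): build one lagged predictor vector per candidate lag pair, aligned so that element i of every vector corresponds to the same observation of y.

```python
lags1 = [9, 10, 11, 12, 13, 14]    # candidate lags for x1, in 20-min steps
lags2 = [18, 19, 20, 21, 22, 23]   # candidate lags for x2

x1_raw = [float(t) for t in range(100)]   # toy forcing series
x2_raw = [0.5 * t for t in range(100)]
start = max(max(lags1), max(lags2))       # drop rows without full lag history

# one (lag1, lag2) pair per index k = 1..N, so N = 36 here
pairs = [(l1, l2) for l1 in lags1 for l2 in lags2]
x1 = [[x1_raw[t - l1] for t in range(start, len(x1_raw))] for l1, _ in pairs]
x2 = [[x2_raw[t - l2] for t in range(start, len(x2_raw))] for _, l2 in pairs]
```

These arrays of vectors would then be passed in Stan's data block, with y truncated to the same `start` offset so everything stays aligned.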
