Distributed lag non-linear model

Does anyone here have experience with distributed lag non-linear models in Stan or brms (successful or not)?

Is it feasible in Stan? Existing papers use INLA or similar. And the literature we manage to find on this type of model is of a technical level that is beyond me. Tips for more introductory resources are welcome!

EDIT: brief description of DLNMs, quoting the cited abstract above:

"[…], a modelling framework that can simultaneously represent non-linear exposure-response dependencies and delayed effects. This methodology is based on the definition of a ‘cross-basis’, a bi-dimensional space of functions that describes simultaneously the shape of the relationship along both the space of the predictor and the lag dimension of its occurrence. "

Most of the models that can be fit in INLA can also be fit in Stan. I would say if you have models that can be fit in INLA, you should use INLA. Otherwise, Stan’s a good option for more general models.

In your particular case of distributed lag models, it’s just going to be a matter of writing down the density from the paper and then translating to Stan. I don’t see a clean presentation of the full model anywhere in that paper. Like most stats papers, it starts simple with formula (1) then continues to tinker with the model without ever recapituting the final model. Maybe they have code somewhere. The R code in the paper is just a wrapper to call something doing the real work.

1 Like

We have a toy version that I share with you. Might get you most of the way there. This model was built iteratively from a non-linear one year lag model we had. I hadn’t heard of the DLNM models (so going to use those for some work we are doing) so take that to mean I am not as sure about the lag and the cross-basis. However the non-linear part match results from a paper our group previously publish and the lag output makes sense in terms of what we know about the plant species we work with.

After we messed around in the model we feed the above paper, the maths, our data, the R packages cited in the paper, and some fine tuning of prompt into chatgpt as a double check.

I’ve attached the R script (FYI minimal best Stan practices in here) and the data we used. I’ve heavily commented the script so please read those comments. There are a number of places where the R code and model can be made cleaner, this was mostly a can we get this to work.

1.0 aw-model dlnl veg.R (9.6 KB)
annual_cottonwood_cover_w_gw.csv (13.4 KB)

2 Likes

These can be set up as penalized smoothers using {mgcv}'s tensor product capabilities, which are readily supported by {brms} using the t2() wrapper. I show in the following blog post how they can also be done in {mvgam} using the {te} wrapper: Distributed lags (and hierarchical distributed lags) using mgcv and mvgam | GAMbler. Basically the idea is to fit a smooth function of some covariate for the current time point, but to also fit smooth functions for lags of that covariate. The key is to force these functions to change smoothly over the lags to ensure efficient regularization. I also have a preprint which shows how you can set these up hierarchically, where we allow latent abundance for each of 9 focal species to depend on distributed lags of minimum temperature, but the species-level functions are partially pooled toward a shared “community level” function: Beyond single-species models: leveraging multispecies forecasts to navigate the dynamics of ecological predictability

3 Likes

Really appreciate this. One question re: brms and the t2 wrapper – how would you recommend passing the lag and weights matrices in this tutorial to brm? I initially thought the data2 argument could be used with a list, but no luck so far. Probably not designed for this purpose, but hoping you might have some thoughts.

No unfortunately I’ve never tried to supply list data to {brms} models. Can you fit the types of models you are interested in using {mvgam}?

I’m definitely intrigued. In this particular case, however, I have ordinal data. Not sure if adding more distributional families is in the cards, but happy to add to your “Future enhancement wishlist” burden ;)

Right, thanks for clarifying. Yes ordinal is on the cards for the next round of development in {mvgam}, but this release may be several months away unfortunately. However you could formulate the model using the poisson() family to generate the Stan code and necesarry data objects. You could then modify the Stan code to use the ordered_logistic distribution from {brms}, making sure to add parameter statements and priors for the cutpoints. This would allow you to fit the model, but of course none of the downstream prediction and summary functions would work properly yet :(

1 Like