The goal of mvgam is to estimate parameters of Dynamic Generalized Additive Models (DGAMs) for time series with dynamic trend components. It uses a State-Space framework with a formula syntax based on that of the package mgcv to provide a familiar GAM modelling interface. There is also built-in support for the increasingly powerful marginaleffects package to make interpretation easy.
The package is quite broad, allowing a range of predictor effects in the regression components of both latent process and observation models. These include the usual s(), ti() and te() wrappers from mgcv but also the gp() wrapper from brms. There is also support for monotonic smooths using basis constructors from the splines2 package. But the main focus is on the inclusion of a range of dynamic processes, including random walks, autoregressive, vector autoregressive, continuous time autoregressive and piecewise growth processes. There are too many features to list here, so I’ve written an introductory blog post and developed a cheatsheet to help guide new users.
Please share around. I’m happy to accept bug reports on the package's GitHub repo. As always, thanks to all the helpful posters on this site for making this work possible.
Really nice package, thanks for publishing! I am still going through the outline, but this is incredibly useful to address tricky auto-regressive time series in ecology, as evidenced by the complicated Stan code that gets generated.
Does the shared latent states feature support multivariate time series? In other words, say I have data from a few species collected at many locations, and I would like to model it as the same multivariate data-generating process with some hierarchical structure. Is this possible?
Furthermore, do you support irregularly sampled data, i.e. non-uniform time periods?
Thanks for the nice message @ystad. Yes, you can set up multivariate processes when using shared states via the trend_map argument. But unfortunately it is not quite flexible enough yet to allow the same governing dynamics at multiple sites while also allowing different values of the latent states at each site. In other words, you can set up a model that forces all of species 1’s series to share one latent state, meaning that the latent state in site 1 at time t = 1 is exactly the same as the latent state in site 2 at t = 1. And your model can do the same for all of species 2’s latent states. You can also allow these processes to be multivariate (e.g. via a Vector Autoregressive model or perhaps a correlated Random Walk). But it would be nice if you could learn the governing dynamics (i.e. AR coefficients, covariance structures etc.) over all sites while allowing the actual latent state values to vary across sites. This is not yet possible, though it is something that I’d like to build into the package in future.
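To make the distinction concrete, here is a rough sketch in plain Python (not mvgam code) of what sharing latent states implies. The series names, the dictionary called `trend_map` and the `simulate_ar1` helper are all hypothetical and chosen only for illustration; in the package itself the mapping is supplied through the trend_map argument rather than a dictionary. Each series is mapped to a latent trend index, and every series mapped to the same index receives exactly the same state values.

```python
import random


def simulate_ar1(n_steps, ar1=0.7, sigma=0.5, seed=0):
    """Simulate one latent AR(1) state sequence (illustrative helper)."""
    rng = random.Random(seed)
    state = [rng.gauss(0.0, sigma)]
    for _ in range(n_steps - 1):
        state.append(ar1 * state[-1] + rng.gauss(0.0, sigma))
    return state


# Hypothetical mapping from observed series to latent trend index:
# both sites for species 1 share trend 1, both sites for species 2 share trend 2.
trend_map = {
    "sp1_site1": 1,
    "sp1_site2": 1,
    "sp2_site1": 2,
    "sp2_site2": 2,
}

n_steps = 20

# One latent sequence per unique trend index.
latent = {
    trend: simulate_ar1(n_steps, seed=trend)
    for trend in sorted(set(trend_map.values()))
}

# Every series simply points at the latent sequence it is mapped to.
series_states = {series: latent[trend] for series, trend in trend_map.items()}

# Species 1 has identical latent values at both sites for every time point.
print(series_states["sp1_site1"][0] == series_states["sp1_site2"][0])
```

The feature that is not yet available would correspond to each site having its own latent sequence while the `ar1` and `sigma` values are learned jointly across sites.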
To answer your other question: yes, the package can handle irregularly sampled data using a continuous time autoregressive process. I provide an example in this post: Autocorrelation for unevenly spaced time series - #8 by nicholasjclark. But there are no options to set these up as multivariate because they operate on time differences rather than actual time points.
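The idea of operating on time differences can be sketched as follows. This is a minimal plain-Python illustration of a first-order continuous time autoregressive process, not the package's implementation: the function name `simulate_car1`, the parameter values and the sampling times are assumptions, and the innovation standard deviation is kept fixed across gaps for simplicity. The key point is that the autoregressive coefficient is raised to the power of the gap between successive observations, so widely spaced observations are less strongly correlated.

```python
import random


def simulate_car1(times, ar1=0.8, sigma=0.3, seed=1):
    """Simulate a CAR(1)-style latent state at irregular sampling times.

    The dependence on the previous state is ar1 ** gap, where gap is the
    time elapsed since the previous observation (illustrative sketch only).
    """
    rng = random.Random(seed)
    states = [rng.gauss(0.0, sigma)]
    weights = []
    for prev_t, t in zip(times, times[1:]):
        gap = t - prev_t
        weight = ar1 ** gap  # longer gaps shrink the dependence towards zero
        weights.append(weight)
        states.append(weight * states[-1] + rng.gauss(0.0, sigma))
    return states, weights


# Hypothetical irregular sampling times (gaps of 1, 2, 1, 4 and 1).
times = [1, 2, 4, 5, 9, 10]
states, weights = simulate_car1(times)

for (prev_t, t), w in zip(zip(times, times[1:]), weights):
    print(f"gap {t - prev_t}: weight on previous state = {w:.3f}")
```

With evenly spaced observations every gap equals 1 and this reduces to an ordinary AR(1) process.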