Space-time models in Stan / BRMS

Dear Stan,

Many of the problems that I come across in my line of work (marine biology) involve problems that are essentially space-time questions i.e. modelling a (complex non-gaussian) response (presence / absence of a species) that varies as a function of space (long-lat) and time (with and without additional explanatory variables, such as ocean temperature).

I have recently discovered BRMS and it’s ability to readily produce and model CAR correlation structures. However, I am wondering if this can be extended to a spacetime model? Much like can be done with INLA’s SPDEs, I would be aiming to have multiple realisations of the CAR field, one for each time step, with correlations (e.g. AR1) between temporally adjacent fields.

How realistic is such a goal in BRMS at the moment? And in the future?

Best wishes,

Mark

  • Operating System: Linux Mint 18.2
  • brms Version: NA

No idea about brms but I’m working with space-time models in Stan directly and would like to hear more about the models you’re interested in.

Hi,

Lets see if I can describe what I’m looking for! I have a set of fisheries data derived from recreational anglers as a function of both space and time - because the species of interest migrates in and out of the area of interest, the catch rates vary in both space and time in a complex and correlated manner. To complicate the manner futher, the data is discretized count data and is at least negative binomial, probably zero-inflated and over-dispersed on top of that as well. I would like to develop a spatial-temporal-smoother so that I can visualise / analyse the underlying migration process.

I am currently working in INLA where I am representing space in terms of a “besag” CAR model between polygons that I have defined - I have also played with an SPDE representation of space, but the CAR approach seems to be much easier to setup. I then have temporal correlations between these spatial fields using an AR1 or RW1 process.

I’m not a particularly strong user of INLA, and it can be tough-going at times for those without a formal statistical education. I’m attracted by the user-friendliness of BRMS, particularly with the idea of “Bayesian Workflow”, and was hopeing that there might be a user-friendly solution in there for this type of problem…

Cheers,

Mark

I don’t think we can manage that with brms in particular because of the combination of correlation structures in a specific way within one model. You may want to use brms as the basis to set up your Stan code, though, and then develop it further in your direction.

hi everyone
I’m trying create a spatio-temporal model with TMB (and your connection with Stan (tmbstan)) but I now read what @sakrejda has been working with this types of models in Stan directly!
¿Do you have some example to see this? Beacause I think that we can incorporate into of Stan the spatial dependence as a Gaussian Process (Random Field) but I think that is very expensive in related with the computational cost of estimation

Regards!

1 Like

My focus has been on interpretable splines because they allow you to make trade-offs between computational requirements and scale of inference explicitly. I used the mesher from INLA to lay out knots when I was doing that work. The biggest barrier ended up being the posterior geometry problems in basic AR-family models so from my point of view that needs to be solved to get really solid inference from spatio-temporal models

1 Like

Thanks for your reply @sakrejda

Is very interesting what you says about get solid inference in the geometry AR() processes. I will read more about this because I think that incorporate the dependence spatial and temporal in the observations is a great challenge for create this types of models in Stan

I’ve worked with similar kinds of data and have used tensor product splines to estimate the space-time components. You can use the t2() function in the formula to set up these kinds of terms. For example a smooth anisotropic spatial smooth interacting with a time component would be set up as

t2(x, y, time, k = c(10,10,15))

where x and y are the spatial coordinates, time is something like the observation year, and k is the basis size for each marginal smooth. An isotropic version would be

t2(x, y, time, d = c(2,1), k = c(50,15))

where the d sets the x and y coordinates to be from a 2d thin-plate spline, which has a single maximum wiggliness (here set to say 50).

The t2() in brms uses mgcv to set up the smoothers so as long as it goes into a t2() you can use any of a wide range of smoother types in the brms model.

You mention polygons (as if the data are areal observations?); if so, then this kind of spatial data can be accommodated in the spline framework via a Markov random field using the bs = 'mrf' option in mgcv and hence brms. You need to pass t2() a factor indexing the region/polygon for each observation rather than x,y pairs, but you can do this via a neighbourhood object generated from a polygon shapefile object read into R. See ?mgcv:::mrf for the detail.

These are not the kinds of models you mentioned but they are somewhat similar.

4 Likes

Hi Gavin,

<Click!> I’m familiar with these types of models in mgcv, but hadn’t made the conceptual linkage over to BRMS – they are
a logical way to move forward. Thanks for the tip!

One point of clarification – is there any way to implement the equivalent of a te() tensor-product spline in BRMS, or
is it limited to the isotropic case?

Thanks.

Mark

1 Like

te() is not yet supported by brms but may be supported in future releases.

t2() is the equivalent of te() it just uses a different parameterisation that allows it to be used in nlme, brms, etc via the mixed model representation of the splines. There are some differences in behaviour between te() and t2() when you get into potentially estimating hierarchical GAMs with these types of smoother, but whether that’s an issue or not depends on what you want to get out of the model.

Colleagues and I have recently written a paper on HGAMs from an mgcv perspective and which has a section on te() vs t2() and links to further work, which may be of interest. One of the things on the ToDo list is to write up some blog posts showing brms versions of the models we fit in that paper.

3 Likes

As @ucfagls pointed out, t2(x,y,…) is a tensor product smoother suitable for both isotropic and anisotropic predictors.

You can of course use a multivariate Gaussian Process for spatial data in brms as gp(longitude,latitude …) for either isotropic or anisotropic predictors, which of course can be done also for isotropic predictors in mgcv directly as s(..., bs="gp").

And this is especially useful for spatially explicit data modelling.

Word of warming though, GP spatial is very very very slow using Stan and if you have large data sets (> 100k georeferenced fisheries sets for instance), it will exhaust your computer’s system heap and not run.

It will run in mgcv - eventually, but at least using substantial less memory allocation.

Stan/brms is incredible machinery but not for GPs with large data sets - at least not for the moment unless you use a cluster.

1 Like

@MilaniC is right in the exact GPs are super slow. An alternative could be approximate GPs, which are also available via the gp() function of brms if you use the “k” argument. In theory, they scale roughly like splines but in practice show a bit more convergence warnings. More on this soon once we have the paper about it finished.

2 Likes

Re the gp basis in mgcv; it works, but you do have to pay very careful attention to what it assumes. The GP only fits into the GAM framework used by mgcv if the basis functions themselves don’t depend on any parameters of the model. In the GP basis, the basis functions depend on the effective range parameter. mgcv uses a heuristic that is, or is based on, the largest distance between a pair of observations. That is often not what you want.

You can profile over this effective range parameter but that gets slow as you need to fit your model many times depending on how you tune the grid of values for the effective range parameter to profile over. Although, at least the profiling can be done over many cores if you have them.

3 Likes

@ucfagls that is much appreciated salient advice on the mgcv implementation of GPs

1 Like