SPDE models in brms

Spatial models can be implemented in brms thanks to the link with the spline constructors from mgcv, see for instance this thread. In a recent article, Miller and colleagues showed how basis-penalty smoothers can be interpreted as stochastic partial differential equations (the approach used in INLA) and vice versa. In practice this means that one can create a mesh of the study area and estimate the spatial effect using mgcv::gam. The authors provide R code defining new functions to construct and predict from the smooth.

I thought I’d give it a shot and see if the new smooth constructor defined by Miller and colleagues could be used directly in brms:

library(INLA)
library(brms)
# source the new smooth code,
# code available with the article
source("mgcv_spde_smooth.R") 

## load the data
ca20 <- geoR::ca20
# put this in a data frame
dat <- data.frame(x = ca20$coords[,1], 
                  y = ca20$coords[,2], 
                  calcium = ca20$data, 
                  elevation = ca20$covariate[,1], 
                  region = factor(ca20$covariate[,2]))
# create the mesh
mesh <- inla.mesh.2d(loc = dat[,c("x", "y")],
                     max.edge = c(50, 5000))

## fit the model with brms
mod <- brm(calcium ~ s(x, y, bs = "spde", k = mesh$n, xt = list(mesh = mesh)),
           data = dat)

This returns the following error:

Error in smooth2random.mgcv.smooth(sm, names(data), type = 2) : 
  Can not convert this smooth class to a random effect

So it seems that there is still a missing step: converting this new class of smooth into the random-effect form that brms can use. Would this be achievable? It is beyond my skills, but I’d be happy to give it a shot given a few pointers.
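
In case it is useful, the failing call can be reproduced outside of brms, which is roughly what happens under the hood (a small sketch using the objects defined above):

library(mgcv)
# build the spde smooth object directly (the constructor comes from mgcv_spde_smooth.R)
sm <- smoothCon(s(x, y, bs = "spde", k = mesh$n, xt = list(mesh = mesh)),
                data = dat)[[1]]
# brms re-parametrizes each smooth as a random effect via smooth2random();
# this is the call that fails for the spde smooth (same error as above)
smooth2random(sm, names(dat), type = 2)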

  • Operating System: Win10
  • brms Version: 2.13.0

I read that article a while ago; it’s very intuitive and easy to understand.
I don’t know how brms works internally, but I think you could use the spde smooth.construct method, fit a model in mgcv, and obtain the penalty matrices to incorporate them into Stan code. It’s just an idea.

This is probably not the most efficient process, but it could work for you if you want to use sparse matrices.

Ok, this took some time to sink in, @cavieresJJ, but I think I got somewhere thanks to your reply.

The steps I followed were:

  • create a smooth object based on the functions provided in the paper; right now this is most easily done by running an mgcv::gam model, but there should be a more efficient way to get it (a possible shortcut is sketched right after this list):
mod <- gam(chl ~ s(lon, lat, bs = "spde", k = mesh$n, xt = list(mesh = mesh)),
           data = aral,
           control =  gam.control(scalePenalty = FALSE),
           method = "REML")
  • get the basis (prediction) matrix for the smooth:
bs_mat <- PredictMat(mod$smooth[[1]], aral[,c("lon", "lat")])
  • use brms awesomeness to generate Stan code and data:
cat(make_stancode(chl ~ s(lon, lat), aral), file = "spde_stan.stan")
data_aral <- make_standata(chl ~ s(lon, lat), aral)
data_aral$Zs_1_1 <- bs_mat
data_aral$knots_1[1] <- ncol(bs_mat)
  • fit the model:
mod_stan <- stan("spde_stan.stan",
                 data = data_aral)
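
As a possible shortcut to the first step (an untested sketch; constraint handling may differ slightly from the smooth stored in a fitted gam), the smooth object could also be built directly with mgcv::smoothCon() instead of fitting a full model just to obtain it:

sm <- smoothCon(s(lon, lat, bs = "spde", k = mesh$n, xt = list(mesh = mesh)),
                data = aral)[[1]]
bs_mat <- PredictMat(sm, aral[, c("lon", "lat")])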

If we compare the posterior means for the different observations, we get very similar results between INLA and Stan:

and also between MGCV and Stan:


Hi @lionel68

Great! That was only an idea, and you found a good approximation!
Seeing the graphs, the results are very similar; what is the difference in computation time?

I know that INLA is faster than Stan for this type of problem (spatial modelling), but maybe (due to the numerical approximation in INLA) there could be a small difference in the estimated parameters between the methods, which would be interesting to analyze.

Cheers

The graphs of the INLA vs Stan and MGCV vs Stan predictions are extremely similar. Is it to be expected that INLA and MGCV would give near-identical results here while Stan would say something slightly different?

INLA was faster than Stan for the few examples that I tried. Comparing the estimated parameters of the spatial effect might be tricky because the smooths are implemented in brms / Stan as random effects (see Using random effects in GAMs with mgcv), but one can of course always compare the predictions from the linear predictor or the posterior draws.

The predictions are indeed pretty close but I might be jumping too fast here because the smooth terms in brms are first re-parametrized in a "random effect" parametrization using mgcv::smooth2random methods, and there are no such methods implemented for the spde spline class (yet). I started an issue on the brms github repo about this.

Ok, I dug a bit deeper into it and I am not sure that the smooths created to represent the SPDE can be re-parametrized as random-effect smooths.

From the Miller paper we have:

y \sim N(\mu, \sigma) \\ \mu = \alpha + A\beta \\ \beta \sim MVN(0, S^{-1}) \\ S = \tau (\kappa^4 C + 2\kappa^2 G_1 + G_2)

Where A is the projection matrix connecting the interpolation points on the mesh (the nodes of the triangles) to the observed locations, S is the precision matrix of the \beta terms, \tau and \kappa are the two parameters of the Matérn covariance function, and C, G_1 and G_2 are the finite element matrices computed by inla.mesh.fem.
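
Assembling these inputs in R could look something like this (a rough sketch using the mesh and dat objects from the first post; it is meant to match the data block of the Stan model below):

fem <- inla.mesh.fem(mesh, order = 2) # finite element matrices c0, g1, g2
A <- inla.spde.make.A(mesh = mesh, loc = as.matrix(dat[, c("x", "y")])) # projection matrix
n_knots <- mesh$n
G <- array(0, dim = c(3, n_knots, n_knots))
G[1, , ] <- as.matrix(fem$c0) # C
G[2, , ] <- as.matrix(fem$g1) # G_1
G[3, , ] <- as.matrix(fem$g2) # G_2
stan_data <- list(N = nrow(dat), n_knots = n_knots,
                  G = G, A = as.matrix(A), y = dat$calcium)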

I tried to plug this all into a Stan model:

data {
  int<lower=0> N; // number of observations
  int<lower=0> n_knots; // number of interpolation points on the mesh
  matrix[n_knots, n_knots] G[3]; // FEM matrices
  matrix[N, n_knots] A; // projection matrix
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
  real<lower=0> kappa;
  real<lower=0> tau;
  vector[n_knots] alpha; // for non-centered parametrization
}
transformed parameters {
  matrix[n_knots, n_knots] S; // precision matrix
  vector[n_knots] beta; // smooths coefficient
  S = tau * (G[1] * kappa ^ 4 + G[2] * 2 * kappa ^ 2 + G[3]);
  // implies beta ~ MVN(0, inverse(S)), i.e. a multivariate normal with precision matrix S
  beta = cholesky_decompose(inverse(S)) * alpha;
}
model {
  alpha ~ std_normal();
  mu ~ normal(0, 1);
  sigma ~ lognormal(0, 1);
  kappa ~ lognormal(0, 1);
  tau ~ lognormal(0, 1);
  y ~ normal(mu + A * beta, sigma);
}

Sampling starts, but it takes such a long time that I am wondering how to improve efficiency.

I guess that the issue might be the matrix inversion, but I do not see how to use the non-centered parametrization on the precision rather than the variance-covariance matrix. Or is it better to just directly sample the \beta from a multivariate normal precision function like beta ~ multi_normal_prec()? Or am I going wrong here?
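
To make the question concrete, the two alternatives I have in mind would look something like this (untested sketches, reusing the declarations from the model above):

// Option 1: keep the non-centered parametrization but use the Cholesky
// factor of the precision matrix instead of inverting S
transformed parameters {
  matrix[n_knots, n_knots] S = tau * (G[1] * kappa ^ 4 + G[2] * 2 * kappa ^ 2 + G[3]);
  matrix[n_knots, n_knots] L = cholesky_decompose(S); // S = L * L'
  // alpha ~ std_normal() implies that beta has precision matrix S
  vector[n_knots] beta = mdivide_right_tri_low(alpha', L)';
}

// Option 2: drop alpha, declare beta in the parameters block, and sample it
// directly from a multivariate normal with precision matrix S
model {
  beta ~ multi_normal_prec(rep_vector(0, n_knots), S);
}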

I am actually puzzled that the naive approach of plugging the projection matrix in as the basis function matrix in the Stan code generated by brms for tp smooths worked so well, as shown in the graphs above; maybe I should run some simulations on that …

Hi @lionel68

S should be the precision matrix, right? So it is the inverse of the covariance and it’s sparse. I think that you should use S directly in your code and not do the inversion. I do that with TMB and it works. The advantage of TMB is that you can take the optimized parameters from the fit and use them as starting values for the sampling. In your case you must be careful to choose the priors on \kappa and \tau correctly because they determine the behavior of the spatial random field.

The sampling is slow compared with INLA because the latter is a deterministic approximation method, but if you build a model with a high number of hyperparameters then the computational cost should become similar to Stan’s.


Thanks @cavieresJJ for this. I am not so familiar with TMB; can you link it to Stan somehow? Or is your advice to move there completely?

Given that this discussion is moving away from brms I started a new thread.

Hi @lionel68

Yes, you can link TMB to Stan using tmbstan (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5967695/pdf/pone.0197954.pdf)

If you want to do this type of analysis then my advice is to use TMB; you only need to put priors on the (hyper)parameters of your model (similar to what you do in Stan). The drawback is that you will need to take a bit more time to learn TMB.

Here you can read an interesting paper on spatial modelling in TMB; the authors provide code that you can reproduce easily (https://arxiv.org/pdf/2103.09929.pdf)

Best wishes
