Fitting a GP model with subject-specific GP terms in brms


I’m working on a problem involving multiple, independent gaussian processes that I would like to fit jointly together with a common mean function. Specifically, I would like to fit a model of the form

mu ~ GP(0, C_mu)
f_i ~ GP(0, C_i), independent for i = 1, …, n
Y_ij | mu, f_i ~ N(beta0 + mu(t_ij) + f_i(t_ij), sigma^2)

but with the constraint that sum_{i=1}^n f_i(t) = 0 so that each GP f_i represents a subject-specific deviations from the common GP mean function mu.

I thought that I could do something like
y ~ gp(t, k = 10, c = 5/4) + gp(t, k = 10, c = 5/4, by = id)
since the gp-terms in brms are zero mean and the intercept is included by default, but that fails miserably. Sometimes it won’t even go into warmup (failing after a lot of “Gradient evaluated at the initial value is not finite”), and other times it starts but with very bad sampling behavior.

Is there something I’m missing in the parametrization or something specific related to the default priors?

Thanks in advance!

Here’s an example


seCov <- function(s, t, alpha, rho) {
  alpha^2 * exp(-(s-t)^2 / (2 * rho^2))

mu <- function(t) {
  sin(t*10) + t*10

simGP <- function(t, alpha = 1, rho = 0.1, sigma = 0.1) {
  cMat <- outer(t, t, seCov, alpha = alpha, rho = rho)
  f <- as.numeric(rmvnorm(1, sigma = cMat))
  y <- mu(t) + f + rnorm(length(t), 0, sigma)
  list(f = f, y = y)

n <- 25 #25 GPs
t <- seq(0, 1, length.out = 20) #20 obs for each

d <-"rbind", lapply(1:n, function(id) {
  gp <- simGP(t)
  data.frame(id = id, t = t, f = gp$f, y = gp$y)
d$id <- as.factor(d$id)

m <- brm(y ~ gp(t, k = 10, c = 5/4) + gp(t, k = 10, c = 5/4, by = id), data = d)
1 Like

It’s not brms, but here’s (rather old) code for doing a hierarchical GP in Stan.

Thanks, Mike.

I might end up taking inspiration from your code and building the Stan model up from scratch myself, but the real problem is more complicated with yet another nested level within ids, left and right censoring in the observations, and a factor value influencing the over mean. I really benefit from brms setting all of this up automatically including the basis function approximation and the within chain parallelization with cmdstanr as a backend for a huge speed-up.

It would save me a lot of time if I could get this simplified example up and running using brms. I’m still not sure why my model fails so terribly.

1 Like

If they always sum to zero then they are not independent and you need to define a kernel that specifies the full covariance structure jointly over all subjects and values for t (or maybe alternatively you could have N-1 independent GPs for t and for the last subject is is deterministically -sum(GPs of other subjects)). The lgpr package (Longitudinal Gaussian Process Regression • lgpr) can model subject-specific deviations from the common trend with the zero-sum kernel described in, which I believe is exactly what you are asking, if you define a model

lgp(formula = y ~ t + t|id, ...)

Might be quite slow tho if you have 20*25=500 observations as it doesn’t (yet still) have the basis function approximation, but I have fitted simple models with 600 observations in under a day with it.


This is a very good point. I feel kinda stupid for not realizing that earlier.

Thank you for the reference to lgpr. That looks very interesting, indeed, and I’ll read up on it. Unfortunately, the number of obs for the real problem is much worse than in my example - the real data set has around 1 * 10^5 data points for 380 subjects - so some basis approximation would definitely be required.