Fitting a GP model with subject-specific GP terms in brms

Hi,

I’m working on a problem involving multiple independent Gaussian processes that I would like to fit jointly with a common mean function. Specifically, I would like to fit a model of the form

mu ~ GP(0, C_mu)
f_i ~ GP(0, C_i), independent for i = 1, …, n
Y_ij | mu, f_i ~ N(beta0 + mu(t_ij) + f_i(t_ij), sigma^2)

but with the constraint that sum_{i=1}^n f_i(t) = 0, so that each GP f_i represents a subject-specific deviation from the common GP mean function mu.

I thought that I could do something like
y ~ gp(t, k = 10, c = 5/4) + gp(t, k = 10, c = 5/4, by = id)
since the gp() terms in brms are zero-mean and the intercept is included by default, but that fails miserably. Sometimes it won’t even enter warmup (failing after a lot of “Gradient evaluated at the initial value is not finite” messages), and other times it starts but with very bad sampling behavior.

Is there something I’m missing in the parametrization or something specific related to the default priors?

Thanks in advance!

Here’s an example:

library(brms)
library(mvtnorm)

# Squared-exponential covariance function
seCov <- function(s, t, alpha, rho) {
  alpha^2 * exp(-(s - t)^2 / (2 * rho^2))
}

# Common mean function
mu <- function(t) {
  sin(t * 10) + t * 10
}

# Simulate one subject: zero-mean GP deviation f around mu(t), plus observation noise
simGP <- function(t, alpha = 1, rho = 0.1, sigma = 0.1) {
  cMat <- outer(t, t, seCov, alpha = alpha, rho = rho)
  f <- as.numeric(rmvnorm(1, sigma = cMat))
  y <- mu(t) + f + rnorm(length(t), 0, sigma)
  list(f = f, y = y)
}

n <- 25 #25 GPs
t <- seq(0, 1, length.out = 20) #20 obs for each

set.seed(12345)
d <- do.call("rbind", lapply(1:n, function(id) {
  gp <- simGP(t)
  data.frame(id = id, t = t, f = gp$f, y = gp$y)
}))
d$id <- as.factor(d$id)

# Common GP trend plus per-id GP terms, both using the basis function approximation
m <- brm(y ~ gp(t, k = 10, c = 5/4) + gp(t, k = 10, c = 5/4, by = id), data = d)
m
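For completeness, the default priors brms assigns to the GP terms can be listed with get_prior() and overridden with set_prior(). I don’t know whether tighter priors would fix the sampling here (that’s exactly my question), and the values below are illustrative only:

get_prior(y ~ gp(t, k = 10, c = 5/4) + gp(t, k = 10, c = 5/4, by = id),
          data = d) # lists the defaults, incl. classes sdgp and lscale

priors <- c( # illustrative values only
  set_prior("normal(0, 1)", class = "sdgp"),
  set_prior("inv_gamma(5, 1)", class = "lscale")
)
# refit with brm(..., prior = priors)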

It’s not brms, but here’s (rather old) code for doing a hierarchical GP in Stan.

Thanks, Mike.

I might end up taking inspiration from your code and building the Stan model from scratch myself, but the real problem is more complicated, with yet another nested level within ids, left and right censoring in the observations, and a factor influencing the overall mean. I really benefit from brms setting all of this up automatically, including the basis function approximation and the within-chain parallelization with cmdstanr as a backend, which gives a huge speed-up.

It would save me a lot of time if I could get this simplified example up and running using brms. I’m still not sure why my model fails so terribly.
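To give an idea of the shape of the fuller model, here is a rough sketch in brms syntax; the column names (cens_ind, grp) and the data set name (d_real) are hypothetical placeholders, and the extra nesting level within id is omitted:

# Rough sketch only; cens_ind, grp and d_real are hypothetical placeholders
m_full <- brm(
  y | cens(cens_ind) ~ grp +           # left/right censoring; factor shifting the overall mean
    gp(t, k = 10, c = 5/4) +           # common GP trend
    gp(t, k = 10, c = 5/4, by = id),   # per-id GP deviations
  data = d_real,
  backend = "cmdstanr",
  threads = threading(4)               # within-chain parallelization
)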


If they always sum to zero, then they are not independent, and you need to define a kernel that specifies the full covariance structure jointly over all subjects and values of t (or, alternatively, you could have n - 1 independent GPs and let the last subject’s GP be deterministically -sum(GPs of the other subjects)). The lgpr package (Longitudinal Gaussian Process Regression • lgpr) can model subject-specific deviations from the common trend with the zero-sum kernel described in https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btab021/6104850 (sketched below), which I believe is exactly what you are asking for, if you define a model

lgp(formula = y ~ t + t|id, ...)

It might be quite slow, though, if you have 20 * 25 = 500 observations, as it doesn’t (yet) have the basis function approximation, but I have fitted simple models with 600 observations in under a day with it.
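To make the zero-sum construction concrete, here is a small R sketch of the kernel idea from the paper (an illustration only, not lgpr’s internals): with M subjects, the subject kernel is 1 when i == i' and -1/(M - 1) otherwise, multiplied by a shared temporal kernel, so draws sum to zero across subjects at every t.

# Zero-sum kernel sketch; reuses seCov() and rmvnorm() from the example above
M <- 4                                          # number of subjects
t <- seq(0, 1, length.out = 25)
kZS <- matrix(-1 / (M - 1), M, M)
diag(kZS) <- 1                                  # 1 on the diagonal, -1/(M-1) off it
kT <- outer(t, t, seCov, alpha = 1, rho = 0.2)  # shared temporal SE kernel
K <- kronecker(kZS, kT)                         # joint covariance over (subject, time)
f <- matrix(as.numeric(rmvnorm(1, sigma = K + diag(1e-8, M * length(t)))), ncol = M)
range(rowSums(f))                               # ~0 at every t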


This is a very good point. I feel kinda stupid for not realizing that earlier.

Thank you for the reference to lgpr. That looks very interesting indeed, and I’ll read up on it. Unfortunately, the number of observations in the real problem is much larger than in my example (the real data set has around 1 * 10^5 data points for 380 subjects), so some basis function approximation would definitely be required.
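For anyone finding this later, the basis function approximation in question (the Hilbert-space approach behind brms’s gp(k, c) terms, following Solin & Särkkä and Riutort-Mayol et al.) can be sketched in a few lines of R; this is just an illustration of the idea, not brms’s actual implementation:

# Hilbert-space basis approximation of an SE-kernel GP draw (illustration only)
hsgpDraw <- function(t, k = 10, c = 5/4, alpha = 1, rho = 0.1) {
  L <- c * max(abs(t))                # boundary of the approximation domain [-L, L]
  sqrtLambda <- (1:k) * pi / (2 * L)  # square roots of the eigenvalues
  phi <- sapply(sqrtLambda, function(sl) sqrt(1 / L) * sin(sl * (t + L)))
  spd <- alpha^2 * sqrt(2 * pi) * rho * exp(-0.5 * (rho * sqrtLambda)^2)  # SE spectral density
  as.numeric(phi %*% (sqrt(spd) * rnorm(k)))  # approximate zero-mean GP draw
}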
