Auto-grouping of latent variables in Gaussian process

Hi all,

I am trying to include a Gaussian process on longitude/latitude sampling coordinates in a hierarchical model. Samples cover two years and I want to model spatial correlation within each year only, so I’m using the “by = Year” term in brms to fit a separate GP for each year. There are a number of duplicate coordinates within each year which I would like to assign the same latent variable by using the “gr = TRUE” argument. However, doing so causes LOO and general posterior draw functions to return the following error:

Error in as_draws_matrix(gp[["Cgp"]], dim = dim(eta)) :
length(x) %in% c(1, dim[2]) is not TRUE

Judging by the failed check, the length of the contrasts for the sub-GPs (Cgp) is neither 1 nor equal to the second dimension of eta, as expected. Disabling the “gr” argument fixes the error on a small subset of the data, but returns a different error on the full data about the covariance matrix not being positive definite. In any case, it does not seem appropriate to disable the grouping option given the data structure.
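
To illustrate why duplicate coordinates and positive definiteness are related, here is a minimal base R sketch. It assumes a squared-exponential kernel written by hand; the function sq_exp_kernel and the coordinate values are made up for illustration and are not brms internals.

```r
# Illustrative squared-exponential kernel (an assumption, not the brms implementation)
sq_exp_kernel <- function(coords, sdgp = 1, lscale = 1) {
  d2 <- as.matrix(dist(coords))^2
  sdgp^2 * exp(-d2 / (2 * lscale^2))
}

# Three distinct locations, then the same set with the first location repeated
coords_unique <- cbind(Lon = c(7.1, 7.5, 7.9), Lat = c(47.2, 47.6, 47.8))
coords_dup <- rbind(coords_unique, coords_unique[1, ])

K_unique <- sq_exp_kernel(coords_unique)
K_dup <- sq_exp_kernel(coords_dup)

# Smallest eigenvalue: positive for distinct points, numerically zero once a
# location is repeated, because two rows of the matrix become identical
min_eig_unique <- min(eigen(K_unique, symmetric = TRUE)$values)
min_eig_dup <- min(eigen(K_dup, symmetric = TRUE)$values)
print(c(unique = min_eig_unique, duplicated = min_eig_dup))

# Collapsing to unique locations (one latent value per location) avoids the problem
coords_grouped <- unique(coords_dup)
print(nrow(coords_grouped))
```

With one latent value per unique location, the covariance matrix is built from distinct points only, which is the idea behind grouping identical coordinates.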

Any ideas on what could be causing this error?

Thanks in advance!

  • Operating System: Windows 10
  • brms Version: 2.12.0

Could you check whether the error persists in the GitHub version of brms?

If yes please provide a minimal reproducible example so that I can test it myself.

Hi Paul,

Thanks for the quick reply.
I tried running the model with the GitHub version of brms, but unfortunately the error persists.

Here’s a small example:

library(brms)

set.seed(10661)

# Simulate data
dat1 <- data.frame(Year = rep(c("2015","2016"), 50),
                   Count = rpois(100, 1),
                   Site = rep(c("Site1","Site2","Site3","Site4"), each = 25),
                   Lat = c(runif(25, min = 47, max = 48),
                           runif(25, min = 49, max = 50),
                           runif(25, min = 51, max = 52),
                           runif(25, min = 53, max = 54)),
                   Lon = c(runif(25, min = 7, max = 8),
                           runif(25, min = 9, max = 10),
                           runif(25, min = 11, max = 12),
                           runif(25, min = 13, max = 14)),
                   X1 = rnorm(100,3,1))

# Create some duplicate coordinates within each year for all sites
# (column 4 is Lat, column 5 is Lon)
dat2 <- dat1
# Site1
dat2[c(10,12),4] <- dat2[8,4]
dat2[c(10,12),5] <- dat2[8,5]
dat2[c(9,11),4] <- dat2[7,4]
dat2[c(9,11),5] <- dat2[7,5]

# Site2
dat2[c(30,32),4] <- dat2[28,4]
dat2[c(30,32),5] <- dat2[28,5]
dat2[c(29,31),4] <- dat2[27,4]
dat2[c(29,31),5] <- dat2[27,5]

# Site3
dat2[c(60,62),4] <- dat2[58,4]
dat2[c(60,62),5] <- dat2[58,5]
dat2[c(59,61),4] <- dat2[57,4]
dat2[c(59,61),5] <- dat2[57,5]

# Site4
dat2[c(80,82),4] <- dat2[78,4]
dat2[c(80,82),5] <- dat2[78,5]
dat2[c(79,81),4] <- dat2[77,4]
dat2[c(79,81),5] <- dat2[77,5]

# Define formulas with and without grouping
formula1 <- Count ~ X1*Year + (1|Site) + gp(Lon,Lat, by = Year, iso = FALSE, gr = TRUE)
formula2 <- Count ~ X1*Year + (1|Site) + gp(Lon,Lat, by = Year, iso = FALSE)
no_cores <- 2

# Model on data with no duplicate coordinates
M1 <- brm(formula1, 
          data = dat1, 
          chains = 2, cores = no_cores, iter = 1000, warmup = 500,  
          family = poisson(link = "log"))
             
loo1 <- loo(M1) # No error.

# Same model on data with duplicate coordinates
M2 <- brm(formula1, 
          data = dat2, 
          chains = 2, cores = no_cores, iter = 1000, warmup = 500,  
          family = poisson(link = "log"))

loo2 <- loo(M2) # Error.

This made me realize that the error also persists with auto-grouping turned off, contrary to what I stated in my original post.

# Different model without auto-grouping on duplicate data
M2b <- brm(formula2, 
          data = dat2, 
          chains = 2, cores = no_cores, iter = 1000, warmup = 500,  
          family = poisson(link = "log"))

loo2b <- loo(M2b) # Same error.
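
For anyone reproducing this, a quick way to confirm how many rows repeat a coordinate pair within a year is sketched below. It uses a small toy data frame with the same column names as above; the helper name count_dup_coords is hypothetical.

```r
# Hypothetical helper: number of rows per Year that repeat an earlier (Lon, Lat) pair
count_dup_coords <- function(d) {
  is_dup <- duplicated(d[, c("Year", "Lon", "Lat")])
  tapply(is_dup, d$Year, sum)
}

# Toy data: one repeated location in 2015, one location seen three times in 2016
toy <- data.frame(
  Year = c("2015", "2016", "2015", "2016", "2015", "2016"),
  Lon  = c(7.1, 7.2, 7.1, 7.2, 7.3, 7.2),
  Lat  = c(47.1, 47.2, 47.1, 47.2, 47.3, 47.2)
)

dup_counts <- count_dup_coords(toy)
print(dup_counts)
```

The same call on dat1 and dat2, i.e. count_dup_coords(dat1) and count_dup_coords(dat2), should show which data set actually contains within-year duplicates.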

Hope that helps, thanks for your time!

Thanks! Should now be fixed on GitHub.

Yes that fixed it! Thanks again for your help, much appreciated.