Dear forum,
I’m struggeling with what seems to be a simple issue: I have a dataset in which there is spatial autocorrelation (SAC). I want to test for this SAC, account for this SAC, and then test again to see the model has adequately accounted for the SAC. My question is perhaps similar to: Extracting residual covariance structure in brms
I have read around and tried many methods to make this work in brms, but cannot seem to figure it out. If I understand correctly, the gp() and sar() components affect the conditional response, but not the residuals directly. So either I am not extracting conditional residuals (that include the effects of the gp() or sar()), or the covariance structures I include do not account for the spatial autocorrelation. However, similar cov structures do easily account for the same autocorrelation in packages glmmTMB and spdep.
I have read that posterior_epred() and posterior_predict() should both show conditional residuals, but perhaps they dont. I have tried different ways of extracting residuals from a brm object to see if there is a way that would incorporate the effect of an SAC term.
There is also the comment of @paul.buerkner that one can use loo() but I fail to see how I would be able to explicitely test (visually or with some test statistic) that the model adequately accounted for the spatial autocorrelation in the data. Spatial autocorrelation - #2 by paul.buerkner
I basically want a plot that shows that there is no residual spatial autocorrelation, see example code below.
Thank you for your help!
Kind regards,
Louis
Code to replicate problem:
library(sf) # for spatial operations
library(gstat) # for variogram
library(spdep) # for neighbors and Moran's I
library(brms)
## Create data
n <- 100
set.seed(1)
# spatial effect
coords <- data.frame(x = rnorm(n),
y = rnorm(n))
spatial_effect <- with(coords, 3*sin(x) + cos(y))
# covariates
x1 <- rnorm(n, 2000, 200)
x2 <- rpois(n, 5)
group <- sample(1:6, n, replace = T)
# Response
response <- spatial_effect + .6*scale(x1) - .4*scale(x2) + rnorm(6)[group] + rnorm(n, 0, 0.5)
# Assemble data frame
data <- data.frame(response = response, x1 = x1, x2 = x2, group = group, coords)
data_sf <- st_as_sf(data, coords = c("x", "y"))
## Run models
m_no_autocor <- brm(response ~ x1 + x2 + (1|group), data = data)
data_sf$resid <- residuals(m_no_autocor)[, "Estimate"] # get residuals
plot(data_sf["resid"], pch = 16, cex = 2) # plot residuals
plot(variogram(resid ~ 1, data = data_sf)) # semivariogram
moran.mc(data_sf$resid, listw, nsim = 100) # moran's I
# SAC correction with sar()
m_sar <- brm(response ~ x1 + x2 + sar(M, type = "lag"), data = data, data2 = list(M = nb))
data_sf$resid <- residuals(m_sar, method = "fitted")[, "Estimate"]
plot(data_sf["resid"], pch = 16, cex = 2) # plot residuals
plot(variogram(resid ~ 1, data = data_sf)) # semivariogram
moran.mc(data_sf$resid, listw, nsim = 100) # moran's I
# SAC correction with gp()
m_gp <- brm(response ~ x1 + x2 + gp(x + y, k = 5) + (1|group), data = data)
data_sf$resid <- residuals(m_gp, method = "posterior_epred")[, "Estimate"] # get residuals
plot(variogram(resid ~ 1, data = data_sf)) # semivariogram
moran.mc(data_sf$resid, listw, nsim = 100) # moran's I
## Other ways of getting residuals
DHARMa.helpers::dh_check_brms()
data_sf$resid <- residuals(m_gp, method = "posterior_epred")[, "Estimate"]
## Other packages
# glmmTMB - no SAC correction
library(glmmTMB
m_glmmtmb <- glmmTMB(response ~ x1 + x2 + (1|group), data = data)
summary(m_glmmtmb)
data_sf$resid <- resid(m_glmmtmb) # get residuals
plot(data_sf["resid"], pch = 16, cex = 2) # plot residuals
plot(variogram(resid ~ 1, data = data_sf)) # semivariogram
moran.mc(resid(m_glmmtmb), listw, nsim = 100) # moran's I
# SAC correction with exponential term: exp()
data$pos <- numFactor(data$x, data$y) # make one factor level per unique combination of x-y coordinates
# dat$group <- factor(rep(1, nrow(dat))) # for unique coordinates
m_exp <- glmmTMB(response ~ x1 + x2 + exp(pos + 0 | group), data = data)
summary(m_exp)
data_sf$resid <- resid(m_exp) # get residuals
plot(data_sf["resid"], pch = 16, cex = 2) # plot residuals
plot(variogram(resid ~ 1, data = data_sf)) # semivariogram
moran.mc(resid(m_exp), listw, nsim = 100) # moran's I
## with spdep/spatialreg
library(spatialreg)
# Spatial lag model (lagged dependent variable)
m_lag <- lagsarlm(response ~ x1 + x2, data = data, listw = listw) # no random effect possible
summary(m_lag)
data_sf$resid <- residuals(m_lag) # plot residuals
plot(data_sf["resid"], pch = 16, cex = 2)
- Operating System: Windows 10
- brms Version: 2.23.0