Hi, I’m trying to fit a CAR model, which I’m new to, using brms. The response variable is continious and on a irregular grid and they stem from model predictions. Because of that, the residual distributions are very weird and I have therefore discretized it and played around with a ordinal models.
Here are two plots to illustrate the continious and discretized data:
library(brms)
library(ggplot2)
d <- readr::read_csv("https://raw.githubusercontent.com/maxlindmark/spatial-metabolic-index/main/example_data.csv")
# Plot data
ggplot(d, aes(X, Y, fill = biom_slope_gr_gc)) +
geom_raster()
ggplot(d, aes(X, Y, fill = biom_slope)) +
geom_raster() +
scale_fill_gradient2()
The ordinal model fits really well, here’s an example:
# Fit non-spatial ordinal model
fitcod <- brm(
formula = biom_slope_gr_gc ~ temp_slope_sc*mean_temp_ct + mean_log_biomass_ct,
data = cod,
chains = 3,
cores = 3,
iter = 3000,
init = 0,
family = cumulative(probit)
)
# I'm following this link to get randomized quantile residuals for a discrete distribution so that I can plot them in space: https://cran.r-project.org/web/packages/tidybayes/vignettes/tidybayes-residuals.html#probability-residuals-and-quantile-residuals
res_ns <- cod |>
ungroup() |>
add_predicted_draws(fitcod) |>
mutate(.prediction = ordered(levels(biom_slope_gr_gc)[.prediction], levels = levels(biom_slope_gr_gc))) |>
summarise(
p_lower = mean(.prediction < biom_slope_gr_gc),
p_upper = mean(.prediction <= biom_slope_gr_gc),
p_residual = runif(1, p_lower, p_upper),
z_residual = qnorm(p_residual),
.groups = "drop_last"
)
# QQ-plot
ggplot(res_ns, aes(sample = z_residual)) +
geom_qq() +
geom_abline()
# In space
ggplot(res_ns, aes(X, Y, color = z_residual)) +
geom_point(size = 0.6) +
scale_color_gradient2()
However, I wonder now if there’s an issue with spatial autocorrelation, which the underlying data and the discretized data show clear signs of. Below is code to add a CAR structure to the model. For speed I filter the data and use a simpler model just to test the CAR part, which I haven’t gotten to converge yet besides for the example in ?car.
d <- cod |> filter(X > 620 & X < 650 & Y > 6120 & Y < 6150)
# Set up distance and neighbourhood matrices
# See ?car
K <- nrow(d)
distance_grid <- d[, c("X", "Y")]
distance <- as.matrix(dist(distance_grid))
W <- array(0, c(K, K))
W[distance == 3] <- 1 # The distance between two adjacent cells is 3
d$grp <- 1:K
rownames(W) <- 1:K
fitcar <- brm(
#formula = ordered(biom_slope_gr_gc) ~ temp_slope_sc*mean_temp_ct + mean_log_biomass_ct + car(W, gr = grp, type = 'icar'),
formula = biom_slope ~ temp_slope_sc + car(W, gr = grp, type = 'icar'),
data = d,
chains = 4,
cores = 4,
iter = 4000,
init = 0,
data2 = list(W = W),
#family = cumulative(probit)
family = gaussian()
)
plot(fitcar)
Basically, doesn’t converge on any front. Any ideas on where to start adressing this?
- Operating System: Sonoma 14.3
- brms Version: brms_2.20.4