I am new to brms and trying to fit a brms
model with Beta() distribution and CAR to account for spatial autocorrelation. My response variable represents ecosystem resilience and is a continuous value between 0.0 and 1.0. As I am using spatial data as input, the model has tested significant for spatial autocorrelation (using Moran’s I from both spdep
and DHARMa.helpers
). Here is my model code when including the CAR term, as well as calculating loo and testing for spatial autocorrelation:
resil.formula.sa <- resil ~ soilSand + slope + twi + fireFreq2015 + elephant + drought2015 + wetness2015 + geology + car(W, gr=ID)
resil.model.sa <- brm(formula = resil.formula.sa,
data = train.df,
data2 = list(W=W),
family= Beta(link="logit"),
warmup = 4000,
iter = 6000,
threads = threading(6),
chains = 5,
cores=5,
seed = 123,
save_pars = save_pars(all = TRUE))
loo.resil <- loo(resil.model.sa, save_psis = TRUE, cores = 16)
bres <- residuals(resil.model.sa, method = "posterior_predict")[,"Estimate"]
moran.test(bres, listW)
When running the the model without the CAR term, all Pareto k values from the loo package are < 0.7. However, when I include the CAR term, I suddenly have A LOT of bad Pareto k values:
Computed from 10000 by 11969 log-likelihood matrix.
Estimate SE
elpd_loo 29983.7 70.7
p_loo 8879.5 56.7
looic -59967.4 141.4
------
MCSE of elpd_loo is NA.
MCSE and ESS estimates assume MCMC draws (r_eff in [0.2, 1.6]).
Pareto k diagnostic values:
Count Pct. Min. ESS
(-Inf, 0.7] (good) 2446 20.4% 118
(0.7, 1] (bad) 8620 72.0% <NA>
(1, Inf) (very bad) 903 7.5% <NA>
See help('pareto-k-diagnostic') for details.
In addition, my Moran’s I value is not better and still significant:
Moran I statistic standard deviate = 50.631, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic Expectation Variance
0.8384105677 -0.0001950078 0.0002743321
Computed from 10000 by 5129 log-likelihood matrix.
Using the CAR model gives me a much better pp_check() graph, better loo value and better Bayes R-squared. Thus, I am very confused by the very poor Pareto k values and how to address them. Is there a different way to check for spatial autocorrelation for CAR models? I have seen brms:::posterior_predict_gaussian_lagsar
mentioned in this post , but I can’t find an equivalent for CAR structures. Unfortunately, Beta() families are not supported when implementing the SAR structures.
Thank you in advance for any insight and help!