I am working on a model using brms to study ecosystem stability (response variable: cv_functioning_inverse
). The data is derived from pairwise distances between sites, and I aim to account for the shared contributions of sites in these pairwise relationships but also try to correct for non-independence of the observation.
Data Context:
- Response Variable:
cv_functioning_inverse
(scaled inverse coefficient of variation for ecosystem functioning). - Predictors:
beta_turnover_abund
: Beta diversity.multivariate_dispersion
: Environmental distance.
- Distance Matrix Context:
- Sites are represented in pairs (
site_1
,site_2
), and the response variable reflects ecosystem stability between these pairs. - Due to the symmetric nature of the distance matrix, sites can appear as
site_1
orsite_2
randomly. - To avoid introducing bias, I randomised the order of
site_1
andsite_2
during sampling.
- Sites are represented in pairs (
Here is a simplified mock example of the dataset:
site_1 | site_2 | beta_turnover_abund | multivariate_dispersion | cv_functioning_inverse |
---|---|---|---|---|
A | B | 0.5 | 0.8 | 0.6 |
B | E | 0.7 | 0.9 | 0.4 |
B | A | 0.5 | 0.8 | 0.6 |
… |
(Note: The order of site_1
and site_2
is randomised to avoid bias in directional relationships.)
Model:
model_formula_mm <- bf(
scale(cv_functioning_inverse) ~
scale(beta_turnover_abund) *
scale(multivariate_dispersion) +
(scale(beta_turnover_abund) *
scale(multivariate_dispersion) | mm(site_1, site_2))
)
fitted_model_mm <- brm(
formula = model_formula_mm,
data = data,
prior = priors,
chains = 4,
iter = 2000,
cores = 4,
seed = 123 # Set seed for reproducibility
)
Problem:
When running posterior predictive checks (pp_check
), the predictions look off compared to the observed data.
Questions:
- Model Appropriateness:
- Does using
mm(site_1, site_2)
for the random effects structure appropriately account for the shared contributions ofsite_1
andsite_2
? - Is there a better way to model pairwise relationships like this?
- I would like to keep a regression style model as most research is using multiple regressions or correlation analysis.
- Odd Results:
- What diagnostics should I focus on to troubleshoot poor posterior predictive checks? Could the issue arise from:
- Mis-specified priors?
- Over-complex random effects structure?
- Scaling of predictors and response?
Additional Details:
- Number of sites: 43
- Number of pairwise observations: ~4200 (randomly drawn)
I would greatly appreciate any insights on improving model fit and diagnosing issues with posterior predictive checks. Thank you in advance!
Additional Output:
> loo_results <- loo(fitted_model_mm, reloo = TRUE)
> loo_results
Computed from 4000 by 4200 log-likelihood matrix.
Estimate SE
elpd_loo -4199.1 61.3
p_loo 163.2 7.0
looic 8398.2 122.7
------
MCSE of elpd_loo is 0.3.
MCSE and ESS estimates assume MCMC draws (r_eff in [0.4, 1.6]).
All Pareto k estimates are good (k < 0.7).
See help('pareto-k-diagnostic') for details.
> fitted_model_mm
Family: gaussian
Links: mu = identity; sigma = identity
Formula: scale(cv_functioning_inverse) ~ scale(beta_turnover_abund) * scale(multivariate_dispersion) + (scale(beta_turnover_abund) * scale(multivariate_dispersion) | mm(site_1, site_2))
Data: data (Number of observations: 4200)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Multilevel Hyperparameters:
~mmsite_1site_2 (Number of levels: 43)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.80 0.10 0.63 1.03 1.00 1083 1492
sd(scalebeta_turnover_abund) 0.84 0.10 0.67 1.07 1.00 1216 1974
sd(scalemultivariate_dispersion) 0.65 0.09 0.50 0.86 1.01 1314 2307
sd(scalebeta_turnover_abund:scalemultivariate_dispersion) 0.80 0.11 0.61 1.04 1.00 1047 1857
cor(Intercept,scalebeta_turnover_abund) -0.03 0.15 -0.34 0.27 1.00 1178 1512
cor(Intercept,scalemultivariate_dispersion) 0.10 0.16 -0.23 0.41 1.00 1392 1961
cor(scalebeta_turnover_abund,scalemultivariate_dispersion) 0.01 0.16 -0.30 0.34 1.00 1184 1299
cor(Intercept,scalebeta_turnover_abund:scalemultivariate_dispersion) -0.58 0.11 -0.76 -0.34 1.00 1566 2027
cor(scalebeta_turnover_abund,scalebeta_turnover_abund:scalemultivariate_dispersion) 0.28 0.15 -0.05 0.56 1.00 1322 1944
cor(scalemultivariate_dispersion,scalebeta_turnover_abund:scalemultivariate_dispersion) 0.05 0.19 -0.31 0.41 1.00 1213 1846
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept -0.13 0.13 -0.38 0.11 1.01 840 1264
scalebeta_turnover_abund 0.17 0.13 -0.10 0.42 1.01 1084 1128
scalemultivariate_dispersion -0.53 0.11 -0.73 -0.32 1.00 1012 1610
scalebeta_turnover_abund:scalemultivariate_dispersion 0.20 0.13 -0.05 0.47 1.00 1114 1597
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 0.64 0.01 0.63 0.66 1.00 4705 2961
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).