Hi all,
I’m working on a Bayesian ordinal regression model using brms to estimate coral bleaching severity levels (4 categories: None, Mild, Moderate, Severe) based on thermal stress.
My data are split into two subsets according to when each report was recorded relative to the heat stress peak:
1. Pre-peak subset:
Bleaching reports recorded before the peak. Severity is modeled as a function of DHW (degree heating weeks):
SEVERITY_CODE ~ s(DHW, k = 10) + (1 | LOCATION)
2. Post-peak subset:
Reports recorded after the peak. Here, severity is modeled as a 2D smooth of MAX_ANNUAL_DHW (the maximum DHW of the year) and DAYS_FROM_MAX_DHW (the number of days since the DHW peak):
SEVERITY_CODE ~ t2(MAX_ANNUAL_DHW, DAYS_FROM_MAX_DHW) + (1 | LOCATION)
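For anyone who wants to reproduce the setup, here is a minimal sketch of how the two fits might be called in brms. The two formulas are the ones given above. The cumulative logit family, the data frame names pre_df and post_df, and the wrapper function fit_bleaching_models are assumptions for illustration only. Priors and sampler settings are left at their defaults.

```r
library(brms)

# Hypothetical wrapper: pre_df and post_df are assumed to hold the pre-peak
# and post-peak reports, with SEVERITY_CODE stored as an ordered factor
# (None < Mild < Moderate < Severe).
fit_bleaching_models <- function(pre_df, post_df) {
  # Pre-peak model: 1D smooth of DHW plus a LOCATION intercept
  fit_pre <- brm(
    SEVERITY_CODE ~ s(DHW, k = 10) + (1 | LOCATION),
    data   = pre_df,
    family = cumulative("logit")  # assumed ordinal family
  )

  # Post-peak model: 2D tensor smooth of peak DHW and days since the peak
  fit_post <- brm(
    SEVERITY_CODE ~ t2(MAX_ANNUAL_DHW, DAYS_FROM_MAX_DHW) + (1 | LOCATION),
    data   = post_df,
    family = cumulative("logit")  # assumed ordinal family
  )

  list(pre = fit_pre, post = fit_post)
}
```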
These are two separate models with non-overlapping predictor spaces. Even so, I would expect their predictions to agree at the transition point, DAYS_FROM_MAX_DHW = 0, where post-peak conditions are essentially the same as those just before the peak. In other words, the pre-peak model evaluated at a given DHW and the post-peak model evaluated at MAX_ANNUAL_DHW = DHW and DAYS_FROM_MAX_DHW = 0 should give similar severity predictions. This is not always the case in the model outputs. The mismatch likely arises because:
- The pre-peak model is fit on more data, concentrated along a single predictor (DHW).
- The post-peak model spreads its data thinly across a 2D predictor space, so the region around DAYS_FROM_MAX_DHW = 0 may be underrepresented or noisier.
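To make the mismatch concrete, the comparison at the transition point could be sketched roughly as follows. This is only an illustration under stated assumptions: fit_pre and fit_post stand for the two fitted brms models, the helper names transition_grids and transition_gap are hypothetical, and the two fits are assumed to hold the same number of posterior draws so the arrays can be subtracted.

```r
# Build matched prediction grids: row i of `pre` and row i of `post`
# describe the same point on the heat stress trajectory (the peak itself).
transition_grids <- function(dhw_grid) {
  list(
    pre  = data.frame(DHW = dhw_grid),
    post = data.frame(MAX_ANNUAL_DHW = dhw_grid, DAYS_FROM_MAX_DHW = 0)
  )
}

# Posterior summary of the difference in category probabilities
# (post-peak minus pre-peak) at the transition point.
# Assumes both fits hold the same number of posterior draws.
transition_gap <- function(fit_pre, fit_post, dhw_grid) {
  grids <- transition_grids(dhw_grid)

  # re_formula = NA drops the LOCATION intercepts, so these are
  # population-level predictions.
  p_pre  <- brms::posterior_epred(fit_pre,  newdata = grids$pre,  re_formula = NA)
  p_post <- brms::posterior_epred(fit_post, newdata = grids$post, re_formula = NA)

  # For ordinal families both arrays are draws x grid rows x severity categories.
  gap <- p_post - p_pre

  # 2.5%, 50% and 97.5% quantiles for each grid row and category
  apply(gap, c(2, 3), quantile, probs = c(0.025, 0.5, 0.975))
}
```

A call such as transition_gap(fit_pre, fit_post, dhw_grid = seq(0, 12, by = 1)) would return the quantiles for each DHW value and severity category. The DHW range here is a placeholder, not taken from my data. Intervals that exclude zero mark the DHW values where the two models disagree at the peak.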
From an ecological standpoint, bleaching severity at DAYS_FROM_MAX_DHW = 0 (i.e., at peak heat stress) is a critical point of comparison across studies. Inconsistencies between the two models here undermine our ability to interpret severity transitions reliably and compare predictions across the entire heat stress trajectory.
How can I best ensure continuity between the predictions of these two models around DAYS_FROM_MAX_DHW = 0, where DHW (pre-peak model) = MAX_ANNUAL_DHW (post-peak model)?
Is there a better modeling framework or workaround to reconcile two different predictor spaces while maintaining ecological coherence in the outputs?
Any guidance - conceptual, statistical, or coding - would be incredibly helpful.
Thanks so much in advance!
Virginie