Hello, I am fitting a nonlinear model with seven parameters to a small dataset (27 observations) using a Bayesian approach. I use the brms package in R, which interfaces with Stan for sampling. The nonlinear model is bimodal, with two distinct peaks.
Each parameter has a biological interpretation. Specifically, I was able to determine reasonable prior means for the normal distributions by visually inspecting the data (see figure below). The parameters to estimate are:
a = Day of the onset of the first peak (day of the year)
b = Day of the onset of the second peak (day of the year)
c = Height of the first peak
d = Height of the second peak
e = Duration from the onset to the first peak (in days)
f = Duration from the onset to the second peak (in days)
g = Width of the peak
Despite these biologically informed priors, I am currently facing serious convergence issues during MCMC sampling. The following warnings were reported:
Warning messages: 1: There were 11733 divergent transitions after warmup. See https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup to find out why this is a problem and how to eliminate them. 2: There were 3 chains where the estimated Bayesian Fraction of Missing Information was low. See https://mc-stan.org/misc/warnings.html#bfmi-low 3: Examine the pairs() plot to diagnose sampling problems 4: The largest R-hat is 1.93, indicating chains have not mixed. Running the chains for more iterations may help. See https://mc-stan.org/misc/warnings.html#r-hat 5: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable. Running the chains for more iterations may help. See https://mc-stan.org/misc/warnings.html#bulk-ess 6: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable. Running the chains for more iterations may help. See https://mc-stan.org/misc/warnings.html#tail-ess
I have already tried adjusting adapt_delta, max_treedepth
, and the number of iterations, but the issues persist. I am aware that the model may be too complex relative to the limited dataset (which I cannot expand), but it is essential for me to obtain parameter estimates, as no published values are currently available in the literature.
Since I have limited experience with Bayesian modeling, I would greatly appreciate any advice or suggestions for improving model stability.
Here is the dataset:
df <- structure(list(day_of_year = c(128, 136, 137, 138, 142, 143,
144, 162, 163, 164, 166, 167, 173, 197, 198, 199, 200, 201, 232,
233, 235, 236, 253, 255, 256, 299, 308), qa = c(0, 0.0153846153846154,
0.0192307692307692, 0, 0.0192307692307692, 0.00384615384615385,
0.00384615384615385, 0.00384615384615385, 0, 0.00769230769230769,
0.00384615384615385, 0.00384615384615385, 0.00769230769230769,
0, 0, 0, 0.0230769230769231, 0.00384615384615385, 0, 0, 0, 0,
0, 0, 0, 0.103846153846154, 0.0384615384615385)), class = "data.frame", row.names = c(NA,
27L))
Here is the code used to specify and fit the model:
model_formula <- brms::bf(
qa ~ if_else(day_of_year <= b, c * exp(-0.5 * ((day_of_year - a) / e)^2), c * exp(-0.5 * ((day_of_year - a) / e)^2) + d * exp(-0.5 * (((log((day_of_year - b)/f))^2) / g^2))),
a + b + c + d + e + f + g ~ 1, nl = TRUE)
priors <- c(
brms::prior(normal(138, 5), nlpar = "a", lb = 0),
brms::prior(normal(273, 5), nlpar = "b", lb = 0),
brms::prior(normal(0.01, 0.1), nlpar = "c", lb = 0),
brms::prior(normal(0.2, 0.1), nlpar = "d", lb = 0),
brms::prior(normal(5, 1), nlpar = "e", lb = 0),
brms::prior(normal(13, 1), nlpar = "f", lb = 0),
brms::prior(normal(0.8, 0.1), nlpar = "g", lb = 0),
brms::prior(student_t(3, 0, 0.1), class = "sigma")
)
fit <- brms::brm(
formula = model_formula,
data = df,
prior = priors,
iter = 6000,
chains = 2,
control = list(adapt_delta = 0.9999, max_treedepth = 20),
init = 0.1
)
Here is the figure that served as the basis for selecting biologically informed prior means:
Thanks very much for your help.