Recovering well partial pooled intercept, but not overall intercept

christophe-rd · July 31, 2025, 8:47pm

Please share your Stan program and accompanying data if possible.

I am simulated data that I fit using stan_lmer. For now, I partial pooled on the intercept for my ids and my species, here’s a snipet of my sim dataframe:

structure(list(ids = c("spp15_1", "spp15_1", "spp15_1", "spp15_2", 
"spp15_2", "spp15_2", "spp15_3", "spp15_3", "spp15_3", "spp15_4", 
"spp15_4", "spp15_4", "spp15_5", "spp15_5", "spp15_5", "spp15_6", 
"spp15_6", "spp15_6", "spp15_7", "spp15_7", "spp15_7", "spp15_8", 
"spp15_8", "spp15_8", "spp15_9", "spp15_9", "spp15_9", "spp15_10", 
"spp15_10", "spp15_10", "spp16_1", "spp16_1", "spp16_1", "spp16_2", 
"spp16_2", "spp16_2", "spp16_3", "spp16_3", "spp16_3", "spp16_4", 
"spp16_4", "spp16_4", "spp16_5", "spp16_5", "spp16_5", "spp16_6", 
"spp16_6", "spp16_6", "spp16_7", "spp16_7", "spp16_7", "spp16_8", 
"spp16_8", "spp16_8", "spp16_9", "spp16_9", "spp16_9", "spp16_10", 
"spp16_10", "spp16_10"), spp = c("spp15", "spp15", "spp15", "spp15", 
"spp15", "spp15", "spp15", "spp15", "spp15", "spp15", "spp15", 
"spp15", "spp15", "spp15", "spp15", "spp15", "spp15", "spp15", 
"spp15", "spp15", "spp15", "spp15", "spp15", "spp15", "spp15", 
"spp15", "spp15", "spp15", "spp15", "spp15", "spp16", "spp16", 
"spp16", "spp16", "spp16", "spp16", "spp16", "spp16", "spp16", 
"spp16", "spp16", "spp16", "spp16", "spp16", "spp16", "spp16", 
"spp16", "spp16", "spp16", "spp16", "spp16", "spp16", "spp16", 
"spp16", "spp16", "spp16", "spp16", "spp16", "spp16", "spp16"
), gddcons = c(10.145, 8.885, 9.725, 8.86, 9.045, 8.72, 10.06, 
8.775, 9.4, 9.93, 9.36, 8.735, 9.21, 9.205, 8.73, 9.11, 9.385, 
9.455, 9.37, 8.725, 8, 8.755, 9.375, 9.06, 8.765, 8.875, 9.07, 
9.65, 9.04, 9.065, 9.12, 8.795, 9.54, 8.85, 8.605, 9.3, 8.735, 
8.955, 9.335, 8.825, 9.8, 9.08, 8.905, 9.13, 9.1, 9.89, 9.25, 
9.17, 8.385, 9.64, 8.445, 8.335, 8.76, 8.71, 8.26, 9.095, 8.99, 
9.68, 8.78, 9.19), b = c(0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 
0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 
0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 
0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 
0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4
), a = c(1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 
1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 
1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 
1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 
1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5), a_ids = c(-0.487, 
-0.487, -0.487, -0.017, -0.017, -0.017, 0.367, 0.367, 0.367, 
-0.721, -0.721, -0.721, -0.117, -0.117, -0.117, -0.247, -0.247, 
-0.247, 0.178, 0.178, 0.178, 0.05, 0.05, 0.05, -0.351, -0.351, 
-0.351, 0.133, 0.133, 0.133, -0.146, -0.146, -0.146, 0.221, 0.221, 
0.221, -0.272, -0.272, -0.272, 0.296, 0.296, 0.296, 0.398, 0.398, 
0.398, -0.268, -0.268, -0.268, -0.373, -0.373, -0.373, 0.015, 
0.015, 0.015, -0.009, -0.009, -0.009, -0.272, -0.272, -0.272), 
    a_spp = c(0.314, 0.314, 0.314, 0.314, 0.314, 0.314, 0.314, 
    0.314, 0.314, 0.314, 0.314, 0.314, 0.314, 0.314, 0.314, 0.314, 
    0.314, 0.314, 0.314, 0.314, 0.314, 0.314, 0.314, 0.314, 0.314, 
    0.314, 0.314, 0.314, 0.314, 0.314, -0.626, -0.626, -0.626, 
    -0.626, -0.626, -0.626, -0.626, -0.626, -0.626, -0.626, -0.626, 
    -0.626, -0.626, -0.626, -0.626, -0.626, -0.626, -0.626, -0.626, 
    -0.626, -0.626, -0.626, -0.626, -0.626, -0.626, -0.626, -0.626, 
    -0.626, -0.626, -0.626), sigma_y = c(0.3, 0.3, 0.3, 0.3, 
    0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 
    0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 
    0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 
    0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 
    0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3), ringwidth = c(5.007, 
    4.661, 5.226, 5.106, 5.306, 4.895, 6.404, 6.058, 6.553, 5.064, 
    4.883, 4.843, 5.018, 5.936, 5.035, 5.199, 5.566, 5.153, 5.883, 
    5.526, 5.342, 5.171, 5.632, 5.624, 4.774, 5.144, 4.931, 6.567, 
    4.523, 5.064, 4.81, 4.315, 4.347, 4.791, 3.865, 5.062, 4.134, 
    4.054, 4.328, 4.65, 4.951, 4.44, 4.736, 5.178, 4.837, 4.495, 
    4.042, 4.417, 3.728, 4.921, 3.339, 3.955, 4.358, 4.567, 4.548, 
    4.224, 4.748, 4.564, 4.377, 3.826)), row.names = 421:480, class = "data.frame")

Then I fit it using the following:

fit ← stan_lmer(ringwidth ~ 1 + gddcons + (1 | ids) + (1 | spp),  data = simcoef,chains = 4,iter = 4000,core=4)

I returned my a_spp and a_ids pretty well, but a is consistently underestimated and I don’t understand why. Below is a figure of my simulated intercept (in blue) vs my model output intercept.

I noticed a is underestimated by my model, thus I plotted a_spp+a_ids and excluded a from the equation, which results in:

The pattern of my sim data to be consistently higher than my model estimates seems to disappear. However, I don’t know what causes this pattern. Could someone guide me on diagnostics I could perform?

Thanks!

Topic		Replies	Views
Can partial pooling lead to shrinkage in multiple directions? Modeling rstan , fitting-issues	8	1047	February 17, 2021
Estimate predictor from multiple measurements and then use to predict outcome General	9	729	October 18, 2020
Consistent parameter over/under estimation when testing fake data Modeling rstan , fitting-issues	2	129	November 20, 2025
Stan_betareg intercept coefficient unaffected by prior rstanarm	15	2086	August 26, 2017
Multiple outcomes in rstanarm rstanarm techniques , specification	10	1786	September 18, 2017

Recovering well partial pooled intercept, but not overall intercept

Related topics