Difficulty in understanding pooling effect

First, I fitted a separate GLM (an ordinary linear model) in R for each of the 21 conditions (no pooling),

lm(y ~ x, …)

and got the following result for the effect of x:

             Estimate Est.Error   2.5%ile     5%ile    95%ile  97.5%ile
condition1     0.0008    0.0104   -0.0197   -0.0164    0.0180    0.0213
condition2     0.0136    0.0130   -0.0122   -0.0080    0.0352    0.0394
condition3     0.0004    0.0133   -0.0259   -0.0216    0.0224    0.0266
condition4    -0.0038    0.0096   -0.0229   -0.0198    0.0122    0.0153
condition5    -0.0131    0.0086   -0.0302   -0.0274    0.0013    0.0040
condition6    -0.0085    0.0076   -0.0236   -0.0212    0.0042    0.0066
condition7     0.0227    0.0114    0.0001    0.0038    0.0415    0.0452
condition8     0.0050    0.0131   -0.0210   -0.0168    0.0268    0.0310
condition9     0.0001    0.0116   -0.0228   -0.0191    0.0193    0.0231
condition10    0.0094    0.0125   -0.0152   -0.0112    0.0301    0.0341
condition11   -0.0181    0.0077   -0.0333   -0.0309   -0.0053   -0.0029
condition12    0.0134    0.0139   -0.0140   -0.0096    0.0364    0.0409
condition13   -0.0169    0.0081   -0.0329   -0.0303   -0.0035   -0.0009
condition14   -0.0096    0.0111   -0.0315   -0.0280    0.0087    0.0122
condition15   -0.0009    0.0071   -0.0151   -0.0128    0.0109    0.0132
condition16    0.0074    0.0085   -0.0095   -0.0067    0.0216    0.0243
condition17    0.0326    0.0148    0.0033    0.0080    0.0571    0.0618
condition18    0.0086    0.0141   -0.0193   -0.0148    0.0320    0.0365
condition19   -0.0083    0.0074   -0.0229   -0.0205    0.0039    0.0063
condition20    0.0057    0.0094   -0.0129   -0.0099    0.0212    0.0243
condition21    0.0180    0.0119   -0.0056   -0.0018    0.0377    0.0416
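
For anyone wanting to reproduce a no-pooling table of this shape, here is a minimal base-R sketch on simulated data. The data frame `dat`, its three conditions, the simulated coefficients, and the helper `summarise_slope` are all made up for illustration; only the `lm(y ~ x, …)` call comes from the fit described above, and using `confint` for the interval columns is an assumption about how such columns could be obtained.

```r
set.seed(1)

# Simulated stand-in for the real data: 3 conditions, 40 observations each.
dat <- data.frame(condition = factor(rep(1:3, each = 40)), x = rnorm(120))
dat$y <- 0.2 + 0.01 * dat$x + rnorm(120, sd = 0.1)

# No pooling: one independent lm() per condition.
fits <- lapply(split(dat, dat$condition), function(d) lm(y ~ x, data = d))

# Slope estimate, standard error, and 90% / 95% confidence limits for x.
summarise_slope <- function(fit) {
  est  <- coef(summary(fit))["x", c("Estimate", "Std. Error")]
  ci95 <- confint(fit, "x", level = 0.95)
  ci90 <- confint(fit, "x", level = 0.90)
  c(est, ci95[1], ci90[1], ci90[2], ci95[2])
}

no_pooling <- t(sapply(fits, summarise_slope))
colnames(no_pooling) <- c("Estimate", "Std.Error", "2.5%", "5%", "95%", "97.5%")
print(round(no_pooling, 4))
```

Each row is estimated from that condition's data alone, so nothing in one row can influence another.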

Then, I tried the same data with rstanarm (partial pooling),

stan_lmer(y~x+(1|subject)+(x|condition),…)

and had the following result for the effect of x:

             Estimate Est.Error   2.5%ile     5%ile    95%ile  97.5%ile
condition1    -0.0006    0.0075   -0.0155   -0.0127    0.0119    0.0141
condition2     0.0072    0.0079   -0.0076   -0.0053    0.0210    0.0233
condition3     0.0023    0.0077   -0.0134   -0.0107    0.0153    0.0169
condition4    -0.0045    0.0080   -0.0195   -0.0173    0.0087    0.0116
condition5    -0.0037    0.0080   -0.0203   -0.0176    0.0090    0.0114
condition6    -0.0036    0.0079   -0.0194   -0.0169    0.0093    0.0122
condition7     0.0105    0.0082   -0.0040   -0.0021    0.0244    0.0278
condition8     0.0038    0.0077   -0.0111   -0.0088    0.0168    0.0190
condition9     0.0013    0.0076   -0.0134   -0.0109    0.0136    0.0167
condition10    0.0090    0.0081   -0.0064   -0.0041    0.0220    0.0250
condition11   -0.0053    0.0081   -0.0218   -0.0188    0.0078    0.0102
condition12    0.0069    0.0078   -0.0071   -0.0052    0.0199    0.0228
condition13   -0.0040    0.0084   -0.0218   -0.0187    0.0092    0.0113
condition14   -0.0029    0.0078   -0.0189   -0.0160    0.0102    0.0123
condition15    0.0001    0.0078   -0.0155   -0.0134    0.0125    0.0150
condition16    0.0032    0.0079   -0.0120   -0.0089    0.0163    0.0187
condition17    0.0163    0.0091   -0.0004    0.0020    0.0325    0.0360
condition18    0.0180    0.0099   -0.0006    0.0020    0.0341    0.0373
condition19   -0.0018    0.0078   -0.0180   -0.0154    0.0109    0.0126
condition20    0.0014    0.0080   -0.0142   -0.0113    0.0152    0.0181
condition21    0.0068    0.0082   -0.0080   -0.0056    0.0209    0.0241

And here are the high-level effects:

Group-Level Effects:
~condition (Number of levels: 21)
                 Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept)        0.16      0.03     0.12     0.22        286 1.01
sd(x)                0.01      0.00     0.00     0.02       1227 1.00
cor(Intercept,x)     0.65      0.24     0.07     0.98       2000 1.00

~subject (Number of levels: 124)
                 Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept)        0.08      0.01     0.07     0.09        585 1.01

Population-Level Effects:
                 Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept            0.17      0.04     0.09     0.25        154 1.01
x                    0.00      0.01    -0.01     0.02        439 1.00

Most of the effect estimates across the 21 conditions from the Bayesian model make sense to me: they were pulled toward the center. I plotted posterior predictive checks for both models, and the Bayesian model fit better than the GLM. However, I’m baffled by one particular case: condition18 seems to have been pulled away from the center (0.0086 with no pooling versus 0.0180 with partial pooling). Why is this? And how should I proceed to diagnose the situation?

I wouldn’t be too worried about it. The comparison of posterior medians to least squares estimates isn’t an exact science.
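
To put rough numbers on this, the sketch below compares the two tables above directly. The estimates and standard errors are copied from the thread; taking the center to be 0.00 is an assumption based on the rounded population-level estimate of x, and the variable names are made up for illustration.

```r
# Effect of x per condition, copied from the two tables above.
ols <- c(0.0008, 0.0136, 0.0004, -0.0038, -0.0131, -0.0085, 0.0227,
         0.0050, 0.0001, 0.0094, -0.0181, 0.0134, -0.0169, -0.0096,
         -0.0009, 0.0074, 0.0326, 0.0086, -0.0083, 0.0057, 0.0180)
ols_se <- c(0.0104, 0.0130, 0.0133, 0.0096, 0.0086, 0.0076, 0.0114,
            0.0131, 0.0116, 0.0125, 0.0077, 0.0139, 0.0081, 0.0111,
            0.0071, 0.0085, 0.0148, 0.0141, 0.0074, 0.0094, 0.0119)
pooled <- c(-0.0006, 0.0072, 0.0023, -0.0045, -0.0037, -0.0036, 0.0105,
            0.0038, 0.0013, 0.0090, -0.0053, 0.0069, -0.0040, -0.0029,
            0.0001, 0.0032, 0.0163, 0.0180, -0.0018, 0.0014, 0.0068)

# Assumption: the center is the population-level x, reported (rounded) as 0.00.
center <- 0.00

# Change from no pooling to partial pooling, and whether it moves outward.
shift <- pooled - ols
away  <- abs(pooled - center) > abs(ols - center)

moved_away <- data.frame(
  condition   = which(away),
  ols         = ols[away],
  pooled      = pooled[away],
  shift_in_se = round(shift[away] / ols_se[away], 2)
)
print(moved_away)
```

Under that assumed center, conditions 3, 4, 9, and 18 all end up farther from zero after partial pooling, with condition18 showing the largest move. Even so, its shift of 0.0094 is about two-thirds of its no-pooling standard error (0.0141), so none of these moves is large relative to the uncertainty in the estimates.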

Thanks! Still, I wonder why that specific condition is dragged away from the center. Its standard error of 0.0141 under the no-pooling GLM is not particularly high or low relative to the other conditions.

You have to also consider the dependence structure among them.
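
One possible reading of this, sketched under loudly stated assumptions: the model output reports cor(Intercept,x) = 0.65 for the condition-level effects, so a condition's slope is not shrunk on its own; it is shrunk jointly with that condition's intercept. The base-R snippet below uses the textbook normal-normal shrinkage formula with the reported hyperparameters plugged in as if they were known. The condition's intercept estimate (0.49), its standard error (0.05), and the zero sampling covariance between intercept and slope are hypothetical, since the thread does not report them, and the subject-level effects are ignored. It is an illustration of the mechanism, not a reconstruction of what Stan computed.

```r
# Population-level means and condition-level sd / correlation from the summary.
mu     <- c(Intercept = 0.17, x = 0.00)
sd_int <- 0.16
sd_x   <- 0.01
rho    <- 0.65

# Covariance of the condition-level (intercept, slope) pair for a given correlation.
prior_cov <- function(rho) {
  matrix(c(sd_int^2,            rho * sd_int * sd_x,
           rho * sd_int * sd_x, sd_x^2), nrow = 2)
}

# No-pooling estimates for one condition. The slope and its SE are condition18's;
# the intercept (0.49) and its SE (0.05) are HYPOTHETICAL, as is the assumption
# of zero sampling covariance between the two estimates.
b_ols <- c(Intercept = 0.49, x = 0.0086)
V     <- diag(c(0.05^2, 0.0141^2))

# Normal-normal posterior mean: mu + Sigma (Sigma + V)^{-1} (b - mu).
shrink <- function(b, V, mu, Sigma) {
  post <- mu + as.vector(Sigma %*% solve(Sigma + V, b - mu))
  names(post) <- names(b)
  post
}

post_cor <- shrink(b_ols, V, mu, prior_cov(rho))  # correlated intercepts and slopes
post_ind <- shrink(b_ols, V, mu, prior_cov(0))    # same inputs, correlation set to 0

print(round(rbind(no_pooling = b_ols, correlated = post_cor, independent = post_ind), 4))
```

With these illustrative inputs, the slope is shrunk toward zero when the correlation is set to 0 (to roughly 0.003), but with a correlation of 0.65 and an intercept well above the population mean it is pushed above the no-pooling value (to roughly 0.011). A condition with an unusually high intercept can therefore have its slope pulled upward even when its own slope estimate is close to the center.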

Got it! Thanks a lot, Ben.