I would like to perform a Bayesian meta-analysis and currently comparing multiple pooling options, similar to the radon example in Andrew Gelman’s Books. I have data from 47 groups, with sample sizes varying between 2 and 49. When I use the common hierarchical model everything works fine:
Partial_pooling_model <- brm(data ~ 1 + (1|WWTP) (1)
However, there are some groups with comparatively large sample sizes, for which the data indicate that the residual sigma may actually be larger/ smaller than the estimated common sigma of the partial pooling model and in my case sigma actually matters for the following calculations. Therefore, I wanted to estimate both, the mean and sd in a hierarchical framework and tried:
Partial_pooling_both_model <- brm(bf (data ~ 1+(1|WWTP), sigma ~ 1+(1|WWTP)) (2)
The model converges without any divergent transitions or other error messages. What irritates me right now is that I expected both mu and sigma for the individual WWTP to shrink towards their population means, but for some WWTPs the resulting sigma is actually larger than if I would have fit a separate model to these data. The larger sigma in turn leads to a somehow unrealistically large prediction intervals.
My guess for this behavior is that the larger group-level sigma results from the group-level mean of the individual WWTP being shrunk towards the population mean, thus increasing the distance between the mean and the observations? (but I am not sure), but the more general question would be if a model in the form of (2) makes sense in the first place? I think it should but maybe I have overseen something and I haven’t found any discussion on this topic so far.