3-level or 4-level hierarchical model


I am conducting a study on methylmercury (MeHg) contamination of nestling Red-winged blackbirds ( Agelaius phoeniceus ) and Great-tailed grackles ( Quiscalus mexicanus ). I collected blood samples from nestlings to measure MeHg concentration in parts per billion, and I attempted to collect blood samples when the nestlings were 5-6 days old, 8-9 days old, and 11-12 days old. However, it was common to only get one sample per individual due to predation, nest failure, or finding the nest right before fledge.

For blackbirds, I collected 424 blood samples from 243 nestlings in 84 nests from 14 different ponds. For grackles, I collected 97 blood samples from 40 nestlings in 17 nests from 2 different ponds.

The structure of my data is:

  1. 1-3 blood samples per individual nestling
  2. Individual nestlings in a nest
  3. Nests in a pond

I am building models to predict blood MeHg concentration, and an important thing to understand is that aquatic ecosystems generally have higher concentrations of MeHg than terrestrial ecosystems. Aquatic ecosystems have bacteria that can convert nontoxic, inorganic Hg into the toxic MeHg form. These bacteria are either absent or rare in terrestrial ecosystems, so aquatic organisms or consumers of aquatic organisms generally have higher MeHg concentrations.

For blackbirds and grackles, the source of their MeHg contamination would come through the consumption of odonates (damselflies and dragonflies). Odonates have aquatic larval life stages (where they would be exposed to MeHg contamination), and terrestrial adult life stages (where they can transport MeHg to terrestrial predators). Therefore, I collected data on the number of odonates coming out of the ponds on a weekly basis, and I have estimates on the amount of MeHg odonates were transporting to the adjacent terrestrial ecosystem.

Now some studies have shown that odonates can make up 60-90% of a blackbirds diet, while other studies have shown that terrestrial prey like caterpillars can be the dominant prey item. I believe this might be explained by differences in prey availability between terrestrial and aquatic ecosystems, suggesting that blackbirds are generalists. I unfortunately did not collect data on terrestrial prey abundance during my field season. Instead, I used ArcPro from ESRI to create polygons of different habitat types (Pond, Grass, Tree, and Anthropogenic structures like roads and buildings) and then created 100m buffers around each nest. I believe that larger areas of grass and trees around a nest might provide more opportunity for the adult birds to forage for terrestrial prey for their nestlings.


So the models I’m looking to build will include variables for odonates, habitat type, and an interaction between odonates and habitat type.

The models I have built so far have been 4-level hierarchical models: blood samples nested within individual nested within nest nested within pond. I used pond as the highest level because I was assuming that nests within the same pond were experiencing similar conditions. However, now that I am explicitly including variables that describe the habitat and MeHg conditions around each nest, do I need to include a level for pond effects? Can I bump the levels down to a 3-level hierarchical model?

Thank you for your time, and I would appreciate to hear your opinion on the matter.

If individuals are accumulating MeHg from parents it might work to make approximate age a covariate for concentration rather than using a level of hierarchy within individuals. That’ll help particularly if you have few samples within individual.

Another way to simplify your model is to initially make pond a factor rather than a hierarchical level and to hope that the variable you constructed for terrestrial prey availability captures what it’s supposed to.

You’ll be able to address that in part by looking at within-pond heterogeneity among nests but there’s probably no great substitute for actual gut contents…

Thanks for your response, Sakrejda!

To address some of your points:

1- Gut contents: I did collect about 300 fecal sacs from the nestlings. As with the blood, I have about 1-3 fecal samples per nestling. I am currently processing those samples, and it will be a couple of months before I have that data set. I have also sent some of the blood I collected to UC-Davis for stable isotope analysis. Waiting to hear back on those results.

2- I have run a model that looks at approximate age as a covariate. For the following model:

  • log_MeHg = Log10 MeHg concentration (ppb)

  • gtgr = 0 (Red-winged Blackbird) or 1 (Great-tailed Grackle)

  • DAH_a = Days After Hatch adjusted. The youngest nestlings I sampled were 4 days old, so I subtracted 4 from the age category for rescaling.

hg_model4_fm <- bf(log_MeHg ~ 0 + intercept + gtgr + DAH_a + gtgr:DAH_a +

get_prior(hg_model4_fm, data = rwbl_filter)

hg_priors_4 <- c(prior("student_t(3,1.3,1)", class = "b", coef = "intercept"),
                 prior("student_t(3,0,1)", class = "b", coef = "DAH_a"),
                 prior("student_t(3,0,1)", class = "b", coef = "gtgr"),
                 prior("student_t(3,0,1)", class = "b", coef = "gtgr:DAH_a"),
                 prior("student_t(3,0,1)", class = "sd"),
                 prior("student_t(3,0,1)", class = "sigma"))

hg_model4 <- brm(hg_model4_fm, data = rwbl_filter, prior = hg_priors_4, 
                 iter = 2000, chains = 4, cores = 4,
                 control = list(adapt_delta = 0.9))


Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: log_MeHg ~ 0 + intercept + gtgr + DAH_a + gtgr:DAH_a + (1 | pond_index/nest_index/ind_index) 
   Data: rwbl_filter (Number of observations: 521) 
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup samples = 4000

Group-Level Effects: 
~pond_index (Number of levels: 14) 
              Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept)     0.18      0.05     0.12     0.29       1716 1.00

~pond_index:nest_index (Number of levels: 105) 
              Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept)     0.16      0.01     0.14     0.19       1400 1.00

~pond_index:nest_index:ind_index (Number of levels: 280) 
              Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sd(Intercept)     0.01      0.00     0.00     0.02       2661 1.00

Population-Level Effects: 
           Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
intercept      1.33      0.06     1.22     1.44       1627 1.00
gtgr           0.10      0.06    -0.03     0.23       1197 1.00
DAH_a         -0.02      0.00    -0.02    -0.01       4000 1.00
gtgr:DAH_a     0.03      0.00     0.02     0.03       4000 1.00

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     0.08      0.00     0.08     0.09       4000 1.00

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).

And here’s the graph for the regressions for the “average” blackbird and grackle:

And here’s the same graph but transformed back into MeHg concentration:

Based on this model, it appears that nestling MeHg concentration doesn’t change much for blackbirds or grackles.

A couple of questions I have for you:

1- In the first paragraph of your response, I don’t understand why I would want to remove the hierarchical level for individuals, even when I have a covariate for age. If I ignored a level for individuals, wouldn’t that assume the individual samples are independent of each other and potentially ignore individual variation?

2- As a reminder, I created 100m buffers around each nest to measure the area and size of terrestrial and aquatic habitats available to the parents for foraging. I based the 100m buffers from the literature on blackbird foraging studies. The buffers also allowed me to include the numbers of odonates (damselflies and dragonflies) coming out from adjacent ponds that the parents could be foraging at. As you suggested, I took a look at the within-pond heterogeneity among nests. When it came to looking at the area of terrestrial versus aquatic habitat around each nest, it was pretty homogenous for birds nesting in the same pond. The ponds weren’t that big, and both blackbirds and grackles can nest in high densities and within close proximity. There was more heterogeneity in odonate emergence for birds nesting within the same pond. Odonate emergence peaked around June, so birds nesting before, during, or after could experience different MeHg fluxes around their nests. So back to my original question: if I build a model that includes habitat and odonate emergence as covariates, would I be justified in leaving out pond as a hierarchical level?

Thanks again for your time, and I look forward to your response!