Hierarchical Model as an ANOVA

Hi everyone. I’m not familiar with coding and statistics, so I’m having some trouble understanding my model.
The data set looks like this: 5 treatments (different nutrient additions, so it’s categorical) and one response variable that is a plant trait (e.g. leaf area, a continuous variable). I have 10 individuals per treatment. It looks simple, but I don’t know if I’m doing it right:

model {
  brm(data = d, family = gaussian,
      DAS   ~  (1 | TRAT),
      prior = c(prior(normal(0, 10), class = Intercept),
                prior(cauchy(0, 1), class = sd)),
      iter = 4000, warmup = 1000, chains = 4, cores = 4,
      seed = 12)

where DAS is my response variable (continuous) and TRAT is the treatment variable (5 categories).
With this model I want to analyze intraspecific trait variation in these different treatments. In a frequentist approach I would use an ANOVA.

Looks correct or am I pooling the wrong thing?
Thank you

What you have done is not wrong per se, though it is a little unusual to fit a ‘treatment’ factor in this way. Specifying the model as ‘DAS ~ (1 | TRAT)’ does partial pooling. Partial pooling is normally used for what is traditionally called a random effect: factors like locations, individuals, etc., essentially vehicles of replication. It assumes the effects come from a normal distribution with an estimated sd parameter. It also invokes some ‘shrinkage’ of the effects, such that the effects are regularised toward zero to some degree.

For treatment-type variables, it would be more conventional to fit this as a non-pooled, ‘fixed’ effect, with the simpler formula ‘DAS ~ TRAT’. This would be more akin to a traditional ANOVA.
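In brms, that conventional fixed-effect version would look something like this (a sketch reusing the `d`, `DAS`, and `TRAT` objects and the sampler settings from your original call; the priors here are just brms defaults):

```r
library(brms)

# Unpooled ('fixed') treatment effects, akin to a traditional ANOVA
fit_fixed <- brm(DAS ~ TRAT,
                 data = d, family = gaussian(),
                 iter = 4000, warmup = 1000, chains = 4, cores = 4,
                 seed = 12)
```

If you want non-default priors, the treatment coefficients belong to prior class `b` (e.g. `prior(normal(0, 10), class = b)`).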


Now I got it!
Thank you so much, it helped a lot.

I agree with @anhsmith that it is unusual (in that I rarely see it in practice) but not unheard of. The random effects ANOVA is the first multilevel example in Raudenbush and Bryk 2002 (page 23 in the second edition). And Gelman suggests that such a model is a good approach to adjust for multiple comparisons, particularly with small group sizes.

Given you only have 10 individuals per treatment, it would be worth comparing the estimates from the partial-pooling (i.e. with random effects) and no-pooling (i.e. without random effects) solutions. I would use the random effects model if your goal is to make post-hoc pairwise comparisons. With 10 possible pairwise comparisons, you want some adjustment to the estimates if that’s your goal. But if you’re just trying to determine the amount of variability across treatments, the no-pooling solution would be fine.


Very interesting material; I will certainly read it. My goal is to detect phenotypic plasticity, so I need to see how different the plant trait is between treatments. A frequentist ANOVA is not suitable because it usually does not give a significant p-value; sometimes the difference is so small that it does not seem important (but it is!). The literature recommends using Bayesian inference or PERMANOVA.

Thank you for the references and ideas!

Note that the partial-pooling/random effects ANOVA approach is more conservative than the no-pooling approach (assuming you don’t adjust for multiple tests, e.g. Bonferroni) because it shrinks the treatment means toward each other. So it wouldn’t achieve that specific goal.


Also note that you have combined Stan and brms syntax here. The ‘model’ block is only used when writing pure Stan code, so you should drop ‘model’ and the curly brackets.
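For reference, a cleaned-up version of the original call (the same formula, priors, and sampler settings, just without the Stan-style ‘model’ wrapper) would look something like:

```r
library(brms)

# Partial-pooling model: treatment intercepts drawn from a common
# normal distribution with an estimated sd
fit <- brm(DAS ~ (1 | TRAT),
           data = d, family = gaussian(),
           prior = c(prior(normal(0, 10), class = Intercept),
                     prior(cauchy(0, 1), class = sd)),
           iter = 4000, warmup = 1000, chains = 4, cores = 4,
           seed = 12)
```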

If I use the treatment as a random effect, I would lose the variation that actually interests me! So I have to use it as a fixed factor, then. Everything makes more sense now. Thank you very much! Now my first question seems so silly! hahaha That’s good, it means that I’m learning. Thank you!

I totally agree, @simonbrauer. More and more, I think we should be regularising our estimates by default, even effects that would traditionally be treated as ‘fixed’ and unpooled. Partially pooling effects of interest (by estimating the standard deviation of effects) is a pretty decent approach but only really appropriate for factors with more than two or three levels. And when there are only two or three levels, we can use a regularising prior (choosing a conservative-ish standard deviation for the effects, or fancy horseshoes, etc.). Like you say, it’s the conservative (i.e., downwardly biased) option and, given the reproducibility crisis, we could probably do with a little more conservatism!
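As a sketch of what I mean by a regularising prior on unpooled effects (again assuming the `d`, `DAS`, and `TRAT` objects from the original question; the sd of 1 is just an illustration, and a sensible value depends on the scale of DAS):

```r
library(brms)

# Unpooled ('fixed') treatment effects, but with a conservative-ish
# normal(0, 1) prior on the coefficients to regularise them
fit_reg <- brm(DAS ~ TRAT,
               data = d, family = gaussian(),
               prior = prior(normal(0, 1), class = b),
               chains = 4, cores = 4)
```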


Glad this was all helpful.

I would still recommend you try the model both ways, compare the estimates (as in code below), and think about what the random effects model is doing (even if you end up using the fixed effects model for simplicity and ease of understanding). Though you are more likely to find a statistically significant comparison with the fixed effects model, you are also more likely to make an error when doing so. The random effects model shrinks the estimates based on the sample size, variation in group means, and residual variance. So it offers some protection against those sorts of errors.

You might also take a look at Chapter 17 of Lambert’s “A Student’s Guide to Bayesian Statistics” or Chapter 12 (in first edition, at least) of McElreath’s “Statistical Rethinking”. Both cover the random effects/partial pooling models in a fairly approachable way.


library(brms)
set.seed(1)  # for reproducible simulated data

# Set up the data
group_means <- c(-2, -1, 0, 1, 2)
num_groups <- length(group_means)
obs_per_group <- 10
the_data <- data.frame(X = rep(1:num_groups, each = obs_per_group))
the_data$Y_fitted <- group_means[the_data$X]
the_data$Y <- the_data$Y_fitted + rnorm(nrow(the_data), 0, 2)
the_data$X <- factor(the_data$X)

fitted_df <- data.frame(X = factor(1:num_groups),
                        mean = group_means)

# Estimate the models
fixed_model <- brm(Y ~ X, data = the_data)
random_model <- brm(Y ~ (1 | X), data = the_data)

# Examine the group predictions
fitted(fixed_model, newdata = fitted_df)
fitted(random_model, newdata = fitted_df)

# Examine the pairwise differences (Group 1 vs Group 5)
fitted_fixed <- as.data.frame(fitted(fixed_model, newdata = fitted_df, summary = FALSE))
fitted_random <- as.data.frame(fitted(random_model, newdata = fitted_df, summary = FALSE))
quantile(fitted_fixed[,1] - fitted_fixed[,5], c(0.025, 0.5, 0.975))
quantile(fitted_random[,1] - fitted_random[,5], c(0.025, 0.5, 0.975))

For reference, Kruschke recommended the hierarchical Bayesian ANOVA approach in chapter 19 of his text (here). I’ve worked through his JAGS code with brms here.


Hey, Simon! Thanks for the code. After a few days I ended up building a more complex model with more species and another categorical variable as a random effect.

Thank you so much for the references! Great material
