Hi everyone,
I’m struggling to compute meaningful Bayes Factors or model comparison metrics for a hierarchical model, possibly because my background is more frequentist. Here’s my problem:
- I have repeated measurements per individual and fit a hierarchical model where:
  - Each individual has a parameter \theta_i
  - These \theta_i depend on group-level parameters \nu and \sigma
- I estimate the posteriors for \nu, \sigma, and all \theta_i using a Gibbs sampler.
- A covariate suggests there might be two subpopulations, but I’m unsure.
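For concreteness, here is a minimal sketch of a Gibbs sampler for the single-group hierarchy. It assumes a known observation SD `s_y`, a flat prior on \nu, and an Inv-Gamma prior on \sigma^2; all of these choices (and the simulated data) are illustrative assumptions, not necessarily the poster's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- simulate repeated measurements for m individuals (illustrative data) ---
m, n_obs, s_y = 30, 5, 1.0           # individuals, obs per individual, obs SD
true_nu, true_sigma = 2.0, 1.5
theta_true = rng.normal(true_nu, true_sigma, m)
y = rng.normal(theta_true[:, None], s_y, (m, n_obs))
ybar = y.mean(axis=1)

def gibbs(ybar, n_obs, s_y, n_iter=4000, a0=2.0, b0=2.0):
    """Gibbs sampler for theta_i ~ N(nu, sigma^2), flat prior on nu,
    Inv-Gamma(a0, b0) prior on sigma^2 (assumed hyperpriors)."""
    m = len(ybar)
    nu, sig2 = ybar.mean(), ybar.var()
    draws = {"nu": [], "sigma": []}
    for _ in range(n_iter):
        # theta_i | rest: precision-weighted blend of data and group prior
        prec = n_obs / s_y**2 + 1.0 / sig2
        mean = (n_obs * ybar / s_y**2 + nu / sig2) / prec
        theta = rng.normal(mean, np.sqrt(1.0 / prec))
        # nu | rest: normal with variance sigma^2 / m (flat prior on nu)
        nu = rng.normal(theta.mean(), np.sqrt(sig2 / m))
        # sigma^2 | rest: conjugate inverse-gamma update
        sig2 = 1.0 / rng.gamma(a0 + m / 2,
                               1.0 / (b0 + 0.5 * np.sum((theta - nu) ** 2)))
        draws["nu"].append(nu)
        draws["sigma"].append(np.sqrt(sig2))
    # drop the first half as burn-in
    return {k: np.array(v[n_iter // 2:]) for k, v in draws.items()}

post = gibbs(ybar, n_obs, s_y)
print(post["nu"].mean(), post["sigma"].mean())
```

With well-separated hyperpriors like these, the posterior means of \nu and \sigma should land near the simulating values.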
Goal
I want to determine whether:
- Model 1: A single group, \theta_i \sim \mathcal{N}(\nu, \sigma^2), for all individuals.
- Model 2: Two separate groups (each with their own \nu_k, \sigma_k, k=1,2).
is better supported by the data.
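To see why comparing the models at the group level can be informative, here is a toy plug-in comparison of the two structures on individual-level estimates \hat\theta_i. The covariate-based labels `z`, the well-separated group means, and the MLE plug-in (rather than a full marginal likelihood) are all illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
z = np.repeat([0, 1], 10)                        # covariate-based group labels
theta_hat = np.where(z == 0, -2.0, 2.0) + rng.normal(0, 0.1, 20)

# Model 1: one N(nu, sigma^2) for everyone (MLE plug-in)
ll1 = norm.logpdf(theta_hat, theta_hat.mean(), theta_hat.std()).sum()

# Model 2: a separate N(nu_k, sigma_k^2) per covariate group
ll2 = sum(norm.logpdf(theta_hat[z == k], theta_hat[z == k].mean(),
                      theta_hat[z == k].std()).sum() for k in (0, 1))
print(ll1 < ll2)  # prints True: the two-group structure fits the theta's better
```

The single-group model must inflate \sigma to cover both clusters, which is exactly the behaviour described in the LOO problem below: that inflated \sigma shows up as flexibility at the individual level rather than as a penalty.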
Problem with LOO
- Model 1 (single group) estimates a high \sigma, so the \theta_i are barely shrunk and fit each individual's data closely → high pointwise likelihood p(y_i \mid \theta_i).
- Model 2 (two groups) estimates a low \sigma_k within each group → the \theta_i are shrunk more strongly toward their group mean → lower pointwise likelihood.
- Issue: my LOO is computed from the conditional likelihood p(y_i \mid \theta_i) and never sees the group-level term p(\theta_i \mid \nu, \sigma). This rewards the flexible Model 1 even when Model 2 is theoretically more appropriate.
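One commonly suggested remedy is to compute LOO on the *integrated* (marginal) likelihood, with \theta_i marginalized out, which for the normal-normal case is analytic: y_i \sim \mathcal{N}(\nu, \sigma^2 + s_y^2). A tiny numeric illustration of the gap, with entirely made-up values:

```python
import numpy as np
from scipy.stats import norm

y_i, s_y = 3.0, 1.0          # one observation and its SD (made-up values)
nu, sigma = 0.0, 2.0         # group-level parameters (made-up values)
theta_i = 2.9                # a theta_i shrunk only slightly toward nu

cond_ll = norm.logpdf(y_i, theta_i, s_y)                     # what LOO sees
marg_ll = norm.logpdf(y_i, nu, np.sqrt(sigma**2 + s_y**2))   # theta integrated out
print(cond_ll > marg_ll)     # prints True: the conditional fit looks far better
```

The conditional likelihood credits the model for letting \theta_i chase y_i, while the marginal likelihood charges it for the large \sigma needed to do so, so the two can rank the models differently.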
Questions
- Is there a better way to compare these models? (e.g., Bayes Factors, WAIC, or a different parametrization?)
- Should I modify Model 2 to account for this shrinkage effect?
- Are there alternatives to LOO for hierarchical models that penalize overfitting more effectively?
Any advice or references would be greatly appreciated!