# Power analysis on hierarchical model

Hi,

I have a process where there is always a resampling (from the same conditions) and would like to perform power analysis using the following hierarchical model, ` resp ~ 1 + (1|r1_lvl)` where r1_lvl has only two groups (1st and 2nd sample)

So I am trying to expand this example. At the end I’m comparing my results with the `power.t.test` results to see the value of hierarchical modelling in this case.

When I have a a total size of 9 individuals

• I get power of >99% using `power.t.test` but
• with the hierarchical model the power is 6% even when the group-sd is extremely small.

Here is the code,

``````#Function to generate data with suitable structure for hierarchical model

dat_fun <- function(seed,
nr_of_groups = 1000,
units_per_resample = c(nr1 = 6, nr2 = 3),
mu = 0, # true mean
sd_r1 = 0.5, # reproducibility error
sd_r2 = 1.6  # repeatability error
)
{
set.seed(seed)
r1_lvl <- rep(LETTERS[1:nr_of_groups], times = units_per_resample)

r1_eff <- rnorm(nr_of_groups, 0, sd_r1)
r1_eff2 <- rep(r1_eff, times = units_per_resample)
r2_eff <- rnorm(units_per_resample %>% sum, 0, sd_r2)
dat_f <- data.frame(r1_lvl, r1_eff2, r2_eff)
dat_f\$resp <- with(dat_f, mu + r1_eff2 + r2_eff )
return(dat_f %>% as_tibble)
}
``````

The base model for the power analysis

``````prior1t <- c(prior(normal(0, 5), class = 'Intercept'),
prior(exponential(.1), class = 'sd'),
prior(exponential(.1), class = 'sigma'),
prior("exponential(1.0/29)", class = 'nu'))

#As it will be fit several times I mute the automatic messages
#I'm using the student family as I would like something like Kruschke’s BEST
set.seed(9017536)
reference_2t <- brm(resp ~ 1 + (1|r1_lvl),
data = dat_fun(1, nr_of_groups = 2, units_per_resample = c(6, 3)),
family=student,
prior = prior1t,
chains = 3, cores = 3,
silent = 2,
refresh = 0)
``````
``````# Simulate multiple samples and
# Re-fit new samples
# The first part is the same to the dat_fun() then
# there is the part to update the model and get the estimates with the Q2.5 and Q97.5

sim_d_and_fit <- function(seed,
nr_of_groups = NULL,
units_per_resample = NULL,
mu = NULL,       # true mean
reprod = NULL,   # reproducibility error
repeatab = NULL, # repeatability error
model = NULL
)
{
set.seed(seed)
r1_lvl <- rep(LETTERS[1:nr_of_groups], times = units_per_resample)

r1_eff <- rnorm(nr_of_groups, 0, reprod)
r1_eff2 <- rep(r1_eff, times = units_per_resample)
r2_eff <- rnorm(units_per_resample %>% sum, 0, repeatab)
dat_f <- data.frame(r1_lvl, r1_eff2, r2_eff)
dat_f\$resp <- with(dat_f, mu + r1_eff2 + r2_eff )

updated_fit <- update(model, newdata = dat_f)
tb_ft <- updated_fit %>% post_sample %>% as.data.frame %>%
rownames_to_column(var = 'param') %>%
filter(!grepl('lp', param))

return(tb_ft)
}

# Function to get the posterior samples from the brm model - part of the above fun
post_sample <- function(x, ...)(posterior_summary(as.data.frame(x), ...))
``````

The `map` function will rerun the above `sim_d_and_fit`

``````# Function to apply multiple times (n_sim) the sim_d_and_fit() function
multiple_fits_fun <- function(x, n_sim = 5, ...){
tibble(seed = sample(1:1e9, n_sim)) %>%
mutate(b1 = map(seed, sim_d_and_fit, ...)) %>%
unnest(b1) %>% return
}
``````

Tibble with the arguments I want to run the power analysis
Here I have,

• two different sample sizes
A. total of 9 samples (1st sample 6 & resample 3)
B. total of 18 samples (1st sample 6 & 2nd sample 12)
• two reproducibility errors - 0.9 and 0.1 (which should have minimal effect)
• tested in 1000 simulated dataset each
``````
testing_space <- tibble(
n_sim = 2) %>%
expand_grid(units_per_resample = list(c(6, 3), c(6, 12)),
reprod = c(0.9, 0.1),
repeatab = 0.9,
mu =  1.87,
nr_of_groups = 2) %>%
mutate(case_nr = row_number())
testing_space %>% as.data.frame
``````

For each row of the above tibble I fit the `multiple_fits_fun`

``````testing_space_t <- testing_space %>% rowwise %>%
mutate(
multiple_fits_fun(n_sim = n_sim,
mu = mu,
reprod = reprod, # reproducibility error
repeatab = repeatab,
units_per_resample = units_per_resample,
nr_of_groups = nr_of_groups,
model = reference_2t) %>% # Mind the model
nest(dat = everything())) %>%
unnest('dat') %>%
mutate(units_per_resample = as.character(units_per_resample))
``````

Now I estimate and plot the power with the following

``````sim <- testing_space_t

sim_pwr <-  sim %>% filter(param == 'b_Intercept') %>%
mutate(check = ifelse(Q2.5 > 0 | Q97.5 <0 , 1, 0)) %>%
group_by(case_nr, units_per_resample, reprod, repeatab, mu) %>%
summarise(power =  mean(check),
reprod = unique(reprod), repeatab = unique(repeatab),
units_per_resample = unique(units_per_resample))

sim_pwr %>%
ggplot(aes(x = case_nr, y = power, colour = reprod )) +
geom_line() +
geom_point() +
facet_wrap(~units_per_resample, scales = 'free_x') +
ggtitle('Weakly informative prior')

sim_pwr

``````

And the results after 1000 runs per case are below.

``````  case_nr units_per_resample reprod repeatab    mu power
<int> <chr>               <dbl>    <dbl> <dbl> <dbl>
1       1 c(6, 3)               0.9    0.875  1.87 0.062
2       2 c(6, 3)               0.1    0.875  1.87 0.068
3       3 c(6, 12)              0.9    0.875  1.87 0.144
4       4 c(6, 12)              0.1    0.875  1.87 0.32
``````

If we compare with a simple power analysis of 9 samples we get

``````power.t.test(
n = 9,
sd =  0.9,
delta =  1.87,
alternative = "two.sided",
type = "one.sample",
sig.level = 0.05)

One-sample t test power calculation

n = 9
delta = 1.87
sd = 0.9
sig.level = 0.05
power = 0.9996893
alternative = two.sided

``````

How is possible to get so big differences in the outcome of the two methods (6% vs >99%)?

I don’t have time right now to understand the simulation, but note that the t-test that you compute in the frequntist approach assumes a gaussian generative model (the generative model is gaussian; the thing that is t-distributed is the test statistic), whereas the model that you fit in `brms` assumes a t-distributed generative model.

1 Like