Hi,

I have a process where there is always a resampling (from the same conditions) and would like to perform power analysis using the following hierarchical model, ` resp ~ 1 + (1|r1_lvl)`

where r1_lvl has only two groups (1st and 2nd sample)

So I am trying to expand this example. At the end I’m comparing my results with the `power.t.test`

results to see the value of hierarchical modelling in this case.

When I have a a total size of 9 individuals

- I get power of >99% using
`power.t.test`

but - with the hierarchical model the power is 6% even when the group-sd is extremely small.

Here is the code,

```
#Function to generate data with suitable structure for hierarchical model
dat_fun <- function(seed,
nr_of_groups = 1000,
units_per_resample = c(nr1 = 6, nr2 = 3),
mu = 0, # true mean
sd_r1 = 0.5, # reproducibility error
sd_r2 = 1.6 # repeatability error
)
{
set.seed(seed)
r1_lvl <- rep(LETTERS[1:nr_of_groups], times = units_per_resample)
r1_eff <- rnorm(nr_of_groups, 0, sd_r1)
r1_eff2 <- rep(r1_eff, times = units_per_resample)
r2_eff <- rnorm(units_per_resample %>% sum, 0, sd_r2)
dat_f <- data.frame(r1_lvl, r1_eff2, r2_eff)
dat_f$resp <- with(dat_f, mu + r1_eff2 + r2_eff )
return(dat_f %>% as_tibble)
}
```

The base model for the power analysis

```
prior1t <- c(prior(normal(0, 5), class = 'Intercept'),
prior(exponential(.1), class = 'sd'),
prior(exponential(.1), class = 'sigma'),
prior("exponential(1.0/29)", class = 'nu'))
#As it will be fit several times I mute the automatic messages
#I'm using the student family as I would like something like Kruschke’s BEST
set.seed(9017536)
reference_2t <- brm(resp ~ 1 + (1|r1_lvl),
data = dat_fun(1, nr_of_groups = 2, units_per_resample = c(6, 3)),
family=student,
prior = prior1t,
chains = 3, cores = 3,
silent = 2,
refresh = 0)
```

```
# Simulate multiple samples and
# Re-fit new samples
# The first part is the same to the dat_fun() then
# there is the part to update the model and get the estimates with the Q2.5 and Q97.5
sim_d_and_fit <- function(seed,
nr_of_groups = NULL,
units_per_resample = NULL,
mu = NULL, # true mean
reprod = NULL, # reproducibility error
repeatab = NULL, # repeatability error
model = NULL
)
{
set.seed(seed)
r1_lvl <- rep(LETTERS[1:nr_of_groups], times = units_per_resample)
r1_eff <- rnorm(nr_of_groups, 0, reprod)
r1_eff2 <- rep(r1_eff, times = units_per_resample)
r2_eff <- rnorm(units_per_resample %>% sum, 0, repeatab)
dat_f <- data.frame(r1_lvl, r1_eff2, r2_eff)
dat_f$resp <- with(dat_f, mu + r1_eff2 + r2_eff )
updated_fit <- update(model, newdata = dat_f)
tb_ft <- updated_fit %>% post_sample %>% as.data.frame %>%
rownames_to_column(var = 'param') %>%
filter(!grepl('lp', param))
return(tb_ft)
}
# Function to get the posterior samples from the brm model - part of the above fun
post_sample <- function(x, ...)(posterior_summary(as.data.frame(x), ...))
```

The `map`

function will rerun the above `sim_d_and_fit`

```
# Function to apply multiple times (n_sim) the sim_d_and_fit() function
multiple_fits_fun <- function(x, n_sim = 5, ...){
tibble(seed = sample(1:1e9, n_sim)) %>%
mutate(b1 = map(seed, sim_d_and_fit, ...)) %>%
unnest(b1) %>% return
}
```

Tibble with the arguments I want to run the power analysis

Here I have,

- two different sample sizes

A. total of 9 samples (1st sample 6 & resample 3)

B. total of 18 samples (1st sample 6 & 2nd sample 12) - two reproducibility errors - 0.9 and 0.1 (which should have minimal effect)
- tested in 1000 simulated dataset each

```
testing_space <- tibble(
n_sim = 2) %>%
expand_grid(units_per_resample = list(c(6, 3), c(6, 12)),
reprod = c(0.9, 0.1),
repeatab = 0.9,
mu = 1.87,
nr_of_groups = 2) %>%
mutate(case_nr = row_number())
testing_space %>% as.data.frame
```

For each row of the above tibble I fit the `multiple_fits_fun`

```
testing_space_t <- testing_space %>% rowwise %>%
mutate(
multiple_fits_fun(n_sim = n_sim,
mu = mu,
reprod = reprod, # reproducibility error
repeatab = repeatab,
units_per_resample = units_per_resample,
nr_of_groups = nr_of_groups,
model = reference_2t) %>% # Mind the model
nest(dat = everything())) %>%
unnest('dat') %>%
mutate(units_per_resample = as.character(units_per_resample))
```

Now I estimate and plot the power with the following

```
sim <- testing_space_t
sim_pwr <- sim %>% filter(param == 'b_Intercept') %>%
mutate(check = ifelse(Q2.5 > 0 | Q97.5 <0 , 1, 0)) %>%
group_by(case_nr, units_per_resample, reprod, repeatab, mu) %>%
summarise(power = mean(check),
reprod = unique(reprod), repeatab = unique(repeatab),
units_per_resample = unique(units_per_resample))
sim_pwr %>%
ggplot(aes(x = case_nr, y = power, colour = reprod )) +
geom_line() +
geom_point() +
facet_wrap(~units_per_resample, scales = 'free_x') +
ggtitle('Weakly informative prior')
sim_pwr
```

And the results after 1000 runs per case are below.

```
case_nr units_per_resample reprod repeatab mu power
<int> <chr> <dbl> <dbl> <dbl> <dbl>
1 1 c(6, 3) 0.9 0.875 1.87 0.062
2 2 c(6, 3) 0.1 0.875 1.87 0.068
3 3 c(6, 12) 0.9 0.875 1.87 0.144
4 4 c(6, 12) 0.1 0.875 1.87 0.32
```

If we compare with a simple power analysis of 9 samples we get

```
power.t.test(
n = 9,
sd = 0.9,
delta = 1.87,
alternative = "two.sided",
type = "one.sample",
sig.level = 0.05)
One-sample t test power calculation
n = 9
delta = 1.87
sd = 0.9
sig.level = 0.05
power = 0.9996893
alternative = two.sided
```

How is possible to get so big differences in the outcome of the two methods (6% vs >99%)?