I need to justify my choice of sample size, so I ran a design/power analysis: I simulated datasets with the expected effects and calculated the power with which those effects would be detected across, say, 1,000 simulations. However, I am unsure whether I should enter any priors into my model, and if so, which ones. I first thought about using the expected effects as priors, but the resulting distributions of effects are very narrow and I get a power of 100%, which seems very unlikely. I am therefore a bit afraid that centering the priors on the exact effects present in the dataset is not the optimal solution. So my question is whether I should enter any prior assumptions into the model at all, or whether I should use uninformative priors for the simulations. (For the planned study, I aim to use priors based on these expected effects.) I would be very thankful for any help.
If, in essence, you’re doing a parameter recovery analysis where you vary the number of data points to see where the recovery fails, I think you want to use priors that match the distributions from which the parameters (the ones your simulated data are generated from) are randomly drawn. I think you also want to make sure that, for the number of parameters you’re using (i.e. the number of parameters Stan returns on each successful step of MCMC), you’re generating the required number of effective samples to achieve your desired sensitivity and tolerance (probably 0.05 for both). You can calculate that using an approximate mESS formula like the one found here and compare that to the ESS of your samples.
I am not formally trained in Bayesian statistics, however, and would like to see what other people with more experience think.
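The parameter-recovery idea described above can be sketched in a few lines. This is only a minimal illustration, not the poster's actual model: it assumes a single effect `theta` with a normal prior, normally distributed data with known noise `sigma`, and a closed-form conjugate posterior in place of a Stan fit. The function names and the default values (`prior_mean=0.3`, `prior_sd=0.2`) are made up for the example.

```python
import math
import random

def posterior(ybar, n, sigma, prior_mean, prior_sd):
    """Conjugate normal posterior for a mean with known noise sd `sigma`."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = n / sigma ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * ybar)
    return post_mean, math.sqrt(post_var)

def recovery(n, n_sims=2000, prior_mean=0.3, prior_sd=0.2, sigma=1.0, seed=1):
    """Return (coverage of the 95% interval, average interval width) for sample size n."""
    rng = random.Random(seed)
    covered = 0
    total_width = 0.0
    for _ in range(n_sims):
        # Draw the true effect from the same prior that the analysis uses.
        theta = rng.gauss(prior_mean, prior_sd)
        # The mean of n observations is distributed N(theta, sigma^2 / n),
        # so it is simulated directly instead of generating n data points.
        ybar = rng.gauss(theta, sigma / math.sqrt(n))
        m, s = posterior(ybar, n, sigma, prior_mean, prior_sd)
        lo, hi = m - 1.96 * s, m + 1.96 * s
        covered += lo <= theta <= hi
        total_width += hi - lo
    return covered / n_sims, total_width / n_sims

if __name__ == "__main__":
    for n in (10, 50, 200):
        coverage, width = recovery(n)
        print(f"n={n:4d}  coverage={coverage:.3f}  mean 95% width={width:.3f}")
```

When the analysis prior matches the distribution the true effects are drawn from, the coverage should stay near 95% at every sample size, so what changes with `n` is the interval width. A target width (precision) is then one possible criterion for choosing the sample size.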
What does it mean to “detect an existing effect” in a Bayesian setting? This sounds like you’re trying to do some kind of frequentist hypothesis testing.
This has come up before, and last time, Andrew Gelman recommended a chapter of his book with Jennifer Hill. Here’s a blog post from Andrew that explains why he doesn’t like traditional power analyses, along with some constructive suggestions about what you can do.
In Bayesian analyses, it’s always best to use the most informative prior you can justify with your prior knowledge. Sometimes you can get away without doing this when the data is informative enough by itself. And you always want to use all your prior information to simulate, even in a frequentist power calculation.
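To illustrate why a very narrow prior centered on the expected effect can push a simulated "power" to 100%, here is a rough sketch. It makes several assumptions that are not from the thread: a single effect with normally distributed data and known noise `sigma`, a conjugate normal posterior instead of a Stan fit, "detection" defined as the 95% posterior interval excluding zero, and invented numbers for the design prior (the distribution used to simulate true effects) and the two analysis priors.

```python
import math
import random

def interval_excludes_zero(ybar, n, sigma, prior_mean, prior_sd):
    """True if the 95% posterior interval for the effect excludes zero."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = n / sigma ** 2
    post_sd = math.sqrt(1.0 / (prior_prec + data_prec))
    post_mean = (prior_prec * prior_mean + data_prec * ybar) / (prior_prec + data_prec)
    return post_mean - 1.96 * post_sd > 0 or post_mean + 1.96 * post_sd < 0

def detection_rate(n, analysis_prior, design_prior=(0.3, 0.1),
                   sigma=1.0, n_sims=1000, seed=1):
    """Share of simulated studies whose 95% interval excludes zero.

    design_prior   = (mean, sd) used to draw the true effect (prior knowledge)
    analysis_prior = (mean, sd) used when fitting each simulated dataset
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        theta = rng.gauss(*design_prior)
        # Sample mean of n observations, simulated from its sampling distribution.
        ybar = rng.gauss(theta, sigma / math.sqrt(n))
        hits += interval_excludes_zero(ybar, n, sigma, *analysis_prior)
    return hits / n_sims

if __name__ == "__main__":
    n = 20
    tight = detection_rate(n, analysis_prior=(0.3, 0.01))  # near-point prior at the expected effect
    weak = detection_rate(n, analysis_prior=(0.0, 10.0))   # weakly informative prior
    print(f"tight prior: {tight:.2f}, weak prior: {weak:.2f}")
```

With the near-point prior, the posterior barely moves from the prior whatever the data say, so the interval excludes zero in essentially every simulation and the result says little about the sample size. A prior whose width reflects the actual uncertainty about the effect, used both to simulate and to fit, avoids this.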