I am interested in how to quantify reductions in uncertainty about the size of an experimental effect over a series of studies which, for hypothetical reasons, preclude the merging of data. I would like to use the posterior parameter estimates from each study to inform the priors of the next study. The model is a simple bayesian regression testing the difference between two groups at a single time point. The model equation is
\begin{align*}
y_i = \alpha + \beta_{B}x_{Bi} + \varepsilon_i
\end{align*}
where y_i is the score on the outcome for participant i, \alpha is the intercept â score in Group A â \beta_{B} is the difference in score between Group B and Group A, x_{Bi} is a binary {0,1} inclusion variable indicating whether participant i was in Group B and \varepsilon_i is the error term.
This model was repeated across three studies. For this example Iâll focus on the \beta_B parameter â the difference in score between Group A and Group B â but the same approach could be applied to the \alpha parameter as well.
Quantifying Incremental Reductions in Uncertainty
Study 1
The prior on \beta_{B} in the first Study was straightforward. Using a strategy employed by Kruschke I used the standard deviation of the outcome y_i to scale the prior distribution on \beta_B. The prior for \beta_B was a normal distribution centred on 0 (i.e. no difference between Group A and Group B), and with a standard deviation five times the standard deviation of post-beverage withdrawal scores. From here on Iâm going to refer to the amount we multiply the standard deviation entered into the prior distribution as a multiplication factor. I chose 5 as the multiplcation factor because I knew it would generate a very wide prior distribution. The standard deviation of scores in Study 1 was 12.5, so the standard deviation for the prior distribution on \beta_B was 5 x 12.5 = 62.5, in other words a very wide, weakly regularising prior, reflecting absence of any prior knowledge about the Group A/Group B difference.
Study 2
Thanks to the results of Study 1, in Study 2 I knew I a little bit more about what Group A/Group B difference to expect. In fact I used the modal estimates for \beta_{B} (6.8) and \varepsilon (10.7) to help me set the prior distribution for \beta_{B} in Study 2. The centre of the prior distribution was easy: I simply centred it on 6.8. The tricky part, and the subject of this post, is what to multiply the estimated standard deviation from Study 2 by to specify the spread of the \beta_{B} prior distribution in Study 2. I created a function to generate the multiplication factors for the spread of priors on \beta_B across the three experimentsâŚ
sapply(0:2, function (i) 5*1/5^(i))
[1] 5.00 1.0 0.2
âŚwhere each new value in the vector is one fifth of the previous value. The function dictated that the second multiplication factor was 5 x 1/5^1 = 1. Therefore the standard deviation of the prior distribution on \beta_B in Study 2 would be 1 x the estimated standard deviation from Study 1 = 1 x 10.7 = 10.7. This prior is still quite wide, but is much less so than the prior on \beta_B in Study 1.
Study 3
In Study 3, I used the same approach as for Study 2: the normal prior distribution on \beta_B in Study 3 is centred on the modal estimate of \beta_B in Study 2 â 7.3 --, and the spread is determined partially by the modal estimated standard deviation of y_i in Study 2 â 14.2 â and partially by the function for the multiplication factor. The multiplication factor in this, the third iteration of the model, was 5 x 1/5^2 = 0.1. Therefore the standard deviation of the prior distribution on \beta_B in Study 3 is 14.2 x 0.1 = 1.4. As you can see in the bottom panel of the figure this is results in a much more precise prior, reflecting the further reduction in uncertainty about the Group A/Group B difference in what is now the third iteration of the experiment.
Here is the figure with the prior and posterior distributions for the \beta_B parameter across all three studies. The green density function is the prior and the pink is the posterior. The prior distribution on \beta_B, almost invisible in Study 1, becomes more narrow over time, reflecting a reduction in uncertainty about the Group A/Group B difference.
My question relates to the function that dictates the reduction in the multiplication factor across studies, essentially that quantifies the reduction in uncertainty. The first function resulted in multiplication factors of 5, 1, 0.2. A function with a steeper reduction in uncertainty (each new iteration one tenth of the previous, starting at 5)âŚ
sapply(0:2, function (i) 5*1/10^(i))
[1] 5.00 0.50 0.05
âŚresults in the following set of estimates.
This function results in a much narrower parameter estimate for \beta_B by the time we get to Study 3, and the prior has âdraggedâ the modal posterior estimate downwards (compared to the Study 3 panel in the first figure). So this seems too certain.
So how do I quantify a reduction in uncertainty mathematically, in an ethical way?
And are there any functions that are better or more sensible than the two I devised?

