Hello. This is more of a conceptual question than a coding one. (Please advise me to switch in case this is not the appropriate category.)

I’m modeling recruitment curves using a hierarchical Bayesian model. There is a key parameter in my recruitment curve; let’s call it P. I have two groups of participants (A and B) of sizes N_A and N_B respectively.

After fitting the recruitment curves, I get posterior estimates for the parameter P: P_A of shape (1000, N_A) for group A and P_B of shape (1000, N_B) for group B, where 1000 is the number of post-warmup posterior samples I am collecting.

Now, I want to test the hypothesis that the mean of parameter P for group A is less than that of B.

To test this, I could take the MAP estimates and run a frequentist test. The problem is that MAP estimates are not very reliable for participants whose data do not pin these parameters down well. I also lose the valuable information carried by the posterior samples.
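One way to use the posterior samples directly, without a second model, is to push the comparison through the draws: for each of the 1000 draws, average P across participants within each group, then compute the fraction of draws in which the group-A mean falls below the group-B mean. A minimal NumPy sketch (the arrays and group sizes here are synthetic placeholders standing in for your actual posterior samples):

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder posteriors with the shapes described above:
# 1000 post-warmup draws, N_A and N_B participants per group.
N_A, N_B = 12, 15
P_A = rng.normal(45.0, 5.0, size=(1000, N_A))  # hypothetical group-A samples
P_B = rng.normal(55.0, 5.0, size=(1000, N_B))  # hypothetical group-B samples

# Per-draw group means: one draw from the posterior of each group's
# sample-average of participant-level parameters.
mean_A = P_A.mean(axis=1)   # shape (1000,)
mean_B = P_B.mean(axis=1)   # shape (1000,)

# Posterior probability that group A's mean is below group B's.
prob = (mean_A < mean_B).mean()
print(prob)
```

Note this compares the averages of the sampled participants, not the population means, which is what the second model below targets.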

Alternatively, I have set up a second Bayesian model in which I model the mean of these parameters:

```
mu_A, mu_B ~ Normal(50, 100)
sigma      ~ HalfCauchy(1)
P_A        ~ Normal(mu_A, sigma)
P_B        ~ Normal(mu_B, sigma)
```

(The Normal(50, 100) prior reflects where my prior knowledge says the mean lies. I put the same prior on both mu_A and mu_B, essentially saying a priori that there is no difference between them: the null model.)

Once fitted, I will reject the null hypothesis if the posterior samples satisfy Pr(mu_A < mu_B) > 0.95.
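Given a fit of that model, the decision rule is just a Monte Carlo estimate over the paired draws of mu_A and mu_B, and its Monte Carlo standard error shows how the number of posterior samples matters. A sketch, assuming the sampler returns aligned arrays of draws (the placeholder arrays here stand in for your trace):

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholders standing in for aligned posterior draws of mu_A and mu_B
# extracted from the fitted model's trace.
mu_A_draws = rng.normal(45.0, 2.0, size=1000)
mu_B_draws = rng.normal(50.0, 2.0, size=1000)

# Monte Carlo estimate of Pr(mu_A < mu_B), paired draw by draw.
prob = (mu_A_draws < mu_B_draws).mean()

# Rough Monte Carlo standard error of that estimate; for autocorrelated
# MCMC draws, substitute the effective sample size for 1000.
mc_se = np.sqrt(prob * (1.0 - prob) / 1000)

reject_null = prob > 0.95
print(prob, mc_se, reject_null)
```

The standard-error term shrinks as 1/sqrt(number of effective draws), so the procedure's sensitivity to the number of posterior samples mostly shows up when the estimated probability sits near the 0.95 threshold.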

Is this a valid technique? If so, could you please tell me whether it has been used in the literature and where I can read more about it?

One other question, possibly out of scope: how do I estimate the false-positive rate of such a testing procedure (basically, how do I know I can trust it)? Also, how sensitive is the procedure to the number of posterior samples?
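The usual way to estimate the false-positive rate is simulation-based calibration of the decision rule: repeatedly simulate data under the null (both groups share one true mean), run the procedure, and count how often it rejects. A sketch under simplifying assumptions I am introducing for speed (known sigma, so the posterior of each group mean is available in closed form via the conjugate normal update, rather than refitting the full model each time):

```python
import numpy as np

rng = np.random.default_rng(1)

def posterior_mu(y, sigma, m0=50.0, s0=100.0):
    """Conjugate normal posterior for a group mean, sigma assumed known."""
    prec = 1.0 / s0**2 + len(y) / sigma**2
    mean = (m0 / s0**2 + y.sum() / sigma**2) / prec
    return mean, np.sqrt(1.0 / prec)

n_sims, n_draws = 500, 1000
N_A, N_B, sigma = 12, 15, 5.0   # hypothetical group sizes and noise scale
rejections = 0
for _ in range(n_sims):
    # Simulate under the null: both groups share the same true mean.
    y_A = rng.normal(50.0, sigma, size=N_A)
    y_B = rng.normal(50.0, sigma, size=N_B)
    mA, sA = posterior_mu(y_A, sigma)
    mB, sB = posterior_mu(y_B, sigma)
    draws_A = rng.normal(mA, sA, size=n_draws)
    draws_B = rng.normal(mB, sB, size=n_draws)
    if (draws_A < draws_B).mean() > 0.95:
        rejections += 1

false_positive_rate = rejections / n_sims
print(false_positive_rate)
```

With a weak prior, the Pr(mu_A < mu_B) > 0.95 rule behaves much like a one-sided test at roughly the 5% level here; rerunning this with your actual model in the loop (fitting it to each simulated data set) gives the calibration of your real procedure.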