I don’t think that your general procedure for thinking about priors is advisable (if I understand it correctly).

First, it is a problem to allow your data to inform your prior. This issue is actually really subtle (it *is* technically ok if your data prompt you to reexamine your domain expertise and lead you to realize that your previous prior mischaracterized that expertise). But the procedure that you describe sounds problematic. I think I see the underlying logic: in an intercept-only model, the distribution of the data provides a very weak guess about the location of the mean. But this doesn’t actually stand up to scrutiny. The idea that the distribution of the data provides a weak guess about the location of the mean already presupposes a sufficiently flat prior, and then allows a “first look” at the data to tighten that initial flat prior. If your “initial prior” (aka “the prior”) is truly close to flat, you should just use that as your prior in the analysis without tightening it around the sample mean.

Second, the logic breaks down even further when you’re picking a prior for the regression coefficients rather than the intercept of an intercept-only model. I think a toy example is the best way to see this. Imagine you’re fitting a linear regression of the form y = a + bx + Normal(0, sigma). Imagine that in reality x has no effect on y, and this lack-of-effect is consistent with our domain expertise. Imagine further that the distribution of y in our data is approximately normal, centered at mean(y) = 100 and with standard deviation sd(y) = 1.

If I understand your procedure correctly, you are suggesting putting a prior of Normal(100, 1) not only on a but also on b, which implies that you are as certain as can be that x has a huge positive impact on y: the true value b = 0 sits 100 prior standard deviations away from the prior mean.
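To make the toy example concrete, here is a minimal simulation sketch in Python with NumPy. The sample size, the seed, and the standard-normal distribution for x are my own assumptions for illustration; only the Normal(100, 1) shape of y and the absence of an x effect come from the example above.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

n = 200                             # hypothetical sample size
x = rng.normal(0.0, 1.0, size=n)    # predictor (assumed standard normal)
y = rng.normal(100.0, 1.0, size=n)  # y ~ Normal(100, 1); x has no effect

# Ordinary least squares fit of y = a + b * x.
X = np.column_stack([np.ones(n), x])
a_hat, b_hat = np.linalg.lstsq(X, y, rcond=None)[0]

# The data-derived prior from the procedure under discussion.
prior_mean = y.mean()
prior_sd = y.std(ddof=1)

# Distance, in prior standard deviations, from the prior mean to the true slope 0.
z_true_slope = (0.0 - prior_mean) / prior_sd

print(f"OLS intercept: {a_hat:.2f}, OLS slope: {b_hat:.2f}")
print(f"Data-derived prior on b: Normal({prior_mean:.1f}, {prior_sd:.2f})")
print(f"True slope 0 is {abs(z_true_slope):.0f} prior SDs from the prior mean")
```

The fitted slope should land near zero, while the data-derived prior centers b near 100 with a standard deviation near 1. That prior is reasonable for the intercept by accident of scale, and wildly wrong for the slope.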

The main advice that you’ll find on this forum regarding choice of priors is to do your best to choose priors that are consistent with your domain expertise. This doesn’t mean you have to write down priors that precisely encode your current state of belief. If you want weak priors that are consistent with your domain knowledge, choose them so that they cover the full range of values that are remotely conceivable to you, that is, the range outside of which you would assume that something had gone catastrophically wrong with your data collection. If you really don’t want to do this, you can use truly flat (improper) priors, but this is a generally unpopular choice among Stan users (and some have convincingly argued that these flat priors are NOT non-informative). But what you must not do is use priors that are directly informed by the data you are modeling.
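One common way to check whether a set of weak priors covers the "remotely conceivable" range is to simulate outcomes from the priors alone, before touching the data. The sketch below does this in Python with NumPy for the regression y = a + bx + Normal(0, sigma). Every number in it (the prior scales, the half-normal prior on sigma, the grid of x values) is a hypothetical stand-in for domain knowledge, not a recommendation.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed
n_draws = 4000

# Hypothetical priors chosen from (assumed) domain knowledge, not from the data.
a = rng.normal(100.0, 30.0, size=n_draws)            # intercept
b = rng.normal(0.0, 5.0, size=n_draws)               # slope
sigma = np.abs(rng.normal(0.0, 10.0, size=n_draws))  # half-normal noise scale

# Hypothetical predictor values at which to simulate outcomes.
x_grid = np.array([-2.0, 0.0, 2.0])

# Prior predictive draws: one simulated y per (prior draw, x value) pair.
mu = a[:, None] + b[:, None] * x_grid[None, :]
y_sim = rng.normal(mu, sigma[:, None])

# Central 99% interval of the simulated outcomes.
lo, hi = np.percentile(y_sim, [0.5, 99.5])
print(f"99% of prior predictive draws fall between {lo:.0f} and {hi:.0f}")
```

If the printed interval excludes values you consider plausible, the priors are too tight. If it is dominated by values you would attribute to a broken data pipeline, they are wider than your domain knowledge warrants. Either way, the adjustment comes from what you know about the problem, not from the sample you are about to model.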