If I understand your general procedure for choosing priors correctly, I don’t think it is advisable.
First, it is a problem to allow your data to inform your prior. This issue is actually quite subtle: it is technically ok if your data prompts you to reexamine your domain expertise and leads you to realize that your previous prior mischaracterized that expertise. But the procedure that you describe sounds problematic. I think I see the underlying logic: in an intercept-only model, the distribution of the data provides a very weak guess about the location of the mean. But this doesn’t stand up to scrutiny. The idea that the distribution of the data provides a weak guess about the location of the mean already presupposes a sufficiently flat prior, and then uses a “first look” at the data to tighten that flat prior. If your “initial prior” (aka “the prior”) is truly close to flat, you should just use it as the prior in your analysis, without tightening it around the sample mean.
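To make the double use of the data concrete, here is a minimal sketch of the intercept-only case. It assumes a normal model with known sigma so that the conjugate update has a closed form; the numbers (n = 25, sample mean 100, sigma = 1) and the helper name `normal_posterior` are made up for illustration and are not from your model.

```python
import math

def normal_posterior(prior_mean, prior_sd, ybar, sigma, n):
    """Conjugate update for a normal mean when sigma is known."""
    prior_prec = 1.0 / prior_sd**2   # precision contributed by the prior
    data_prec = n / sigma**2         # precision contributed by the data
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + data_prec * ybar) / post_prec
    return post_mean, math.sqrt(1.0 / post_prec)

# Hypothetical data summary (assumed values, for illustration only).
n, ybar, sigma = 25, 100.0, 1.0

# With a flat prior the posterior for the mean is Normal(ybar, sigma / sqrt(n)).
flat_sd = sigma / math.sqrt(n)

# Prior tightened around the sample mean, then updated with the same data.
post_mean, post_sd = normal_posterior(ybar, sigma, ybar, sigma, n)

print("flat-prior posterior sd:  ", flat_sd)
print("data-derived prior, post sd:", post_sd)
```

Under these assumptions the data-derived prior acts like one extra observation sitting exactly at the sample mean, so the posterior sd is sigma / sqrt(n + 1) rather than sigma / sqrt(n). The shift is small here, but it is certainty that the data did not actually supply.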
Second, the logic breaks down even further when you’re picking a prior for the regression coefficients rather than for the intercept of an intercept-only model. I think a toy example is the best way to see this. Imagine you’re fitting a linear regression of the form y = a + b*x + e, with e ~ Normal(0, sigma). Imagine that in reality x has no effect on y, and that this lack of effect is consistent with our domain expertise. Imagine further that the distribution of y in our data is approximately normal, with mean(y) = 100 and sd(y) = 1.
If I understand your procedure correctly, you are suggesting putting a prior of Normal(100, 1) not only on a but also on b. That prior sits 100 prior standard deviations away from zero, so it says you are all but certain that x has a huge positive effect on y, which is the opposite of what we know.
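A quick way to see what that prior claims is to draw from it. The sketch below uses only the numbers from the toy example (a Normal(100, 1) prior on b and sd(y) = 1); the variable names, the seed, and the number of draws are my own choices.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

SD_Y = 1.0  # sd(y) in the toy example

# Draws from the data-derived prior b ~ Normal(100, 1).
b_draws = [random.gauss(100.0, 1.0) for _ in range(10_000)]

# Fraction of prior draws that say "x increases y".
frac_positive = sum(b > 0 for b in b_draws) / len(b_draws)

# Implied change in y per unit of x, in units of sd(y).
mean_effect = statistics.fmean(b / SD_Y for b in b_draws)

print("share of draws with b > 0:", frac_positive)
print("mean effect per unit x, in sd(y):", mean_effect)
```

Every draw is positive, and the typical draw moves y by roughly a hundred of its own standard deviations per unit of x. Nothing about the scale of y tells you that about the slope.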
The main advice that you’ll find on this forum regarding choice of priors is to do your best to choose priors that are consistent with your domain expertise. This doesn’t mean you have to write down priors that precisely encode your current state of belief. If you want weak priors that are consistent with your domain knowledge, choose them so that they cover the full range of values that are remotely conceivable to you, that is, the range outside of which you would assume that something had gone catastrophically wrong with your data collection. If you really don’t want to do this, you can use truly flat (improper) priors, but this is a generally unpopular choice among Stan users (and some have convincingly argued that these flat priors are NOT non-informative). But what you must not do is use priors that are directly informed by the data you are modeling.
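One possible way to turn a “remotely conceivable” range into a weak normal prior is sketched below. This is only an illustration, not a recommended recipe: the helper `normal_prior_from_range`, the choice to center on the midpoint, the 99% coverage, and the example range of -5 to 5 for a slope are all assumptions of mine. The key point is that the range comes from domain knowledge, not from the data being modeled.

```python
from statistics import NormalDist

def normal_prior_from_range(low, high, mass=0.99):
    """Normal prior centered on the midpoint of [low, high] that places
    `mass` of its probability inside the interval (hypothetical helper)."""
    if not low < high:
        raise ValueError("low must be smaller than high")
    if not 0.0 < mass < 1.0:
        raise ValueError("mass must be strictly between 0 and 1")
    # z such that a standard normal has `mass` probability within +/- z.
    z = NormalDist().inv_cdf(0.5 + mass / 2.0)
    center = (low + high) / 2.0
    sd = (high - low) / (2.0 * z)
    return NormalDist(center, sd)

# Assumed domain knowledge: a slope outside [-5, 5] would mean that
# something went badly wrong with data collection.
prior_b = normal_prior_from_range(-5.0, 5.0)
covered = prior_b.cdf(5.0) - prior_b.cdf(-5.0)

print("prior mean:", prior_b.mean)
print("prior sd:", prior_b.stdev)
print("mass inside the range:", covered)
```

The resulting mean and sd can then be written into the model as something like `b ~ normal(mean, sd)`. If the range itself is uncertain, widening it costs very little compared with the cost of letting the data set it.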