Standardizing predictors and outputs in a hierarchical model

True, but 1) with sensible weakly informative priors, a not very small n, and optimal computation, there is likely to be no difference, and 2) the effect of assuming something to be known can be smaller than the effect of using bad priors or non-optimal computation. If scaling makes it easier to get a more sensible prior or better computation, scaling can be better than not scaling. You can also flip this around: leave the data unscaled, scale the priors instead, and think about how the scaling affects your inference if you additionally make the prior very tight or quite wide (as in rstanarm). With a wide prior, assuming the scales of x and y to be known does not make much difference (unless n is really small). There is no universal agreement on priors, so there can’t be universal agreement on the dual problem of scaling variables.
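To make the "scale the data" versus "scale the prior" duality concrete, here is a minimal sketch in Python. It is an illustration under simplifying assumptions, not how rstanarm or Stan does it: a single centered predictor, no intercept, a known noise sd `sigma`, and a conjugate normal prior on the slope so the posterior is available in closed form. The names `posterior_slope`, `compare`, and `tau` are hypothetical helpers introduced only for this example, and the sample sd `sx` is treated as a known constant, which is exactly the assumption being discussed.

```python
import numpy as np

def posterior_slope(x, y, sigma, prior_sd):
    """Posterior mean and sd of b in y = b*x + N(0, sigma^2),
    with prior b ~ N(0, prior_sd^2) and sigma assumed known."""
    precision = np.sum(x**2) / sigma**2 + 1.0 / prior_sd**2
    mean = (np.sum(x * y) / sigma**2) / precision
    return mean, np.sqrt(1.0 / precision)

# Simulated data (made up for illustration): x lives on a large scale.
rng = np.random.default_rng(0)
n, sigma = 30, 2.0
x = rng.normal(0.0, 50.0, size=n)
y = 0.1 * x + rng.normal(0.0, sigma, size=n)
x = x - x.mean()          # center so the no-intercept model is reasonable
y = y - y.mean()
sx = x.std(ddof=1)        # scale of x, treated as known from here on

def compare(tau):
    # (a) Scale the data: prior N(0, tau^2) on the slope of x / sx,
    #     then map the posterior back to the raw-x slope.
    m_z, s_z = posterior_slope(x / sx, y, sigma, tau)
    scaled_data = (m_z / sx, s_z / sx)
    # (b) Scale the prior instead: raw x, prior N(0, (tau / sx)^2).
    scaled_prior = posterior_slope(x, y, sigma, tau / sx)
    # (c) No scaling anywhere: raw x, prior N(0, tau^2).
    unscaled = posterior_slope(x, y, sigma, tau)
    return scaled_data, scaled_prior, unscaled

tight = compare(0.1)      # very tight prior
wide = compare(100.0)     # quite wide prior

for label, (a, b, c) in [("tight", tight), ("wide", wide)]:
    print(label, "scaled data:", a, "scaled prior:", b, "unscaled:", c)
```

In this setup (a) and (b) give the same posterior for the raw-scale slope, so scaling the data and scaling the prior are two views of the same choice. Comparing either of them with (c) shows how much the scaling matters: the gap is large when the prior is tight and shrinks toward nothing when the prior is wide.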