Standardizing predictors in the transformed data block

In section 25.12 of the Stan User’s Guide, “Standardizing predictors and outputs,” predictors and outputs are standardized by centering and rescaling. The standardization is performed in the transformed data block, which seems to imply that the mean and standard deviation of each variable are treated as known.

Would “fully Bayesian standardization” perform the standardization in the parameters/model blocks? Presumably the means and standard deviations themselves should be given priors, so the standardization would then take place there, but perhaps I am missing something. Alternatively, is standardizing in transformed data simply a matter of practicality?

There aren’t really any assumptions about the mean and sd being known ahead of time. We just center and rescale the empirical data (typically without adjusting for degrees of freedom, so just the MLE of the sd). This results in a well-formed regression problem and it can be converted back to the original scales for downstream use.
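To make the round trip concrete, here is a minimal Python sketch using only the standard library. The data values and the helper names `standardize` and `ols` are made up for illustration, and ordinary least squares stands in for the Bayesian fit, so this only demonstrates the algebra of converting back to the original scale, not anything about Stan's sampler.

```python
from statistics import fmean, pstdev

def standardize(v):
    """Center and rescale with the empirical mean and the MLE sd (divisor N)."""
    m, s = fmean(v), pstdev(v)
    return [(vi - m) / s for vi in v], m, s

def ols(x, y):
    """Least-squares intercept and slope for y = a + b * x."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b

# Made-up data, for illustration only.
x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [2.1, 3.9, 8.2, 13.8, 22.5]

x_std, m_x, s_x = standardize(x)
alpha_std, beta_std = ols(x_std, y)   # fit on the standardized predictor

# Convert back to the original scale of x:
#   alpha_std + beta_std * (x - m_x) / s_x
#     = (alpha_std - beta_std * m_x / s_x) + (beta_std / s_x) * x
beta = beta_std / s_x
alpha = alpha_std - beta_std * m_x / s_x

alpha_direct, beta_direct = ols(x, y)  # fit on the raw predictor
print(abs(alpha - alpha_direct) < 1e-9, abs(beta - beta_direct) < 1e-9)
```

Because `m_x` and `s_x` are fixed numbers computed from the data, the conversion is a single deterministic map, which is what makes the back-transformation straightforward.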

rstanarm goes further and applies a QR decomposition to do the multivariate standardization (so covariance is also zero).
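As a rough illustration of what “covariance is also zero” means, here is a Python sketch that computes the Q factor of a thin QR decomposition of centered predictors with modified Gram-Schmidt. This is not rstanarm's implementation: the `thin_q` helper and the data are hypothetical, centering the columns first is an assumption of the sketch, and the sqrt(N - 1) scaling follows the QR reparameterization described in the Stan User’s Guide.

```python
from math import sqrt
from statistics import fmean

def thin_q(columns):
    """Orthonormalize columns with modified Gram-Schmidt (Q of a thin QR)."""
    q = []
    for col in columns:
        v = list(col)
        for u in q:
            proj = sum(ui * vi for ui, vi in zip(u, v))
            v = [vi - proj * ui for ui, vi in zip(u, v)]
        norm = sqrt(sum(vi * vi for vi in v))
        q.append([vi / norm for vi in v])
    return q

def sample_cov(a, b):
    """Sample covariance with divisor N - 1."""
    ma, mb = fmean(a), fmean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

# Made-up, correlated predictors (illustration only).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.5, 3.5, 3.0, 5.5, 5.0]
n = len(x1)

# Center each column, orthonormalize, then rescale so each column has unit variance.
centered = [[v - fmean(col) for v in col] for col in (x1, x2)]
q_ast = [[qi * sqrt(n - 1) for qi in col] for col in thin_q(centered)]

print(sample_cov(x1, x2))                                # raw predictors are correlated
print(abs(sample_cov(q_ast[0], q_ast[1])) < 1e-9)        # transformed columns are uncorrelated
print(abs(sample_cov(q_ast[0], q_ast[0]) - 1.0) < 1e-9)  # and have unit variance
```

The regression is then fit against the columns of `q_ast`, and the coefficients on the original predictors are recovered through the corresponding R factor.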

We might expect standardization to work a bit better with the true (population) mean and sd, but I don’t see how building a Bayesian model that only has access to the empirical data is going to help. It also makes it very problematic to convert back to the original scale, because the location and scale of the transform are no longer fixed.

If you try it, I’d be curious to see what the result is. By that I mean comparing this:

data {
  int<lower=0> N;
  vector[N] x;
}
transformed data {
  // standardize once, using the empirical mean and sd
  vector[N] x_std = (x - mean(x)) / sd(x);
}

to this:

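data {
  int<lower=0> N;
  vector[N] x;
}
parameters {
  real mu_x;
  real<lower=0> sigma_x;
}
transformed parameters {
  // standardize with the unknown location and scale
  vector[N] x_std = (x - mu_x) / sigma_x;
}
model {
  x ~ normal(mu_x, sigma_x);
  mu_x ~ ...???...
  sigma_x ~ ...???...
}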

where there’s a likelihood term that is the same either way:

  y ~ normal(alpha + beta * x_std, sigma);
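
For reference, the algebra for mapping the coefficients in this likelihood back to the original scale of x can be written out as follows. Here m and s are my own notation for whichever center and scale were used to form x_std; they do not appear in the code above.

```latex
\begin{align*}
\alpha + \beta \, x^{\text{std}}_n
  &= \alpha + \beta \, \frac{x_n - m}{s} \\
  &= \left( \alpha - \frac{\beta \, m}{s} \right) + \frac{\beta}{s} \, x_n .
\end{align*}
```

When m = mean(x) and s = sd(x) are computed in transformed data, they are fixed numbers, so every posterior draw of (alpha, beta) maps to the original scale through the same constants. In a version where the center and scale are the parameters mu_x and sigma_x, they would differ from draw to draw, so the mapping is no longer a single fixed transformation.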

@bgoodri probably knows what will happen without running it!