Standardizing predictors in the transformed data block

In section 25.12 (“Standardizing predictors and outputs”) of the Stan User’s Guide, predictors and outputs are standardized by centering and rescaling. The standardization is performed in the transformed data block, which implies that the mean and standard deviation are treated as known for each variable.

Would “fully Bayesian standardization” perform the standardization in the parameters/model blocks instead? Presumably the mean and standard deviation should themselves be given priors, and the standardization would then take place there, but perhaps I am missing something. Alternatively, is this just a practical shortcut?

There aren’t really any assumptions that the mean and sd are known ahead of time. We just center and rescale using the empirical data (typically without adjusting for degrees of freedom, i.e., just the MLE of the sd). This yields a well-formed regression problem, and the results can be converted back to the original scales for downstream use.
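
For concreteness, here’s how the back-conversion can look for a simple linear regression with intercept alpha and slope beta (a minimal sketch; the mean_x/sd_x/*_orig names and the generated quantities block are illustrative, not quoted from the manual):

transformed data {
  real mean_x = mean(x);
  real sd_x = sd(x);
  vector[N] x_std = (x - mean_x) / sd_x;
}
...
generated quantities {
  // alpha + beta * (x - mean_x) / sd_x
  //   == (alpha - beta * mean_x / sd_x) + (beta / sd_x) * x,
  // so the coefficients on the original scale of x are:
  real beta_orig = beta / sd_x;
  real alpha_orig = alpha - beta * mean_x / sd_x;
}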

rstanarm goes further and applies a QR decomposition to do the multivariate standardization (so that the covariance of the transformed predictors is also zero).
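
For reference, this is the thin QR reparameterization as presented in the Stan User’s Guide (a sketch assuming a data block declaring N, K, a matrix X of predictors, and an outcome vector y; rstanarm’s internal implementation differs in its details):

transformed data {
  // thin QR decomposition of the predictor matrix, scaled so the
  // columns of Q_ast have roughly unit scale
  matrix[N, K] Q_ast = qr_thin_Q(X) * sqrt(N - 1);
  matrix[K, K] R_ast = qr_thin_R(X) / sqrt(N - 1);
  matrix[K, K] R_ast_inverse = inverse(R_ast);
}
parameters {
  real alpha;
  vector[K] theta;  // coefficients on the Q_ast scale
  real<lower=0> sigma;
}
model {
  y ~ normal(Q_ast * theta + alpha, sigma);
}
generated quantities {
  vector[K] beta = R_ast_inverse * theta;  // back on the original scale
}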

While we might expect standardization to work a bit better with the true data mean and sd, I don’t see how building a Bayesian model that only has access to the same empirical data is going to help. It also makes it very problematic to convert back to the original scale, because every posterior draw now carries its own mu_x and sigma_x, so the standardizing transform is no longer a single fixed mapping.

If you try it, I’d be curious to see what the result is. By that I mean comparing this:

data {
  int<lower=0> N;
  vector[N] x;
  ...
}
transformed data {
  vector[N] x_std = (x - mean(x)) / sd(x);
}

to this:

data {
  int<lower=0> N;
  vector[N] x;
  ...
}
parameters {
  real mu_x;
  real<lower=0> sigma_x;
  vector[N] x_std;  // standardized values as free parameters, not a transform of x
}
model {
  x_std ~ normal(mu_x, sigma_x);
  mu_x ~ ...???...
  sigma_x ~ ...???...
}

where there’s a likelihood term that is the same either way:

  y ~ normal(alpha + beta * x_std, sigma);

@bgoodri probably knows what will happen without running it!