I came across this parametrization in Statistical Rethinking section 4.4.

```
stan_program <- '
data {
  int<lower=1> n;    // number of observations
  real xbar;         // centering constant for weight, supplied as data
  vector[n] height;  // outcome
  vector[n] weight;  // predictor
}
parameters {
  real<lower=0,upper=50> sigma;
  real<lower=0> b;
  real a;
}
model {
  vector[n] mu;
  mu = a + b * (weight - xbar);  // mu equals a when weight == xbar
  height ~ normal(mu, sigma);
  a ~ normal(178, 20);
  b ~ lognormal(0, 1);
  sigma ~ uniform(0, 50);
}
'
```

I’m curious why the author would define the linear model as $\mu_{Y|X=x} = a + b(x - \mu_x)$. When $x = \mu_x$, the equation collapses to $\mu_{Y|X=x} = a$ (just the intercept).

As I understand it, this has the effect of linking $\mu_Y$ to $\mu_x$: if you supply the global average of X, you get the global average of Y in response. (I might be wrong here!)
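
To check that reading, here is a sketch for the plain least-squares case. It ignores the priors in the Stan program and assumes `xbar` is the sample mean of `weight`; both are my assumptions, and \bar{x}, \bar{y} (sample means) and a' (the uncentered intercept) are notation I'm introducing.

```latex
% Least-squares condition for the intercept a
% (derivative of the squared error with respect to a, set to zero):
\sum_{i=1}^{n} \bigl( y_i - a - b\,(x_i - \bar{x}) \bigr) = 0

% The centered predictor sums to zero, \sum_i (x_i - \bar{x}) = 0, so:
n\,\bar{y} - n\,a = 0 \quad\Longrightarrow\quad \hat{a} = \bar{y}

% Relation to the uncentered form \mu_{Y|X=x} = a' + b\,x:
a' = a - b\,\bar{x}
```

So under those assumptions the fitted line passes through (\bar{x}, \bar{y}), and the centered model describes the same line with a shifted intercept. With the priors included, I'd expect a to land only approximately at \bar{y}.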

Why would this design be preferable to the more common $\mu_{Y|X=x} = a + bx$? Is it related to model interpretability, to HMC/NUTS sampling geometry, or to both? Is it context dependent?

Thanks!