Linear parametrization: `a + b(x - mu)`

I came across this parametrization in Statistical Rethinking section 4.4.

stan_program <- '
data {
  int<lower=1> n;
  real xbar;            // centering constant for weight (its sample mean)
  vector[n] height;
  vector[n] weight;
}
parameters {
  real<lower=0,upper=50> sigma;
  real<lower=0> b;
  real a;               // expected height when weight == xbar
}
model {
  vector[n] mu;
  mu = a + b * (weight - xbar);
  height ~ normal(mu, sigma);
  a ~ normal(178, 20);
  b ~ lognormal(0, 1);
  sigma ~ uniform(0, 50);
}
'

I’m curious why the author would define the linear model as \mu_{Y|X=x} = a + b(x - \mu_x). When x = \mu_x, the equation collapses to \mu_{Y|X=x} = a (just the intercept).

As I understand it, this has the effect of linking \mu_Y to \mu_X: if you supply the global average of X, you get the global average of Y in response. (I might be wrong here!)

Why would this design be preferable over the more common \mu_{Y|X=x} = a + b*x? Is this related to model interpretability or to HMC/NUTS sampling/geometry (or perhaps both)? Is it context dependent?


You are not wrong, but this is a property of the linear model, not of the parametrization. If the linear model is correct, the expected value of y at x = \mu_x is \mu_y under either form; the centered version just makes a equal to that value directly. The parametrization is mainly there to make a easier to interpret and, as you say, to improve sampling efficiency.
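
To spell out the algebra, here is a short sketch of my own (not taken from the book). The symbol \alpha for the uncentered intercept, \bar{x} for the Stan variable xbar, and S_{xx} are names I'm introducing for illustration. The covariance line is the ordinary least-squares result, which the posterior should only approximately follow, and only when the priors are weak.

```latex
% Both forms describe the same line; only the meaning of the intercept changes.
\begin{align*}
\mu_{Y|X=x} &= a + b(x - \bar{x}) = (a - b\bar{x}) + b x = \alpha + b x,
  \qquad \alpha = a - b\bar{x}. \\
% Law of total expectation: averaging the line over X gives the mean of Y.
\mu_Y &= E[\alpha + bX] = \alpha + b\mu_X = \mu_{Y|X=\mu_X}. \\
% Least-squares covariance of the estimates, with S_{xx} = \sum_i (x_i - \bar{x})^2.
\operatorname{Cov}(\hat\alpha, \hat b) &= -\frac{\sigma^2 \bar{x}}{S_{xx}},
  \qquad \operatorname{Cov}(\hat a, \hat b) = 0.
\end{align*}
```

So a is the expected height at the average weight, which makes a prior like normal(178, 20) easy to justify, whereas \alpha is the expected height at weight zero. The zero covariance in the centered form is also what tends to give HMC/NUTS a rounder posterior to explore when \bar{x} is far from zero.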
