Affine transformation in Stan

I am confused about the effect of multiplier and offset when declaring parameters in Stan. Specifically, after reading the reference manual, I am confused by the following example:

As an example, we can give x a normal distribution with non-centered parameterization as follows.

parameters {
  real<offset=mu,multiplier=sigma> x;
}
model {
  x ~ normal(mu, sigma);
}

Recall that the centered parameterization is achieved with the code

parameters {
  real x;
}
model {
  x ~ normal(mu, sigma);
}

What’s the difference between the ‘non-centered parameterization’ and the ‘centered parameterization’ here? Do these two Stan programs refer to the same model?

It is the same model, but the sampling happens in a different parameter space. The model itself doesn’t change when we transform a parameter, but the geometry of the transformed space can be easier for the sampler.


I literally did not know this was a feature; is this a relatively recent addition?

https://github.com/stan-dev/stan/releases says v2.19.0 (20 March 2019), but it wasn’t much advertised at that date as the interfaces were lagging.

To be a bit more concrete, the offset and multiplier code:

parameters {
  real<offset=mu,multiplier=sigma> x;
}
model {
  x ~ normal(mu, sigma);
}

is equivalent to:

parameters {
  real x_raw;
}
transformed parameters {
  real x = mu + x_raw * sigma;
}
model {
  x_raw ~ std_normal();
}

It is a more concise way to specify a non-centered parameterization, and it also reduces the number of parameters that need to be saved (i.e. only x, rather than both x and x_raw), which can be very handy for large models or large numbers of samples.


Got it!

In this case do we need to save ‘mu’ and ‘sigma’ if we use the expression:

parameters {
  real<offset=mu,multiplier=sigma> x;
}
model {
  x ~ normal(mu, sigma);
}

Thx!

So, is my understanding correct that the two are equivalent, but the following is more efficient because the sampler only needs to sample from a standard normal rather than Normal(mu, sigma)?

parameters {
  real<offset=mu,multiplier=sigma> x;
}
model {
  x ~ normal(mu, sigma);
}

Also, may I ask whether the ‘x’ in the parameters block and the ‘x’ in the model block have the same distribution? I think ‘x’ in parameters follows N(0,1) and ‘x’ in model follows N(mu, sigma), but I am confused since they use the same symbol in the code.

In this case do we need to save ‘mu’ and ‘sigma’

That will depend on what mu and sigma are. For example, they can just be numbers, in which case they won’t be saved:

parameters {
  real<offset=5,multiplier=2> x;
}

They could be passed as data, which also wouldn’t be saved:

data {
  real mu;
  real<lower=0> sigma;  // the multiplier must be positive
}
parameters {
  real<offset=mu,multiplier=sigma> x;
}

However, if they are themselves parameters, then they will be saved:

parameters {
  real mu;
  real<lower=0> sigma;  // the multiplier must be positive
  real<offset=mu,multiplier=sigma> x;
}

So, is my understanding correct that the two are equivalent, but the following is more efficient because the sampler only needs to sample from a standard normal rather than Normal(mu, sigma)?

That’s correct. Both result in the same distribution for x; the difference is that with offset and multiplier, x is obtained by shifting and scaling a standard normal variable.

Also, may I ask whether the ‘x’ in the parameters block and the ‘x’ in the model block have the same distribution? I think ‘x’ in parameters follows N(0,1) and ‘x’ in model follows N(mu, sigma)

Sorry, I’m not sure what you mean here. x is distributed N(mu, sigma) wherever it appears in the program.
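
To spell out where the N(0,1) intuition comes from, here is a sketch of the change of variables. It assumes that the sampler works internally on an unconstrained variable (called x_raw here purely for illustration; that name does not appear in the offset/multiplier program) and that the log-Jacobian term log(sigma) is added automatically by the affine transform:

```latex
\begin{align*}
x &= \mu + \sigma \, x_{\text{raw}},
  \qquad \left|\frac{dx}{dx_{\text{raw}}}\right| = \sigma \\
p(x_{\text{raw}})
  &= \mathrm{Normal}(\mu + \sigma x_{\text{raw}} \mid \mu, \sigma) \cdot \sigma \\
  &= \frac{1}{\sigma\sqrt{2\pi}}
     \exp\!\left(-\frac{(\sigma x_{\text{raw}})^2}{2\sigma^2}\right) \sigma \\
  &= \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{x_{\text{raw}}^2}{2}\right)
   = \mathrm{Normal}(x_{\text{raw}} \mid 0, 1)
\end{align*}
```

So the symbol x always refers to the N(mu, sigma) quantity, and only the hidden variable that the sampler moves around is standard normal.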

Sorry to chime in here, but what would be the vectorized version of the offset specification, i.e., when mu and sigma are themselves vectors?

The closest thing to a vectorised version is being able to specify the offset and multiplier for vectors/matrices:

parameters {
  vector<offset=mu,multiplier=sigma>[N] x;
}

But we don’t currently have the capability to do this with vector-typed mu and sigma, unfortunately.
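
For what it’s worth, here is a minimal sketch of how the scalar offset and multiplier on a vector might be used in a small hierarchical model. The data names (y, se), the parameter names (tau, theta), and the priors are assumptions made up for illustration and are not from the manual example:

```stan
data {
  int<lower=1> N;           // number of groups (hypothetical)
  vector[N] y;              // observed group estimates (hypothetical)
  vector<lower=0>[N] se;    // known standard errors (hypothetical)
}
parameters {
  real mu;                  // population mean
  real<lower=0> tau;        // population scale; the multiplier must be positive
  // scalar offset and multiplier shared by every element of theta
  vector<offset=mu, multiplier=tau>[N] theta;
}
model {
  mu ~ normal(0, 5);        // illustrative prior
  tau ~ normal(0, 5);       // illustrative half-normal prior via the lower bound
  theta ~ normal(mu, tau);  // written in centered form, sampled non-centered
  y ~ normal(theta, se);
}
```

Because mu and tau are parameters here, they are saved along with theta, but no separate theta_raw appears in the output.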