How is Neal's funnel an example of the Matt trick being applied?

I’m reading the Stan manual section on the Matt trick (Ch 27.6, Reparameterization). The example using Neal’s funnel makes sense, but it isn’t clear how the example corresponds to the claims made about the Matt trick.

I’d like to better understand these claims:

  1. “[Matt trick] is a general transform from a centered to a non-centered parameterization”. What’s centered and what's non-centered in the Neal’s funnel example?

  2. “This reparameterization is helpful when there is not much data, because it separates the hierarchical parameters and lower-level parameters in the prior.”

  • Does this refer to a hierarchical model rather than Neal’s funnel?
  • (Assuming that this is about a hierarchical model) Isn’t the Matt trick about separating hierarchical parameters and lower-level parameters in the sampling statement rather than in the prior?

For reference, below is the Matt trick as presented in the manual.

Before transformation

parameters {
  real y;
  real x;
}
model {
  y ~ normal(0, 3);
  x ~ normal(0, exp(y / 2));
}

After transformation

parameters {
  real y_raw;
  real x_raw;
}
transformed parameters {
  real y = 3.0 * y_raw;
  real x = exp(y / 2) * x_raw;
}
model {
  y_raw ~ normal(0, 1); // implies y ~ normal(0, 3)
  x_raw ~ normal(0, 1); // implies x ~ normal(0, exp(y/2))
}

Re: 1, the first model you showed is the centered version and the second is the non-centered version. The Matt trick is to implement the model with the non-centered parameterization, though I’m not sure about the origins of either term.
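To see concretely that the two parameterizations describe the same joint distribution, here's a minimal NumPy sketch (my own illustration, not from the manual): draw directly from the centered funnel, then draw unit normals and apply the non-centered transform, and compare the implied marginals.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Centered: draw (y, x) directly from the funnel.
y_c = rng.normal(0.0, 3.0, n)
x_c = rng.normal(0.0, np.exp(y_c / 2.0))

# Non-centered: draw standard normals, then transform deterministically.
y_raw = rng.normal(0.0, 1.0, n)
x_raw = rng.normal(0.0, 1.0, n)
y_nc = 3.0 * y_raw
x_nc = np.exp(y_nc / 2.0) * x_raw

# Same implied distribution for y, but the sampler in the non-centered
# version only ever explores the well-behaved unit-scale (y_raw, x_raw) space.
print(y_c.std(), y_nc.std())  # both close to 3
```

The point of the trick is the last comment: HMC explores (y_raw, x_raw), whose posterior geometry is a nice isotropic Gaussian, instead of the funnel-shaped (y, x) space.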

Re: 2, that line is about hierarchical modeling (where partial pooling helps you avoid noisy estimates when you don’t have much data). It’s relevant here because the same funnel geometry shows up in hierarchical models, which is why non-centered parameterizations are so useful for them.
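Here's a sketch of what "separates the hierarchical parameters and lower-level parameters in the prior" means in the hierarchical case (made-up values, and a large number of groups only so the two priors can be compared empirically):

```python
import numpy as np

rng = np.random.default_rng(1)

J = 100_000          # number of groups, large only for the empirical check
mu, tau = 1.5, 2.0   # hierarchical location and scale (made-up values)

# Centered: theta_j ~ normal(mu, tau).
# The prior on theta depends directly on mu and tau, which is what
# creates the funnel between tau and the theta_j when data are sparse.
theta_centered = rng.normal(mu, tau, J)

# Non-centered: theta_raw_j ~ normal(0, 1), theta_j = mu + tau * theta_raw_j.
# The parameter the sampler actually works with (theta_raw) has a fixed
# unit-normal prior that involves neither mu nor tau: the hierarchical
# and lower-level parameters are separated in the prior.
theta_raw = rng.normal(0.0, 1.0, J)
theta_noncentered = mu + tau * theta_raw

print(theta_centered.mean(), theta_noncentered.mean())  # both close to 1.5
```

The y/x pair in Neal’s funnel plays exactly the role of tau/theta here, which is why the funnel is used as the minimal example of the problem.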

This is another thing worth reading regarding centered vs. non-centered parameterizations: http://mc-stan.org/users/documentation/case-studies/divergences_and_bias.html

Hope that clears things up!

Matt independently discovered the non-centered parameterization. We used to call it “the Matt trick” (and Matt objected, saying it should be called “a Matt trick”). Then we found the rest of the literature. Michael Betancourt’s arXiv paper on hierarchical models is what you want to read. I also go over an example in the case study on repeated binary trials (web site, users >> documentation >> case studies).


Betancourt’s arXiv paper on hierarchical models is very helpful indeed. If possible, I would suggest referring to it in the manual.

Thanks Bob and Ben!
