Cholesky decomposition

Hi,

Can somebody advise which Cholesky decomposition is more effective in Stan? Can the models be optimized even more?

Also, which model, if any, falls under centered/non-centered parameterization? I would like to understand how these terms apply in the multidimensional case.

Linas
chol1.stan (1.4 KB)

chol2.stan (1.4 KB)


Also, which model, if any, falls under centered/non-centered parameterization?

The centered parameterization for a multivariate normal is when you do:

y ~ multi_normal(zero, Sigma);

The non-centered is when you do:

z ~ normal(0, 1);
y = cholesky_decompose(Sigma) * z;
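
To see how the pieces fit together, here's a minimal sketch of the non-centered version as a complete Stan program (the names K, Sigma, z, y are mine, not from the attached models, and Sigma is assumed to be fixed data):

data {
  int<lower=1> K;
  cov_matrix[K] Sigma;
}
parameters {
  vector[K] z;                     // auxiliary standard-normal variables
}
transformed parameters {
  // if z ~ normal(0, 1) and L * L' = Sigma, then L * z ~ multi_normal(0, Sigma)
  vector[K] y = cholesky_decompose(Sigma) * z;
}
model {
  z ~ normal(0, 1);
}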

I don’t know the specifics of the naming offhand. I always go back to Betancourt’s divergences case study if I get confused about this stuff: Diagnosing Biased Inference with Divergences

Can somebody advise which Cholesky decomposition is more effective in Stan?

I think hierarchical models without much data tend to produce divergences under the centered parameterization, so there’s a data dependence there. The easiest way to figure out which parameterization works better for you is to try them both, but if your centered parameterization is running fine (no divergences and diagnostics looking good), I don’t think there’s a huge call to switch. I might be wrong.

The non-centered parameterizations always trip me up for a bit when I’m reading models, so I prefer to keep them centered for readability if I can.

Cholesky decomposition

I think you’re using your Cholesky factors correctly here.

Can the models be optimized even more?

sigma = 2.5 * tan(sigma_unif); // sigma ~ cauchy(0, 2.5)
for (k in 1:K) tau[k] = 2.5 * tan(tau_unif[k]); // tau ~ cauchy(0, 2.5)

Why not just use the cauchy syntax here? I’m not sure this is totally right given the constraints on tau and sigma.

And cauchy priors aren’t all roses (Asymmetric Gaussian Hierarchical Model - #11 by bgoodri). They’re serious when they talk about those heavy tails :P (draw some numbers from cauchy(0.0, 1.0) and just look at them).


Ignore my comment on the tan thing vs. cauchy. Had the usefulness of that explained to me in another thread ^^.

This is fine when presenting them, but coding them in Stan (or in BUGS/JAGS) should be based on data size and specificity. With little data, you need the non-centered parameterization in order to draw unbiased samples from the posterior—the funnel-shaped posterior you get from the centered parameterization will defeat Euclidean HMC, Gibbs, and Metropolis.
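
To make the funnel concrete, here’s a toy sketch (my own names, eight-schools-style, not from this thread) of the non-centered parameterization for a scalar hierarchical effect:

data {
  int<lower=1> J;
  vector[J] y;
  vector<lower=0>[J] sigma;
}
parameters {
  real mu;
  real<lower=0> tau;                         // priors on mu and tau omitted for brevity
  vector[J] theta_raw;                       // standard-normal raw effects
}
transformed parameters {
  vector[J] theta = mu + tau * theta_raw;    // implies theta ~ normal(mu, tau)
}
model {
  theta_raw ~ normal(0, 1);
  y ~ normal(theta, sigma);
}

The centered version would declare theta as a parameter and write theta ~ normal(mu, tau) directly; with little data, the joint posterior over theta and tau then takes the funnel shape described above.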

Is there some definition of specificity? Basically I am looking for a rule of thumb for when centered works better and when non-centered works better. So far I am hearing that non-centered is recommended when the sample size is small, while centered is recommended when the sample size is large. Are there quantitative definitions of large/small/specific/non-specific?

Linas

Check out Betancourt and Girolami’s paper. There’s an arXiv version. They plot curves based on posterior standard deviation vs. the breakeven point.

Hey, could you link me to the thread where the tan vs. cauchy difference was explained to you? I have the same question.

I forget where, but check “Reparameterizing the Cauchy”, page 339 of the manual.
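
For what it’s worth, my reading of that section (the names below are mine, not the manual’s): if sigma_unif gets a uniform distribution on (0, pi/2), then 2.5 * tan(sigma_unif) follows a half-Cauchy with scale 2.5, so something like

parameters {
  real<lower=0, upper=pi()/2> sigma_unif;        // implicit uniform(0, pi/2) prior
}
transformed parameters {
  real<lower=0> sigma = 2.5 * tan(sigma_unif);   // implies sigma ~ cauchy(0, 2.5), restricted to sigma > 0
}

lets the sampler work on a bounded parameter instead of chasing the Cauchy’s heavy tails directly. For a Cauchy on the whole real line you’d use the interval (-pi/2, pi/2) instead.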