Reparameterization example in manual

IambicPentameter · February 5, 2018, 10:19pm

Hello,

After I got some complaints about divergent transitions and increasing the adapt_delta parameter did not work, I decided to try my luck at reparameterization. Currently on page 287 and 288 of the manual there is an example on how to do that for a Dirichlet prior. This is the suggested reparameterization:

parameters {
simplex[K] phi;
real<lower=0> kappa;
simplex[K] theta[N];
...
transformed parameters {
vector[K] alpha = kappa * theta;
...
}
model {
phi ~ ...;
kappa ~ ...;
for (n in 1:N)
theta[n] ~ dirichlet(alpha);

My questions are:

Why is phi defined if it is not used?
Why is alpha both a function of theta (alpha = kappa*theta) but also a hyperparameter of theta (theta ~dirichlet(alpha))?
can we just write theta~dirichlet(alpha) instead of looping over 1:N?

Kind regards,

bgoodri · February 5, 2018, 10:32pm

Pretty sure it is supposed to be vector[K] alpha = kappa * phi; and no you cannot avoid the loop.

Bob_Carpenter · February 6, 2018, 2:11am

Not yet, but we’d love to get some programming help to vectorize the remaining unvectorized distributions.

Yes, and I’ll fix the manual:

andrjohns · February 7, 2018, 4:28pm

A vectorised dirichlet would actually be quite useful for me, I’ll take that on as a project

Bob_Carpenter · February 7, 2018, 7:40pm

It’d also be nice to have a mean/total parameterization:

DirichletMean(theta | phi, kappa) 
=def=
Dirichlet(theta | phi * kappa)

where theta is a K-simplex, kappa > 0 and phi is a K-simplex. It’s much easier to think of priors this way. And also the beta version.

The other simplex distribution that would be nice would be multi-logit,

MultiLogitNormal(theta | mu, Sigma)
=def=
MultiNormal(logit(theta) | mu, Sigma)

Like lognormal or inverse gamma, this one needs the Jacobian adjustment, but it’s simple because the transform is dimensionwise and hence diagonal.

jfiksel · September 19, 2019, 1:10pm

@andrjohns any progress on the vectorized dirichlet?

andrjohns · September 19, 2019, 2:27pm

Thanks for reminding me! I’ve added an issue in the math library here and will code that up

jfiksel · September 19, 2019, 2:53pm

@andrjohns awesome! I’m using a method where we have a bunch of latent dirichlet random variables, so this will be super useful to me. I’ll be sure to let you know how it works out whenever you get this done!

mans_magnusson · March 21, 2022, 9:35am

Hi!

I have implemented the lpdf (and rng) for the multi_logit_normal. Based on your post here I was thinking of contributing it to Stan. Is there any information anywhere on what is expected of code for new distributions like this or how to open a PR for a new distribution?

/Måns

spinkney · March 21, 2022, 10:00am

Hi @mans_magnusson,

there’s some documentation on adding a new distribution at Stan Math Library: Adding A New Distribution.

I highly suggest that it be the Cholesky parameterization because you can reuse most the current mvn code and it’s more efficient and has derivatives. As @Bob_Carpenter says above, you can keep all the multi_normal_cholesky stuff and just add the jacobian adjustment for the logit transform. Literally it’s a one line adjustment. The harder part is adding chain-rule derivative for the partial wrt the input but it’s just the partial of the input that changes, so also not too difficult.

mans_magnusson · March 21, 2022, 10:09am

Great! Yes. I was actually thinking of adding both Chol and with Sigma? Yes. In both cases, we can use the multivariate normal and just add the jacobian.

A quick additional question then. Is there a standard for reference categories? I now use the last category, but think it is better to synchronize this if there is a Stan standard.

/Måns

spinkney · March 21, 2022, 11:15am

Sorry, I don’t understand, what do you mean by “reference categories”?

mans_magnusson · March 21, 2022, 1:17pm

Hi!

So the multi-logit-normal parameters mu and Sigma are defined on R^(D-1) and R^(D-1)*(D-1). The reference category is usually set to 0 so that the distribution is identified. I think this is usually referred to as the reference category. The question is what to use as the reference category? I now use category D.

See

/Måns

spinkney · March 21, 2022, 1:21pm

Got it, first open a Stan math issue and ask. I don’t have any strong preference but someone might. I think the main thing is that we want this clearly documented for the user.

mans_magnusson · March 21, 2022, 1:21pm

Great! Thanks! Will do!

jsocolar · March 21, 2022, 8:20pm

Note that the brms implementation here defaults to using the first category as reference. Implement the logistic-normal family by paul-buerkner · Pull Request #1295 · paul-buerkner/brms · GitHub

I’ll just mention that SUG 9.5 deals with something like this, but appears to omit the identifying constraint. However, SUG 9.5 cites Blei & Lafferty 2007, which in turn cites Aitchison & Shen 1980, who impose the identifying constraint by applying what they call a logistic transform to an n-1 dimensional normal variate, then deriving the final element by subtracting from 1. This is equivalent to using the last category as the reference category.

Edit: see also SUG 9.5 suggests a degenerate model · Issue #456 · stan-dev/docs · GitHub

Topic		Replies	Views
Dirichlet Priors General	7	916	February 12, 2021
Issue With Transformed Parameter Modeling	3	477	July 5, 2023
Reparametizations and prior distributions of positive parameters Modeling	1	482	March 7, 2023
Beginner request for reparameterization help Modeling	12	1129	January 3, 2018
Reparameterizing the Gamma Modeling	13	3885	June 23, 2017

Reparameterization example in manual

Related topics