# Reparameterization example in manual

Hello,

After I got some complaints about divergent transitions and increasing the adapt_delta parameter did not work, I decided to try my luck at reparameterization. Currently on page 287 and 288 of the manual there is an example on how to do that for a Dirichlet prior. This is the suggested reparameterization:

``````parameters {
simplex[K] phi;
real<lower=0> kappa;
simplex[K] theta[N];
...
transformed parameters {
vector[K] alpha = kappa * theta;
...
}
model {
phi ~ ...;
kappa ~ ...;
for (n in 1:N)
theta[n] ~ dirichlet(alpha);
``````

My questions are:

• Why is phi defined if it is not used?
• Why is alpha both a function of theta (alpha = kappa*theta) but also a hyperparameter of theta (theta ~dirichlet(alpha))?
• can we just write theta~dirichlet(alpha) instead of looping over 1:N?

Kind regards,

Pretty sure it is supposed to be `vector[K] alpha = kappa * phi;` and no you cannot avoid the loop.

1 Like

Not yet, but weâ€™d love to get some programming help to vectorize the remaining unvectorized distributions.

Yes, and Iâ€™ll fix the manual:

A vectorised dirichlet would actually be quite useful for me, Iâ€™ll take that on as a project

Itâ€™d also be nice to have a mean/total parameterization:

``````DirichletMean(theta | phi, kappa)
=def=
Dirichlet(theta | phi * kappa)
``````

where `theta` is a K-simplex, `kappa > 0` and `phi` is a K-simplex. Itâ€™s much easier to think of priors this way. And also the beta version.

The other simplex distribution that would be nice would be multi-logit,

``````MultiLogitNormal(theta | mu, Sigma)
=def=
MultiNormal(logit(theta) | mu, Sigma)
``````

Like lognormal or inverse gamma, this one needs the Jacobian adjustment, but itâ€™s simple because the transform is dimensionwise and hence diagonal.

@andrjohns any progress on the vectorized dirichlet?

Thanks for reminding me! Iâ€™ve added an issue in the math library here and will code that up

@andrjohns awesome! Iâ€™m using a method where we have a bunch of latent dirichlet random variables, so this will be super useful to me. Iâ€™ll be sure to let you know how it works out whenever you get this done!

Hi!

I have implemented the lpdf (and rng) for the multi_logit_normal. Based on your post here I was thinking of contributing it to Stan. Is there any information anywhere on what is expected of code for new distributions like this or how to open a PR for a new distribution?

/MĂĄns

thereâ€™s some documentation on adding a new distribution at Stan Math Library: Adding A New Distribution.

I highly suggest that it be the Cholesky parameterization because you can reuse most the current mvn code and itâ€™s more efficient and has derivatives. As @Bob_Carpenter says above, you can keep all the multi_normal_cholesky stuff and just add the jacobian adjustment for the logit transform. Literally itâ€™s a one line adjustment. The harder part is adding chain-rule derivative for the partial wrt the input but itâ€™s just the partial of the input that changes, so also not too difficult.

Great! Yes. I was actually thinking of adding both Chol and with Sigma? Yes. In both cases, we can use the multivariate normal and just add the jacobian.

A quick additional question then. Is there a standard for reference categories? I now use the last category, but think it is better to synchronize this if there is a Stan standard.

/MĂĄns

Sorry, I donâ€™t understand, what do you mean by â€śreference categoriesâ€ť?

Hi!

So the multi-logit-normal parameters mu and Sigma are defined on R^(D-1) and R^(D-1)*(D-1). The reference category is usually set to 0 so that the distribution is identified. I think this is usually referred to as the reference category. The question is what to use as the reference category? I now use category D.

See

/MĂĄns

2 Likes

Got it, first open a Stan math issue and ask. I donâ€™t have any strong preference but someone might. I think the main thing is that we want this clearly documented for the user.

Great! Thanks! Will do!

Note that the `brms` implementation here defaults to using the first category as reference. Implement the logistic-normal family by paul-buerkner Â· Pull Request #1295 Â· paul-buerkner/brms Â· GitHub

Iâ€™ll just mention that SUG 9.5 deals with something like this, but appears to omit the identifying constraint. However, SUG 9.5 cites Blei & Lafferty 2007, which in turn cites Aitchison & Shen 1980, who impose the identifying constraint by applying what they call a logistic transform to an n-1 dimensional normal variate, then deriving the final element by subtracting from 1. This is equivalent to using the last category as the reference category.

3 Likes