Mode of beta distribution is displaced?

In short, running Stan’s joint optimization routine on a transformed beta seems to recover the wrong mode.
Following the “mode and concentration” parameterization on Wikipedia, we can design beta priors with a specified mode. Mapping this density to another space with the appropriate Jacobian correction gives a program like

transformed data {
    real<lower=0> conc = 10;            // concentration (kappa)
    real<lower=0,upper=1> mode = 0.25;  // desired mode (omega)
}

parameters {
    real a;  // unconstrained; Phi(a) lies in (0,1)
}

model {
    // mode/concentration parameterization: alpha = omega*(kappa-2) + 1, etc.
    Phi(a) ~ beta(mode*(conc-2) + 1, (1-mode)*(conc-2) + 1);
    // Jacobian adjustment for the Phi transform: log Phi'(a) = normal_lpdf(a | 0, 1)
    target += normal_lpdf(a | 0., 1.);
}
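(As a sanity check of the parameterization: with \alpha = \omega(\kappa-2)+1 and \beta = (1-\omega)(\kappa-2)+1, the beta mode \frac{\alpha-1}{\alpha+\beta-2} works out to exactly \omega, so the values above give a Beta(3, 7) prior with mode 0.25.)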

I expected the mode to be preserved under a monotonic transformation like Phi/inv_Phi, but if I run model.optimizing in PyStan I get a result that is off from what I expected. After some head-scratching I managed to identify the proposed optimum as lying between the mean and the median of the density in the original space, but that doesn’t help much (and mean/median are not consistent under transformations).
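For concreteness, here is a minimal sketch of how the optimization might be run (assuming the PyStan 2 interface; the Stan program is inlined so the snippet is self-contained):

import pystan
from scipy.stats import norm

# the Stan program from above, inlined
stan_code = """
transformed data {
    real<lower=0> conc = 10;
    real<lower=0,upper=1> mode = 0.25;
}
parameters {
    real a;
}
model {
    Phi(a) ~ beta(mode*(conc-2) + 1, (1-mode)*(conc-2) + 1);
    target += normal_lpdf(a | 0., 1.);  // Jacobian of Phi
}
"""

sm = pystan.StanModel(model_code=stan_code)
opt = sm.optimizing()       # dict of point estimates, e.g. {'a': ...}
print(norm.cdf(opt['a']))   # map the optimum back to the unit interval
# one would expect 0.25 here, but the value comes out higher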

[Figure “betatest”: samples mapped back to the unit interval, with the optimizer’s result marked]

Note that the samples are mapped back to the unit interval, where the prior lives. There seems to be a consistent positive bias, though it vanishes when the mode is at 0.5.

Am I doing something wrong or is there a problem with how I am using the optimization routine? Do I need to take extra steps to handle transformed densities?

Why are you using the Phi transformation on a? Wouldn’t it be easier to define

real<lower=0,upper=1> a;

and then just

a ~ beta(mode*(conc-2) + 1, (1-mode)*(conc-2) + 1);

Sure, but my actual use case relies on the transformation, so I included it here in case that’s where the error lies.

edit: I see now that the implementation could be done in reverse, applying inv_Phi to the sampled value in e.g. a generated quantities block. So I guess the argument for keeping the transformation in the ~ statement is that it avoids managing another auxiliary variable; it is also convenient for maps lacking analytical inverses.

edit: without the transformation it appears to work fine, with mode and optimum matching.

Okay, I think my basic premise here was wrong. Because of the Jacobian correction, the mode actually does shift when you apply a nonlinear transformation.

Ah, now that I see you are using the mode, it makes sense. Indeed, the mode is not equivariant under nonlinear transformations. For instance, the mode of the normal distribution is mu, while the mode of the log-normal distribution is exp(mu - sigma^2) rather than exp(mu).


Yeah, that’s also the distribution I went back to in order to verify :)
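Something along these lines (a scipy sketch; the particular mu and sigma are arbitrary):

import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import lognorm

mu, sigma = 1.0, 0.5

# scipy's lognorm uses s = sigma and scale = exp(mu)
dist = lognorm(s=sigma, scale=np.exp(mu))

# locate the mode numerically by minimizing the negative log density
res = minimize_scalar(lambda y: -dist.logpdf(y), bounds=(1e-6, 20), method='bounded')

print(res.x)                  # numerical mode
print(np.exp(mu - sigma**2))  # exp(mu - sigma^2) = 2.117..., matches
print(np.exp(mu))             # exp(mu) = 2.718..., the naive guess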

Here’s the math for it:
If Y = g^{-1}(X) with X\sim p_X and g monotonically increasing, then Y has density

p_Y(y) = p_X(g(y))\,g'(y)

where g' is the derivative of g. Taking the derivative of the whole density gives

p_Y'(y) = p_X'(g(y))\,g'(y)^2 + p_X(g(y))\,g''(y).

If we plug in the transformed mode of p_X, y^* = g^{-1}(X^*), the first term vanishes because p_X'(X^*) = 0, and we are left with p_X(X^*)\,g''(y^*). So unless g'' happens to vanish there, y^* is not even a stationary point of p_Y, and the mode moves.
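To tie this back to the original model, here is a scipy-based sketch (using the mode and concentration from the first post) that checks the derivative at the transformed mode and locates the actual optimum:

import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import beta, norm

conc, mode = 10.0, 0.25
alpha = mode * (conc - 2) + 1        # 3
beta_ = (1 - mode) * (conc - 2) + 1  # 7

# log density of Y = inv_Phi(X) for X ~ Beta(3, 7):
# log p_Y(y) = log p_X(Phi(y)) + log phi(y), with phi = Phi' the Jacobian
def neg_log_p(y):
    return -(beta.logpdf(norm.cdf(y), alpha, beta_) + norm.logpdf(y))

y_star = norm.ppf(mode)  # the naively transformed mode, y* = g^{-1}(X*)

# derivative of log p_Y at y*: the beta score vanishes at its own mode,
# so only the Jacobian term d/dy log phi(y) = -y survives; it is nonzero
# whenever mode != 0.5 (i.e. whenever y* != 0)
print(-y_star)  # 0.674... > 0, so the density is still increasing at y*

res = minimize_scalar(neg_log_p, bounds=(-5, 5), method='bounded')
print(norm.cdf(res.x))  # true mode mapped back: about 0.29, not 0.25

Note that the sign of -y_star also explains the plot in the first post: the bias is positive for modes below 0.5, and disappears exactly at 0.5, where y* = 0.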

I wrote a case study about reparameterization and optimization, but I’m not entirely happy with it (not that it’s wrong; it just doesn’t make the point clearly enough).

This isn’t a Stan issue per se; it’s a general issue with changes of variables and with mode-based point estimates not having any statistical interpretation.