Jacobian adjustment of sigmoid (decentered)

Hello,

I was reading this document on the Jacobian adjustment for a logit transformation:

http://mc-stan.org/users/documentation/case-studies/mle-params.html

----- First question ----------------------------------------------------

If this is used

theta <- inv_logit(alpha);

The Jacobian adjustment for theta ~ uniform(0,1); is:

1) log(logit(alpha)) + log(1 - logit(alpha)) 

or

2)  log(inv_logit(alpha)) + log(1 - inv_logit(alpha)) // as in the document

I’m confused: shouldn’t we take the determinant of the derivative of the inverse of inv_logit (which is logit)?

----- Second question ----------------------------------------------------

If I have

gamma = logit(beta);

The Jacobian adjustment would be

log(inv_logit(alpha)) + log(1 - inv_logit(alpha))

?

----- Third question ----------------------------------------------------

If I have

gamma = logit(beta^ ( log( 0.5 ) / log( 0.2 ) ) ); // moves the 0 on 0.2 instead of 0.5.

(I need it because I want to map a simplex to the real space, so that a simplex of size 5 has its null state at [0.2, 0.2, 0.2, 0.2, 0.2].)

What would the Jacobian be? According to Wolfram Alpha it is a pretty big formula, but maybe I am missing something.

Thanks a lot

The transform from (0, 1) to (-infinity, infinity) is logit, so inv_logit is the inverse transform. You can remember which one to differentiate because it is always the function you apply to go from unconstrained to constrained.
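For the first question, the derivative can be written out directly. This is a short derivation, assuming alpha is the unconstrained parameter, theta = inv_logit(alpha), and the density is placed on theta as in the case study:

```latex
\begin{aligned}
\theta &= \operatorname{logit}^{-1}(\alpha) = \frac{1}{1 + e^{-\alpha}}, \\
\frac{d\theta}{d\alpha} &= \frac{e^{-\alpha}}{\left(1 + e^{-\alpha}\right)^{2}} = \theta\,(1 - \theta), \\
\log\left|\frac{d\theta}{d\alpha}\right|
  &= \log \operatorname{logit}^{-1}(\alpha) + \log\!\left(1 - \operatorname{logit}^{-1}(\alpha)\right).
\end{aligned}
```

This is option 2 from the question, the form used in the case study.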

You probably aren’t going to be doing this. The inverse transforms we use go from unconstrained to constrained, whereas logit goes from constrained to unconstrained.

In general, the Jacobian of a (continuous, monotonic) univariate function is just the absolute value of its derivative. We work on the log scale, so we use the log of the Jacobian.
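Applied to the transform in the third question, the derivative is shorter than it might first appear. The following is a minimal sketch, assuming beta in (0, 1) is the declared parameter and the density is placed on gamma = logit(beta^c) with c = log(0.5) / log(center). The function names are hypothetical, and the closed form is checked against a finite difference rather than taken on trust:

```python
import math


def decentered_logit(beta, center=0.2):
    """gamma = logit(beta ** c), with c chosen so that beta == center maps to 0."""
    c = math.log(0.5) / math.log(center)
    u = beta ** c
    return math.log(u) - math.log1p(-u)


def log_abs_jacobian(beta, center=0.2):
    """log |d gamma / d beta| = log(c) - log(beta) - log(1 - beta ** c)."""
    c = math.log(0.5) / math.log(center)
    return math.log(c) - math.log(beta) - math.log1p(-(beta ** c))


def finite_difference_log_jacobian(beta, center=0.2, h=1e-6):
    """Central-difference estimate of the same quantity, used only as a check."""
    upper = decentered_logit(beta + h, center)
    lower = decentered_logit(beta - h, center)
    return math.log(abs((upper - lower) / (2.0 * h)))


for beta in (0.05, 0.2, 0.5, 0.9):
    print(beta, log_abs_jacobian(beta), finite_difference_log_jacobian(beta))
```

The chain rule gives d gamma / d beta = c * beta^(c-1) / (beta^c * (1 - beta^c)), which simplifies to c / (beta * (1 - beta^c)).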

There’s a chapter of the manual that goes over exactly how Stan does the centered stick-breaking transform for the simplex and it works through the Jacobian.

Thanks Bob,

I found this section in the manual, Translated and Scaled Simplex, that uses

beta = beta_scale * (SIMPLEX - 1.0 / K)

instead of what I use

beta = logit( SIMPLEX ^ (log(0.5)/log(1.0/K)) )

Although they both map the same baseline simplex [1/K, 1/K, … 1/K] to [0,0,…0], my version is more complex and more problematic (I get an asymmetric posterior).

If I can avoid the use of a “decentered” logit transformation, do I avoid the Jacobian issue altogether? In the option presented in the manual, for obtaining a nice symmetric posterior over the reals centred at 0, should I use:
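The two mappings can be compared numerically. This is a small sketch with hypothetical function names and an arbitrary beta_scale; it only checks where each map sends the baseline simplex and whether the result sums to zero:

```python
import math


def scaled_simplex(simplex, beta_scale=1.0):
    """Manual's version: beta = beta_scale * (SIMPLEX - 1 / K)."""
    K = len(simplex)
    return [beta_scale * (x - 1.0 / K) for x in simplex]


def power_logit_simplex(simplex):
    """Poster's version: beta = logit(SIMPLEX ^ (log(0.5) / log(1 / K)))."""
    K = len(simplex)
    c = math.log(0.5) / math.log(1.0 / K)
    out = []
    for x in simplex:
        u = x ** c
        out.append(math.log(u) - math.log1p(-u))
    return out


baseline = [0.2] * 5
skewed = [0.6, 0.1, 0.1, 0.1, 0.1]

print(scaled_simplex(baseline), power_logit_simplex(baseline))
print(sum(scaled_simplex(skewed, beta_scale=2.0)))
print(sum(power_logit_simplex(skewed)))
```

Both maps send the baseline to zero up to rounding. The translated and scaled version always sums to zero because the simplex entries sum to 1, while the element-wise power-logit version does not in general.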

SIMPLEX ~ dirichlet(...)

or

beta ~ normal(0, ..)

I was also thinking that, depending on beta_scale, I will need to choose the right Dirichlet parameters (if I put a prior on SIMPLEX), and there may be infinitely many combinations of beta_scale and Dirichlet hyperparameters that produce the same posterior. How should I handle this scenario?

By the way, since I didn’t find the part you mentioned, I assume you were talking about a different section of the manual.

Thanks

I was talking about Chapter 36. Transformations of Constrained Variables. It shows how constrained parameters in Stan are transformed to the unconstrained scale, along with how they are transformed back and the log Jacobian determinant is calculated.
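As a rough illustration of what that chapter describes for the simplex, here is a sketch of the unconstrained-to-constrained direction together with its log Jacobian determinant. It follows my reading of the manual's stick-breaking formulas, z_k = inv_logit(y_k + log(1 / (K - k))) and x_k = (1 - sum of earlier x) * z_k. The function names are hypothetical and this is not Stan's actual implementation:

```python
import math


def inv_logit(v):
    return 1.0 / (1.0 + math.exp(-v))


def simplex_from_unconstrained(y):
    """Map y in R^(K-1) to a K-simplex; return (x, log |det Jacobian|)."""
    K = len(y) + 1
    x = []
    remaining = 1.0  # stick length not yet assigned to earlier entries
    log_jac = 0.0
    for k, y_k in enumerate(y, start=1):
        # The log(1 / (K - k)) offset makes y = 0 map to the uniform simplex.
        z_k = inv_logit(y_k + math.log(1.0 / (K - k)))
        # The Jacobian is triangular, so only the diagonal terms
        # dx_k/dy_k = z_k * (1 - z_k) * remaining contribute.
        log_jac += math.log(z_k) + math.log1p(-z_k) + math.log(remaining)
        x_k = remaining * z_k
        x.append(x_k)
        remaining -= x_k
    x.append(remaining)  # the last entry takes whatever stick is left
    return x, log_jac


x, log_jac = simplex_from_unconstrained([0.0, 0.0, 0.0, 0.0])
print(x, sum(x), log_jac)
```

With all unconstrained values at zero, each of the five entries comes out at 0.2 up to rounding, which is the centering asked for earlier in the thread.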

Thanks Bob,

Indeed, this could be the right alternative for me. I am trying to replicate the function in R:

image

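logit = qlogis  # base-R logit; assumption: the original logit() came from an add-on package

my_logit = function(x) {
     K = length(x)
     x_hat = c()
     for (i in 1:(K - 1)) {
         # fraction of the remaining stick: x[i] over 1 minus the sum of earlier entries
         z = x[i] / (1 - sum(x[0:(i-1)]))
         x_hat[i] = logit(z) - log(1 / (K - i))
     }
     # last entry is set so that the whole vector sums to zero
     x_hat[K] = -sum(x_hat[1:(K - 1)])
     return(x_hat)
}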

However, I could not find a numerically robust implementation. I get small nonzero residuals, so the mapping is not exact. For example:

my_logit(c(0.2, 0.2, 0.2, 0.2, 0.2))
[1]  0.000000e+00  0.000000e+00  2.220446e-16  4.440892e-16 -6.661338e-16

or

my_logit(c(0.25, 0.25, 0.25, 0.25))
[1]  0.000000e+00 -1.110223e-16  0.000000e+00  1.110223e-16

Does anyone happen to know a robust implementation? (unless I am missing or misunderstanding something obvious)

I might be getting it wrong, as I’m not 100% sure about the link between the verbal definition and the formula:

?? image == 1 - sum(x[0:(i-1)]) ??
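The residuals shown above are on the order of double-precision machine epsilon (about 2.2e-16), so they look like ordinary floating-point rounding rather than an error in the formula. The following sketch of the same constrained-to-unconstrained mapping assumes that the denominator is indeed 1 minus the sum of the earlier entries, which matches the code above. The function name is hypothetical, and results are compared to zero with a tolerance instead of exactly:

```python
import math


def stick_breaking_logit(x):
    """Map a K-simplex to K-1 unconstrained values."""
    K = len(x)
    y = []
    remaining = 1.0  # 1 minus the sum of the entries already processed
    for k in range(1, K):
        z_k = x[k - 1] / remaining
        # logit(z_k) - log(1 / (K - k)), written with log1p for accuracy
        y.append(math.log(z_k) - math.log1p(-z_k) + math.log(K - k))
        remaining -= x[k - 1]
    return y


for simplex in ([0.2] * 5, [0.25] * 4):
    y = stick_breaking_logit(simplex)
    # Exact zeros are not guaranteed in floating point; use a tolerance.
    print(y, all(math.isclose(v, 0.0, abs_tol=1e-12) for v in y))
```

Unlike my_logit, this returns only the K-1 free values and does not append a final entry to force a zero sum.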