I have a nonlinear transformation, similar to this one:
data {
vector[N] x;
vector[N] y;
}
parameters {
real sigma;
real mu;
real sigma2;
real mu2;
vector[N] real_x;
vector[N] real_y;
}
transformed parameters {
vector[N] m = sqrt(real_x.*real_x + real_y.*real_y);
}
model {
m ~ normal(mu, sigma);
real_x - x ~ normal(mu2, sigma2);
real_y - y ~ normal(mu2, sigma2);
}
Since the transformation is not linear, I have to add the jacobian determinant, but I don’t know how to do so.
The jacobian wouldn’t be square, since:
m:\mathbb{R}^2\longrightarrow\mathbb{R}
J = [x/m, y/m]^T
I’m quite lost here and probably the answer is very simple, but I’m stuck.
Thank you in advance!
Hi! Welcome to the Stan Discourse! :)
Tbh your problem is out of my league, but let me try to summon @betanalpha (I hope you don’t mind being tagged [PM me if you do], I thought these kind of questions are right up your alley).
1 Like
Thank you so much! I’m really stuck with this…
You can’t transform probability density functions between two real spaces of different dimensions – the Jacobian update rule is valid only when the input space and output space have the same dimension and the Jacobian matrix is square and the Jacobian determinant well-defined.
What you really have to do instead is buffer your output space with auxiliary variables to define a one-to-one transformation, compute the corresponding Jacobian, and then marginalize if you can.
For your transformation you could try to bring in the angle that complements the radius, i.e. transform from (real_x, real_y)
to m
, theta, but that works only if the possible angles are constrained so that you can avoid the singularity that arises at one of the values of m
or one of the values of theta
. You’ve run into a deeper problem in that you can’t transform densities between spaces of different topologies, and if you try you either get an infinite series (because the map from real_x
and real_y
to m
and theta
is multivalued) or singularities where the Jacobian seems to explode.
Here you can get around those troubles by specifying prior densities in each quadrant of the real_x, real_y
plane and then piecing the resulting Jacobians together with if-statements if they differ by quadrant. You can also try to avoid transforming between densities directly and try to construct cumulative distribution functions on the output space and differentiate those; see for example the derivation of the Rayleigh density functions in https://en.wikipedia.org/wiki/Rayleigh_distribution#Relation_to_random_vector_length.
Anyways, all of this technical trouble is a good indication that this is a weird prior model that will lead to weird posteriors and you may want to reconsider (if your data are weak and the prior on m
dominates then you end up with a highly degenerate, annular posterior density).
5 Likes
Thank you very much, this is really helpful.
It does seem as if the model is not good, though. I’ll follow your advice and try to use other parameters. In any case I understand much better now how this works.