If Jacobian for X is Y, is Jacobian for X/Z still Y?

Hopefully a quick Q:

I have a model that I previously set up with a non-linear change of variables to a quantity X. I put a prior on X and thus had to do a Jacobian adjustment, which worked out to be Y:

model{
   real X = ... ;
   X ~ std_normal() ;
   target += Y ;
}

If I now want to instead put the prior on X/Z, then since the new transform is linear, do I simply keep using the original Jacobian Y, as in:

model{
  real X = ... ;
  real X_div_Z = X/Z ;
  X_div_Z ~ std_normal() ;
  target += Y ;
}

Or am I not understanding the proper method by which the correct Jacobian is determined?

Need more information to tell.

If Z is data, then that’s no problem. If Z is a parameter (or depends on parameters), it could be a problem.
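To spell that out with a quick bit of algebra (treating X as the underlying parameter for the moment): if you put the prior on W = X/Z, the induced density on X is

$$p_X(x) \;=\; p_W\!\left(\frac{x}{z}\right) \cdot \frac{1}{\lvert z \rvert},$$

so the log-Jacobian term is -log|z|. When Z is data that term is constant and can be dropped; when Z is a parameter it varies and has to go into target.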

The question here is how many parameters you are transforming from and how many to. If it’s the same number, the inverse exists, and you can take derivatives, then most of the work is done (more here: 10.1 Changes of Variables | Stan Reference Manual).
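For reference, the one-to-one rule that section covers: if y = f(x) is invertible and differentiable, then

$$p_X(x) \;=\; p_Y\big(f(x)\big)\,\bigl\lvert \det J_f(x) \bigr\rvert,$$

so the Stan pattern is to write f(X) ~ dist and then increment target by the log absolute determinant of J_f evaluated at the current X.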

So can you describe more about the transformation here?

Thanks, and for sure! Both X and Z are scalar reals. X is the magnitude of a vector parameter Q of length 2:

parameters{
  real Z;
  vector[2] Q;
}
transformed parameters{
  real X = sqrt(dot_self(Q)) ;
}
model{
  real X_div_Z = X/Z ; 
  X_div_Z ~ std_normal() ;
  target += ?
}

real X_div_Z = X / Z is two parameters in, one parameter out, so it doesn’t fit into the change-of-variables framework. Same with the Q -> X step. This comes up with transforms from unconstrained space to polar coordinates. The argument against is that the transform breaks down at the origin: Q = (0, 0) gives r = 0 with no well-defined theta. The argument for is that, regardless, it seems to do the expected thing.
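To make the polar connection concrete (standard calculus, nothing Stan-specific): writing Q = (r cos(theta), r sin(theta)), the Jacobian determinant of the map from (r, theta) to Q is

$$\left\lvert \det \frac{\partial (Q_1, Q_2)}{\partial (r, \theta)} \right\rvert \;=\; r,$$

so a density on Q picks up a factor of r when re-expressed in (r, theta), and at r = 0 that factor vanishes while theta becomes unidentified, which is exactly the origin problem above.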

I searched a bit and didn’t find it, but it’s come up a few times here. If this is relevant, search around a bit. If you can’t find it let me know and I’ll dig more.

Just because you can’t add a Jacobian for your transform doesn’t mean you can’t do this sort of thing in your model. It just means you’re inventing a prior that nobody else is using, and that’s not necessarily a bad thing (assuming it normalizes, which it might not; assuming it better encodes prior information about your model; and assuming it isn’t too hard to communicate).

Ah, cool! And presumably the proper procedure in this case would be to sample the prior to double-check that there are no unexpected side effects? As in:

data{
  int<lower=0,upper=1> sample_prior_only ;
  ...
}
parameters{
  real Z;
  vector[2] Q;
}
transformed parameters{
  real X = sqrt(dot_self(Q)) ;
}
model{
  real X_div_Z = X/Z ;
  X_div_Z ~ std_normal() ;
  if (!sample_prior_only) {
    // likelihood stuff here
  }
}
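One practical wrinkle with that sketch: X_div_Z is local to the model block, so it won’t show up in the saved draws. If you want to inspect its implied prior directly, you could recompute it in generated quantities, something like this (the name X_div_Z_gq is arbitrary):

generated quantities{
  real X_div_Z_gq = X / Z ;  // recomputed here so it gets saved in the output
}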

You’d probably want to try to check the normalization thing on paper.

Write out your log density in terms of the parameters Z and Q and see if you can integrate over everything. If the integrals aren’t analytic, try to bound them with ones that are and integrate those. It should give you some insight into what the problems might be, at least.
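For instance, with only the statements above and flat priors on Q and Z, the unnormalized density is exp(-(|Q|/Z)^2 / 2). Integrating out Q in polar coordinates (my arithmetic, worth double-checking):

$$\int_{\mathbb{R}^2} e^{-\lVert q \rVert^2 / (2 z^2)}\, dq \;=\; 2\pi \int_0^\infty r\, e^{-r^2 / (2 z^2)}\, dr \;=\; 2\pi z^2,$$

and the remaining integral of 2πz² over all real z diverges. So this construction only normalizes if Z gets a proper prior with finite second moment, and you’d still want to think about behavior near Z = 0, where X/Z blows up.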

If it normalizes, then it’s up to you to communicate what you’re doing. Since it’s something non-standard, you should probably put together a couple slides to convince yourself and anyone else coming along in the future that it’s a reasonable thing to do (what are the parameters of this prior, how does it change, etc.).

Of course, I don’t ever remember how a gamma distribution is parameterized, so I gotta go read about it whenever one pops up. But the disadvantage you have here is that Wikipedia isn’t gonna have a page for what you’re doing.