I’m having an issue where the gradients I calculate for a model with a constrained parameter don’t match those from the equivalent unconstrained model, even after I transform my variables to the unconstrained scale.
Suppose I specify a model:
data {
  int<lower=0> N;
  real y[N];
}
parameters {
  real mu;
  real sigma;
}
model {
  y ~ normal(mu, sigma);
}
I then use PyStan to calculate the gradient of the log probability, with data:
{'N':1, 'y':[0]}
and obtain:
stanfit.grad_log_prob([1, 2]) [= (-1/4,-3/8)]
which I can match by hand, so this looks fine.
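To show my working, here is a short sketch of that hand calculation in plain Python (not the Stan model itself): it evaluates the analytic gradient of the unnormalized normal log density at (mu, sigma) = (1, 2) with y = 0, and sanity-checks it against a central finite difference.

```python
import math

def log_prob(mu, sigma, y=0.0):
    # Unnormalized log density of y ~ normal(mu, sigma):
    # -log(sigma) - (y - mu)^2 / (2 * sigma^2);
    # the -0.5 * log(2 * pi) constant drops out of the gradient.
    return -math.log(sigma) - (y - mu) ** 2 / (2 * sigma ** 2)

def grad(mu, sigma, y=0.0):
    # Analytic gradient with respect to (mu, sigma).
    d_mu = (y - mu) / sigma ** 2
    d_sigma = -1.0 / sigma + (y - mu) ** 2 / sigma ** 3
    return d_mu, d_sigma

# Evaluate at the same point as the grad_log_prob call above.
g = grad(1.0, 2.0)
print(g)  # (-0.25, -0.375), i.e. (-1/4, -3/8)

# Central finite-difference check of both partial derivatives.
eps = 1e-6
fd_mu = (log_prob(1.0 + eps, 2.0) - log_prob(1.0 - eps, 2.0)) / (2 * eps)
fd_sigma = (log_prob(1.0, 2.0 + eps) - log_prob(1.0, 2.0 - eps)) / (2 * eps)
```

So the (-1/4, -3/8) returned by grad_log_prob is exactly what I compute by hand.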
If I instead constrain sigma to be positive and repeat the exercise using the following model:
data {
  int<lower=0> N;
  real y[N];
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  y ~ normal(mu, sigma);
}
then, calculating the gradients at the equivalent unconstrained point (np.log(2) being the unconstrained value of sigma = 2), I obtain:
stanfit.grad_log_prob([1, np.log(2)], adjust_transform=True) [=(-1/4,1/4)]
stanfit.grad_log_prob([1, np.log(2)], adjust_transform=False) [=(-1/4,3/4)]
So neither of the gradients for sigma matches the -3/8 I had with the unconstrained model, which seems weird.
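For context, my understanding of Stan's lower-bound transform (which may be exactly where I'm going wrong) is that the sampler works on the unconstrained variable $\zeta = \log \sigma$, the gradient is taken with respect to $\zeta$, and the Jacobian adjustment adds $\log\left|\frac{d\sigma}{d\zeta}\right| = \zeta$ to the target:

```latex
\sigma = e^{\zeta},
\qquad
\frac{\partial \log p}{\partial \zeta}
  = \frac{\partial \log p}{\partial \sigma}\,\frac{d\sigma}{d\zeta}
  = \sigma \, \frac{\partial \log p}{\partial \sigma},
\qquad
\log p_{\mathrm{adj}}(\zeta) = \log p\bigl(\sigma(\zeta)\bigr) + \zeta
```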
Am I misinterpreting something here?