I’m having an issue where the gradients I calculate for a model with a constrained parameter don’t match those from the equivalent unconstrained model, even after I transform my variables to the unconstrained scale.
Suppose I specify a model:
data {
  int<lower=0> N;
  real y[N];
}
parameters {
  real mu;
  real sigma;
}
model {
  y ~ normal(mu, sigma);
}
I then use PyStan to calculate the gradient of the log probability, with data:
{'N':1, 'y':[0]}
and obtain:
stanfit.grad_log_prob([1, 2]) [= (-1/4,-3/8)]
which I can match by hand, so this looks fine.
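To show my working, here is a short sketch of that hand calculation in plain Python (not the Stan model itself): it evaluates the analytic gradient of the unnormalized normal log density at (mu, sigma) = (1, 2) with y = 0, and sanity-checks it against a central finite difference.

```python
import math

def log_prob(mu, sigma, y=0.0):
    # Unnormalized log density of y ~ normal(mu, sigma):
    # -log(sigma) - (y - mu)^2 / (2 * sigma^2);
    # the -0.5 * log(2 * pi) constant drops out of the gradient.
    return -math.log(sigma) - (y - mu) ** 2 / (2 * sigma ** 2)

def grad(mu, sigma, y=0.0):
    # Analytic gradient with respect to (mu, sigma).
    d_mu = (y - mu) / sigma ** 2
    d_sigma = -1.0 / sigma + (y - mu) ** 2 / sigma ** 3
    return d_mu, d_sigma

# Evaluate at the same point as the grad_log_prob call above.
g = grad(1.0, 2.0)
print(g)  # (-0.25, -0.375), i.e. (-1/4, -3/8)

# Central finite-difference check of both partial derivatives.
eps = 1e-6
fd_mu = (log_prob(1.0 + eps, 2.0) - log_prob(1.0 - eps, 2.0)) / (2 * eps)
fd_sigma = (log_prob(1.0, 2.0 + eps) - log_prob(1.0, 2.0 - eps)) / (2 * eps)
```

So the (-1/4, -3/8) returned by grad_log_prob is exactly what I compute by hand.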
If I instead constrain sigma to be positive and repeat the exercise using the following model:
data {
  int<lower=0> N;
  real y[N];
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  y ~ normal(mu, sigma);
}
then, calculating the gradients at the equivalent unconstrained point (np.log(2) being the unconstrained value of sigma = 2), I obtain:
stanfit.grad_log_prob([1, np.log(2)], adjust_transform=True) [=(-1/4,1/4)]
stanfit.grad_log_prob([1, np.log(2)], adjust_transform=False) [=(-1/4,3/4)]
So neither of the gradients for sigma matches the -3/8 I had with the unconstrained model, which seems weird.
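For context, my understanding of Stan's lower-bound transform (which may be exactly where I'm going wrong) is that the sampler works on the unconstrained variable $\zeta = \log \sigma$, the gradient is taken with respect to $\zeta$, and the Jacobian adjustment adds $\log\left|\frac{d\sigma}{d\zeta}\right| = \zeta$ to the target:

```latex
\sigma = e^{\zeta},
\qquad
\frac{\partial \log p}{\partial \zeta}
  = \frac{\partial \log p}{\partial \sigma}\,\frac{d\sigma}{d\zeta}
  = \sigma \, \frac{\partial \log p}{\partial \sigma},
\qquad
\log p_{\mathrm{adj}}(\zeta) = \log p\bigl(\sigma(\zeta)\bigr) + \zeta
```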
Am I misinterpreting something here?