Hi everyone

I’ve got yet another example of the gradients not being finite even though the log probability is finite at the initial parameters. I’m reaching out for help because I’ve been trying to debug this for days now! I have a very complex model, which I can share, but I’d like to check whether the answer is simple before asking for more involved help.

If I had to take a guess at my problem, it would be an underflow or overflow issue. I have about a dozen model parameters on very different scales: some best-fit values are around 1e-4 and others around 10,000, and all are strictly positive. My thinking to get around this was to sample on the log scale, shifting each parameter by the log of its prior mean, i.e.

```
parameters {
  real param1;
  real param2;
  // etc.
}
transformed parameters {
  real<lower=0> param1_tr = exp(param1 - 9);
  real<lower=0> param2_tr = exp(param2 + 9);
  // etc.
}
model {
  param1 ~ normal(0, 1);
  param2 ~ normal(0, 1);
  // etc.
  target += normal_lpdf(Y | mu, sigma);
}
```

such that param1_tr is around 1e-4 and param2_tr is around 1e4. These should be very close to the true values.
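To make my overflow/underflow worry concrete, here is a quick check of where `exp()` saturates in 64-bit doubles (written in Python purely as an illustration; Stan’s math library works in the same double precision):

```python
import math

# 64-bit doubles: exp() overflows just above ~709.78 and silently
# underflows to exactly 0.0 somewhere below about -745.
assert math.isfinite(math.exp(709))   # ~8.2e307, still representable

try:
    math.exp(710)                     # just past the overflow threshold
    overflowed = False
except OverflowError:
    overflowed = True
assert overflowed

assert math.exp(-750) == 0.0          # silent underflow to zero

# With exp(param2 + 9) and param2 ~ normal(0, 1), the transform itself
# is safe near init, but downstream quantities like (y - mu)^2 / sigma^2
# can still blow past ~1.8e308 when sigma is tiny.
```

So the transforms look safe near the initial values, but intermediate quantities in the likelihood might not be.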

When I take a look at the gradients using `grad_log_prob(fit, upars)`, where `upars` is a vector of zeros with a 1 in the sigma position, I get

```
 [1]      NaN      NaN      NaN      NaN      NaN      NaN      NaN      NaN
 [9]      NaN      NaN      NaN 31295823
attr(,"log_prob")
[1] -15653355
```

There is a magic number for the initialisation values: if they are all less than around -0.035, the gradients are finite. I find this quite puzzling.
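For what it’s worth, this is the kind of mechanism I suspect behind the NaN gradients: once any intermediate value overflows to `inf`, combining infinities in the gradient arithmetic yields NaN, and a tiny shift in the initial values can push an intermediate across that threshold. A toy illustration in plain Python (the names `y`, `mu`, `sigma` are made up, not my actual model):

```python
import math

# Toy illustration: a z-score with a tiny sigma overflows when squared.
y, mu, sigma = 1.0, 0.0, 1e-200

z = (y - mu) / sigma   # 1e200, still finite
zsq = z * z            # silently overflows to inf
print(zsq)             # inf

# A gradient term that combines two overflowed pieces becomes NaN,
# which is exactly what grad_log_prob reports.
grad_term = zsq - z * z   # inf - inf
print(grad_term)          # nan
assert math.isinf(zsq) and math.isnan(grad_term)
```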

My question is: are those transformations the likely culprit? If so, could anyone suggest a better way to rescale the parameters?

Alternatively, would you expect this to be fine? If so, I’ll prepare a simulated dataset and upload the full model.

Many thanks for your help; I love this forum!

Chris

Operating System: Ubuntu

Interface Version: rstan 2.21

Compiler/Toolkit: gcc