Estimating variances: always infinite gradient


I’m having trouble estimating the variances in a discrete choice model using ordered probit in Stan 2.18.0. In my model, I know exactly the values of cutoff/cutpoints so I can identify and estimate the value of variances. What I want to estimate is the mean and variance of the normal distribution that links discrete choices to a latent continuous variable.

When I run the code (see below for a toy version of my stan code), I always gets the infinite gradient error:

Rejecting initial value:
  Gradient evaluated at the initial value is not finite.
  Stan can't start sampling from this initial value.
Initialization between (-2, 2) failed after 100 attempts. 

The reason I attribute the infinite gradient issue to estimating the variance rather than something else is that the code would run perfectly once I set the variances (sigma in the code) to a constant, e.g. 1, however under which scenario I can’t estimate the variance.

I tried initialization. Even at the true parameter values, I still got the error. I know that the derivative of normal with respect the variance (std to be precise) is proportional to 1/std which makes std =0 problematic. So I changed the lower bound on the variance to greater than 0, set to 1 in the code below. I still got infinite gradient error. I don’t know why this is the case. Any help is greatly appreciated. Thanks.

Here’s a toy version of my Stan code. It’s essentially the same as the ordered probit example in the Stan manual, adjusted to reflect that I know the cut points and I’m estimating both the mean and the variance. I’ve also uploaded an example data.

data {
  int<lower=1> totalresponse;
  int<lower=1> n;
  int<lower=1> numresponse;
  int<lower=1,upper=numresponse> y[totalresponse];
  ordered[numresponse+1] cutoff;

parameters {
  real<lower=0, upper=1> mu_theta;
  real<lower=1, upper=5> sigma;

model {
  vector[numresponse] X;
  sigma ~ cauchy(0, 3);
  for (j in 1:numresponse){
    X[j] = normal_cdf(cutoff[j+1], mu_theta, sigma) - normal_cdf(cutoff[j], mu_theta, sigma);
  for (i in 1:totalresponse){
    y[i] ~ categorical(X);
} (1.2 KB)
estimating_variance.stan (499 Bytes)

X needs to be a simplex here (entries are non-negative and sum to 1). That’s probably what is failing.

Maybe try categorical_logit(X) and see if the model runs?

Computing the normal_cdf can be a bit tricky as well. There’s a function Phi_approx that you can use to approximately replace these that might work a little better (check the manual for details on this).

Oh, I should read more closely, your X should sum to 1.0 here. I apologize for being lazy.

Looks like you can’t use infs on the input data.

I changed the input to:

cutoff <- 
c(-10000.0, 0.1, 0.3, 0.5, 0.714285714285714, 0.909090909090909, 10000.0)

And your model runs

1 Like

Thanks for the prompt reply, bbbales2. Your suggestion made my code work! I ignored the fact that the cutpoints also affect the derivatives. Thanks again.

1 Like

That makes sense more sense than what I said. Something could definitely end up turning into a NaN accidentally. Good catch.