Predictor variable can't contain -Inf as a value

Hi,

I would like to fit a model to a dataset, namely, I would like to model the response of a system to changes in sound intensity in dB.

The system displays a sigmoidal change in its response with increasing Sound to Noise ratio in dB which I want to model and whose parameters I want to estimate.

I have my recorded data Y(SNR) with several SNRs in dB: -3dB, -5dB, -7dB, and -Inf, which is a “no sound” condition, since Y = 10*log_10(I/I0), with I being the input signal, and I0 the (here, constant) noise.

I can’t have -Inf as a value in a predictor variable in Stan, or am I wrong?

Should I transform my data? I am unsure how to go about handling this issue.

Inf should be legal, although NA is not.

Hi,

In this case I’ll give more information because I can’t wrap my head around how to get this to work. My model is the following:


functions{
}

data {
  int Nobs; // number of data obervations
  real mu_low; //prior knowledge from external data
  real sigma_low; //prior knowledge from external data
  vector[Nobs] dat; // data, one per row
  vector[Nobs] SNR; // experimental factor
}


parameters {
  vector[3] b;// unconstrained parameters
  vector<lower = 0>[4] pos_b; // postive parameters
}

transformed parameters{
vector[Nobs] mu;
vector[Nobs] sigma;
real mu_max;
mu_max = - mu_low;

for(i in 1:Nobs){
     mu[i] = b[1] + ((b[2] - b[1])/(1+exp(-pos_b[2]*(SNR[i] - pos_b[4]))));
     sigma[i] = pos_b[3] + ((pos_b[1] - pos_b[3])/(1+exp(-b[3]*(SNR[i] - pos_b[4]))));
}

}
model {
  
  b[1] ~ normal(mu_low,1); // min
  b[2] ~ normal(mu_max,1); // max
  b[3] ~ normal(0,10); // slope sigma
  pos_b[1] ~ normal(sigma_low,10); // max sigma
  pos_b[2] ~ normal(0,10); // slope
  pos_b[3] ~ normal(sigma_low,10); // slope
  pos_b[4] ~ normal(3.5,1); // threshold
 
  
  //LLH
    target += normal_lpdf(dat| mu,sigma);
}

generated quantities{
  vector[Nobs] log_lik;

    for (i in 1:Nobs){
    log_lik[i] = normal_lpdf(dat[i]|mu[i], sigma[i]);
    }
}

The vector SNR contains values that can be either -Inf or 1, 2, 3, 4 or 5. When I try to run it in rstan, it says that

Chain 1: Rejecting initial value:
Chain 1:   Gradient evaluated at the initial value is not finite.
Chain 1:   Stan can't start sampling from this initial value.
Chain 1:

Which led me to think that it’s just not possible to do this in Stan.
So I guess my question is: is using -Inf as a value in my predictor variable in this context possible? If so, what am I doing wrong?

Thank you!

That is saying target went NaN or \mp \infty. If it were going to object to a predictor, it wouldn’t even have gotten that far. You can put print statements to figure out which element of mu or sigma is problematic.

So, I reduced the model to a minimum, and returned values of mu, sigma and the lp. They are all finite numbers. The model runs when I exclude datapoints for which SNR is not -Inf.

I browsed through the documentation but couldn’t find much help, I’m at a loss here. What steps can I take to figure out what is going on?


functions{
}

data {
  int Nobs; // number of data obervations
  vector[Nobs] dat; // data, one per row
  vector[Nobs] SNR; // experimental factor
}


parameters {
  vector[4] b;// unconstrained parameters
  vector<lower = 0>[1] pos_b; // postive parameters
}

transformed parameters{
vector[Nobs] mu;
vector[Nobs] sigma;

for(i in 1:Nobs){
     // print("pos_b")
     // print(pos_b)
     print("b")
     print(b)
     mu[i] = b[1] + ((b[2] - b[1])/(1+exp(-b[3]*(SNR[i] - b[4]))));
     print("mu")
     print(mu[i])
     sigma[i] = pos_b[1];
     print("sigma")
     print(sigma[i])
}

}
model {
  
  b[1] ~ normal(-0.5,1); // min
  b[2] ~ normal(0.5,1); // max
  b[3] ~ normal(0,10); // slope
  b[4] ~ normal(0,10); // x50
  pos_b[1] ~ normal(0,10); // sigma
 
  
  //LLH
    target += normal_lpdf(dat| mu,sigma);
        print(target())

}

generated quantities{
  vector[Nobs] log_lik;

    for (i in 1:Nobs){
    log_lik[i] = normal_lpdf(dat[i]|mu[i], sigma[i]);
    }
}

Autodiff won’t work with this when SNR=-inf because the derivative with respect to b[3] is zero times infinity. The expression is equivalent to

if (is_inf(-SNR[i]))
  mu[i] = (b[3] > 0) ? b[1] : b[2];
else
  mu[i] = b[1] + ((b[2] - b[1])/(1+exp(-b[3]*(SNR[i] - b[4]))));

This should differentiate properly. The discontinuity is not good news though.

OK, thank you for the fix and the explanation!

Does your last comment mean I can rely on your solution provided that the model gives reasonable outputs, or that I should turn to something else? I could remove the -Inf condition and use it to inform my priors for example. It would be a waste of data, but that’s better than running a bad model.

The model should work correctly as long as there’s no posterior uncertainty about the sign of b[3]. If the sign is uncertain, I think the samples are still correct, just inefficient.

OK, I’ll give it a try. Thank you very much!