I would like to fit a model to a dataset, namely, I would like to model the response of a system to changes in sound intensity in dB.
The system displays a sigmoidal change in its response with increasing Sound to Noise ratio in dB which I want to model and whose parameters I want to estimate.
I have my recorded data Y(SNR) with several SNRs in dB: -3dB, -5dB, -7dB, and -Inf, which is a “no sound” condition, since Y = 10*log_10(I/I0), with I being the input signal, and I0 the (here, constant) noise.
I can’t have -Inf as a value in a predictor variable in Stan, or am I wrong?
Should I transform my data? I am unsure how to go about handling this issue.
In this case I’ll give more information because I can’t wrap my head around how to get this to work. My model is the following:
functions{
}
data {
int Nobs; // number of data obervations
real mu_low; //prior knowledge from external data
real sigma_low; //prior knowledge from external data
vector[Nobs] dat; // data, one per row
vector[Nobs] SNR; // experimental factor
}
parameters {
vector[3] b;// unconstrained parameters
vector<lower = 0>[4] pos_b; // postive parameters
}
transformed parameters{
vector[Nobs] mu;
vector[Nobs] sigma;
real mu_max;
mu_max = - mu_low;
for(i in 1:Nobs){
mu[i] = b[1] + ((b[2] - b[1])/(1+exp(-pos_b[2]*(SNR[i] - pos_b[4]))));
sigma[i] = pos_b[3] + ((pos_b[1] - pos_b[3])/(1+exp(-b[3]*(SNR[i] - pos_b[4]))));
}
}
model {
b[1] ~ normal(mu_low,1); // min
b[2] ~ normal(mu_max,1); // max
b[3] ~ normal(0,10); // slope sigma
pos_b[1] ~ normal(sigma_low,10); // max sigma
pos_b[2] ~ normal(0,10); // slope
pos_b[3] ~ normal(sigma_low,10); // slope
pos_b[4] ~ normal(3.5,1); // threshold
//LLH
target += normal_lpdf(dat| mu,sigma);
}
generated quantities{
vector[Nobs] log_lik;
for (i in 1:Nobs){
log_lik[i] = normal_lpdf(dat[i]|mu[i], sigma[i]);
}
}
The vector SNR contains values that can be either -Inf or 1, 2, 3, 4 or 5. When I try to run it in rstan, it says that
Chain 1: Rejecting initial value:
Chain 1: Gradient evaluated at the initial value is not finite.
Chain 1: Stan can't start sampling from this initial value.
Chain 1:
Which led me to think that it’s just not possible to do this in Stan.
So I guess my question is: is using -Inf as a value in my predictor variable in this context possible? If so, what am I doing wrong?
That is saying target went NaN or \mp \infty. If it were going to object to a predictor, it wouldn’t even have gotten that far. You can put print statements to figure out which element of mu or sigma is problematic.
So, I reduced the model to a minimum, and returned values of mu, sigma and the lp. They are all finite numbers. The model runs when I exclude datapoints for which SNR is not -Inf.
I browsed through the documentation but couldn’t find much help, I’m at a loss here. What steps can I take to figure out what is going on?
functions{
}
data {
int Nobs; // number of data obervations
vector[Nobs] dat; // data, one per row
vector[Nobs] SNR; // experimental factor
}
parameters {
vector[4] b;// unconstrained parameters
vector<lower = 0>[1] pos_b; // postive parameters
}
transformed parameters{
vector[Nobs] mu;
vector[Nobs] sigma;
for(i in 1:Nobs){
// print("pos_b")
// print(pos_b)
print("b")
print(b)
mu[i] = b[1] + ((b[2] - b[1])/(1+exp(-b[3]*(SNR[i] - b[4]))));
print("mu")
print(mu[i])
sigma[i] = pos_b[1];
print("sigma")
print(sigma[i])
}
}
model {
b[1] ~ normal(-0.5,1); // min
b[2] ~ normal(0.5,1); // max
b[3] ~ normal(0,10); // slope
b[4] ~ normal(0,10); // x50
pos_b[1] ~ normal(0,10); // sigma
//LLH
target += normal_lpdf(dat| mu,sigma);
print(target())
}
generated quantities{
vector[Nobs] log_lik;
for (i in 1:Nobs){
log_lik[i] = normal_lpdf(dat[i]|mu[i], sigma[i]);
}
}
Does your last comment mean I can rely on your solution provided that the model gives reasonable outputs, or that I should turn to something else? I could remove the -Inf condition and use it to inform my priors for example. It would be a waste of data, but that’s better than running a bad model.
The model should work correctly as long as there’s no posterior uncertainty about the sign of b[3]. If the sign is uncertain, I think the samples are still correct, just inefficient.