Using cmdstan 2.24.1. The model runs fine (no errors) and the parameter distributions come back as expected. However, for everything declared in the generated quantities block, I get nans, which I’m interpreting to mean overflow.
data {
int<lower=1> n; //number of elements in glc data
int<lower=1> nhol; //number of elements in hol data
int<lower=1> nwet; //number of elements in wet data
int<lower=1> k; //start index
vector[n] x_glc; // x for glc data
vector[n] xerr_glc; // err
vector[nhol] x_hol; // x for hol data
vector[nhol] xerr_hol; // err
vector[nwet] x_wet; // x for wet data
vector[nwet] xerr_wet; // err
vector[n] y_glc; // y for glc data
vector[n] yerr_glc; // err
vector[nhol] y_hol; // y for hol data
vector[nhol] yerr_hol; // err
vector[nwet] y_wet; // y for wet data
vector[nwet] yerr_wet; // err
}
parameters {
real<lower=1e-5> alpha; // slope
real<lower=1e-5> beta; // intcpt
vector<lower=1e-15>[n] xtrue; //true ion
}
model {
xtrue ~ normal(x_glc, xerr_glc);
y_glc[k:n] ~ normal(beta + alpha*xtrue[k:n], yerr_glc[k:n]); // possible to use subset of full data
}
generated quantities {
vector[n] xtrue_glc;
vector[nhol] xtrue_hol;
vector[nwet] xtrue_wet;
vector[n] ytrue_glc;
vector[nhol] ytrue_hol;
vector[nwet] ytrue_wet;
vector[n] pcorr_glc;
vector[nhol] pcorr_hol;
vector[nwet] pcorr_wet;
vector[n] ypred_glc;
vector[nhol] ypred_hol;
vector[nwet] ypred_wet;
for(i in 1:n) {
xtrue_glc[i] = normal_rng(x_glc[i], xerr_glc[i]);
ytrue_glc[i] = normal_rng(y_glc[i], yerr_glc[i]);
pcorr_glc[i] = a * xtrue_glc[i];
ypred_glc[i] = ytrue_glc[i] - pcorr_glc[i];
}
for(i in 1:nhol) {
xtrue_hol[i] = normal_rng(x_hol[i], xerr_hol[i]);
ytrue_hol[i] = normal_rng(y_hol[i], yerr_hol[i]);
pcorr_hol[i] = a * xtrue_hol[i];
ypred_hol[i] = ytrue_hol[i] - pcorr_hol[i];
}
for(i in 1:nwet) {
xtrue_wet[i] = normal_rng(x_wet[i], xerr_wet[i]);
ytrue_wet[i] = normal_rng(y_wet[i], yerr_wet[i]);
pcorr_wet[i] = a * xtrue_wet[i];
ypred_wet[i] = ytrue_wet[i] - pcorr_wet[i];
}
}
Update. There is a mistake in the code above. I have “a” in generated quantities instead of “alpha” as in the model block, but this error occurred while I was copying code here, so it was not the cause of failure. I tried a very similar, stripped down version of this code (see below) and it works fine, meaning I get back the generated quantities. As soon as I add the additional for loops with the quantities described in them, all generated quantities return NaNs despite no errors generated and the model block working fine. This is merely a nuisance in the end as I can take care of the stuff in the longer version of the generated quantities block (seen above) outside of Stan. However, I’m really curious to know what is going on.
Thanks for the response. I have since figured out why the entire generated quantities block was returning NaNs. The issue was a couple of very small negative numbers that were passed in as part of x_hol. It did not occur to me that negative numbers (in a practical sense they were just zeros) would be a problem. Once I dealt with those two data points (by equating them to very small positive numbers), everything worked. What I do not understand is why this caused all generated quantities to bomb instead of just the predictions in the “for i in 1:nhol” loop.
if generated quantities encounters an error, processing stops for that block.
the variables declared in the block are output and those variables have been initialized to NaNs.
no error is raised, and the sampler continues to the next iteration.
this is because the computation done in generated quantities doesn’t change the parameter estimates for the draw; the sampler produced a valid draw of the parameter values from the posterior, therefore you have a valid sample. if some computation in the generated quantities block is invalid due to bad data, well, c’est la vie.
Got it. I wasn’t aware that the computation stopped for the entire block in case of bad data.
In any case, the data in question was not necessarily bad, meaning the 2 data points (out of a few hundred) were essentially zeros within uncertainties - slightly negative real numbers as I mentioned before. This is something that happens quite often with measurements actually. What is the most elegant way to deal with these situations? How to sample in the generated quantities block if some of the data that define your zero intercept (in a linear regression model) are negative?