I am trying to import a dataset (`y_raw`) with missing values in multiple variables into Stan (using `cmdstan`). I have replaced the missing values (NAs) with an easily recognizable number (-999) that lies outside the range of the observed data. I created a row vector in the “parameters” block (`y_mis`) to hold the imputed values, which I then want to substitute for the missing values in the dataset in “transformed parameters”. For this I wrote a for-loop that inserts an imputed value whenever it finds the missing-value code, -999. To do this I need an index variable that ticks upwards every time a missing value is found. However, the “transformed parameters” block doesn’t accept integer declarations, and an integer is needed to index the vector. I therefore created a variable of type real and tried casting it to an integer with `to_int()`. This approach doesn’t work, because `to_int()` apparently only works on data and not on parameters. A different approach is needed. Any suggestions?
```stan
data {
  int<lower=1> P; // number of variables
  int<lower=0> N; // number of observations
  array[P] int N_mis; // number of missing observations per variable
  array[N] row_vector[P] y_raw; // missing values are coded as -999
}
parameters {
  row_vector[sum(N_mis)] y_mis; // imputed values
}
transformed parameters {
  array[P] row_vector[N] y;
  real total_missing_so_far = 0.0;
  for (p in 1:P) {
    // This is the line that fails: to_int() is rejected for non-data arguments
    int y_mis_index = to_int(total_missing_so_far + 1);
    for (n in 1:N) {
      // y_raw is indexed [observation, variable]; y is indexed [variable, observation]
      if (y_raw[n, p] == -999) {
        y[p, n] = y_mis[y_mis_index];
        y_mis_index = y_mis_index + 1;
      } else {
        y[p, n] = y_raw[n, p];
      }
    }
    total_missing_so_far = total_missing_so_far + N_mis[p];
  }
}
```
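
One direction that might work, offered as a minimal untested sketch rather than a definitive fix: as far as I understand the language rules, integer variables are only disallowed at the top level of “transformed parameters”, while local integers declared inside a nested block or loop are accepted. Because `N_mis` is already integer data, the running counter never has to be real, so `to_int()` can be dropped entirely. The counter name `k` and the extra bounds on `N_mis` are my own assumptions, `y_raw` is indexed as `[n, p]` to match its declaration, and the model block is left out.

```stan
data {
  int<lower=1> P;                        // number of variables
  int<lower=0> N;                        // number of observations
  array[P] int<lower=0, upper=N> N_mis;  // missing observations per variable (bounds are an assumption)
  array[N] row_vector[P] y_raw;          // missing values are coded as -999
}
parameters {
  row_vector[sum(N_mis)] y_mis;          // one imputed value per missing entry
}
transformed parameters {
  array[P] row_vector[N] y;
  {
    // Local integer counter (hypothetical name k); it lives in a nested
    // block, so it is not treated as a transformed parameter.
    int k = 1;
    for (p in 1:P) {
      for (n in 1:N) {
        if (y_raw[n, p] == -999) {
          y[p, n] = y_mis[k];  // fill the gap with the next imputed value
          k += 1;
        } else {
          y[p, n] = y_raw[n, p];  // copy the observed value unchanged
        }
      }
    }
  }
}
```

Since `k` runs through the missing entries variable by variable, the per-variable offset from `N_mis` is not needed inside the loop. `N_mis` is still used to size `y_mis`, so it has to agree with the number of -999 entries in `y_raw`.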