model_file = “model_loc_0.stan”
stan_model = stan(model_file, data = gev_data, ini = initf_manual,iter = 3500,
chains = 1, control=list(adapt_delta=0.85))
The san model:
functions{
vector gev_v_log_v(vector y, real mu, real sigma, real xi) {
real inv_xi;
real neg_inv_xi;
real inv_xi_p1;
vector[rows(y)] z;
vector[rows(y)] lp;
int N;
N = rows(y);
z = 1 + (y - mu) * xi / sigma;
inv_xi = 1 / xi;
neg_inv_xi = -inv_xi;
inv_xi_p1 = 1 + inv_xi;
for(n in 1:N){
lp[n] = log(sigma) + inv_xi_p1*log(z[n]) + pow(z[n],neg_inv_xi);
}
return -lp;
}
}
data {
int<lower=1> ns; // number of sites
int<lower=1> nt; // number of times
vector [nt] y[ns]; //data
}
parameters {
real<lower=0> mu[ns];
real<lower=0> sigma[ns];
real<lower=-1,upper=1> xi[ns];
}
model {
vector[nt] lp[ns];
for (i in 1:ns){
lp[i] = gev_v_log_v(y[i], mu[i], sigma[i], xi[i]);
}
}
I think the typical route is to declare your missing data separately to your observed data . Your observed data is declared in the data block, missing data is declared in the parameters block and treated as a parameter with the same distribution as your observed data. The example at the very top of the link above should give you a good sense of what to do.
Note also, so far as I can see you model is incomplete since it doesn’t use a sampling statement or target+=.
You could use the GNU R package AMELIA II to impute your dataset. From that you have something defined to work on and use that for comparison of your Stan Imputation.
You could also have a look at Bayesian PPCA for missing value imputation in the case of multivariate data with missing values. I personally don’t know a way to do imputation for multivariate data in Stan directly.