I am using the functions defined in "Extreme value analysis and user defined probability functions in Stan" to model data with a generalized Pareto distribution. My problem is that my likelihood is evaluated inside a for-loop, so each call passes three real-valued arguments, whereas the gpd functions expect (vector, real, real) arguments.
I’m not sure that my model block is amenable to being vectorized, so I was thinking I would need the gpd functions to take real-valued arguments (but maybe I’m wrong).
I’d appreciate any help changing the code to achieve this. Here is my Stan code:
functions {
  real gpareto_lpdf(vector y, real k, real sigma) {
    // generalised Pareto log pdf
    int N = rows(y);
    real inv_k = inv(k);
    if (k<0 && max(y)/sigma > -inv_k)
      reject("k<0 and max(y)/sigma > -1/k; found k, sigma =", k, sigma);
    if (sigma<=0)
      reject("sigma<=0; found sigma =", sigma);
    if (fabs(k) > 1e-15)
      return -(1+inv_k)*sum(log1p((y) * (k/sigma))) -N*log(sigma);
    else
      return -sum(y)/sigma -N*log(sigma); // limit k->0
  }
  real gpareto_lcdf(vector y, real k, real sigma) {
    // generalised Pareto log cdf
    real inv_k = inv(k);
    if (k<0 && max(y)/sigma > -inv_k)
      reject("k<0 and max(y)/sigma > -1/k; found k, sigma =", k, sigma);
    if (sigma<=0)
      reject("sigma<=0; found sigma =", sigma);
    if (fabs(k) > 1e-15)
      return sum(log1m_exp((-inv_k)*(log1p((y) * (k/sigma)))));
    else
      return sum(log1m_exp(-(y)/sigma)); // limit k->0
  }
}
data {
  // the input data
  int<lower = 1> n; // number of observations
  real<lower = 0> value[n]; // value measurements
  int<lower = 0, upper = 1> censored[n]; // vector of 0s and 1s
  // parameters for the prior
  real<lower = 0> a;
  real<lower = 0> b;
}
parameters {
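  real<lower = 0> k; // lower bound matches the support of the gamma prior
  real<lower = 0> sigma; // scale must be positive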
}
model {
  // prior
  k ~ gamma(a, b);
  sigma ~ gamma(a, b);
  // likelihood
  for (i in 1:n) {
    if (censored[i]) {
      target += gpareto_lcdf(value[i] | k, sigma);
    } else {
      target += gpareto_lpdf(value[i] | k, sigma);
    }
  }
}
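
For reference, here is an untested sketch of the kind of change I have in mind: scalar counterparts of the two functions, obtained by dropping the `sum`, `max`, and `N` terms from the vector versions. The names `gpareto_real_lpdf` and `gpareto_real_lcdf` are hypothetical (they are not from the original functions), the `<lower = 0>` bounds on the parameters are my assumption to match the gamma priors, and the array and `fabs` syntax follows the code above, so it may need updating for newer Stan versions.

```stan
functions {
  // Hypothetical scalar variant of gpareto_lpdf: log density of one observation
  real gpareto_real_lpdf(real y, real k, real sigma) {
    if (k < 0 && y / sigma > -inv(k))
      reject("k<0 and y/sigma > -1/k; found k, sigma =", k, sigma);
    if (sigma <= 0)
      reject("sigma<=0; found sigma =", sigma);
    if (fabs(k) > 1e-15)
      return -(1 + inv(k)) * log1p(y * (k / sigma)) - log(sigma);
    else
      return -y / sigma - log(sigma); // limit k->0
  }
  // Hypothetical scalar variant of gpareto_lcdf: log cdf of one observation
  real gpareto_real_lcdf(real y, real k, real sigma) {
    if (k < 0 && y / sigma > -inv(k))
      reject("k<0 and y/sigma > -1/k; found k, sigma =", k, sigma);
    if (sigma <= 0)
      reject("sigma<=0; found sigma =", sigma);
    if (fabs(k) > 1e-15)
      return log1m_exp(-inv(k) * log1p(y * (k / sigma)));
    else
      return log1m_exp(-y / sigma); // limit k->0
  }
}
data {
  int<lower = 1> n;                      // number of observations
  real<lower = 0> value[n];              // value measurements
  int<lower = 0, upper = 1> censored[n]; // 0/1 indicator, used as in the loop above
  real<lower = 0> a;                     // prior parameters
  real<lower = 0> b;
}
parameters {
  real<lower = 0> k;     // assumption: bounded to match the gamma prior
  real<lower = 0> sigma; // assumption: bounded to match the gamma prior
}
model {
  k ~ gamma(a, b);
  sigma ~ gamma(a, b);
  for (i in 1:n) {
    if (censored[i]) {
      target += gpareto_real_lcdf(value[i] | k, sigma);
    } else {
      target += gpareto_real_lpdf(value[i] | k, sigma);
    }
  }
}
```

An alternative that would keep the original vector signatures might be to split `value` into censored and uncensored vectors up front and call each function once, but I have not tried that either.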