I am having problems in the initialization phase of a collection of models. I have a likelihood for my data (generalized extreme value, GEV) and I am modeling some parameters as time series -sometimes I consider the parameters constant, others autoregressive of order 1 and others I include a seasonality term-, which I believe to be where the problem lies.
The following model -an example of the set- is failing to initialize. The location parameter is an autoregressive model of order 1 (AR(1)) with a seasonality term of order 12. The scale parameter is an AR(1).
#include ../gev_vvv.stan
data {
int<lower=0> N;
vector<lower=0>[N] y;
int idxs[N];
real<lower=3> nu; // Number of sigmas to avoid zero valued parameters
}
parameters {
// Location
real<lower=0> locm;
vector<lower=0>[N] loct;
real<lower=0> sigmal;
real<lower=0, upper=1> Phil;
real<lower=0, upper=1> Laml;
// Scale
real<lower=0> escm;
vector<lower=0>[N] esct;
real<lower=0> sigmae;
real<lower=0, upper=1> Phie;
// Shape
real<lower=-.5, upper=.5> form;
}
transformed parameters {
vector<lower=-.5, upper=.5>[N] fort;
// LOCATION
real<lower=0, upper=1> sigmaol= nu * sigmal / sqrt((1 - Phil^2) * (1 - Laml^2)) / locm;
real<lower=0> sigmasl = sqrt(sigmal^2 / (1 - Laml^2));
// SCALE
real<lower=0, upper=1> sigmaoe = nu * sigmae / sqrt((1 - Phie^2)) / escm;
// SHAPE
real<lower=0, upper=1> formplushalf = form + .5;
for (i in 1:N) {
fort[i] = form;
}
}
model {
// LOCATION
locm ~ gamma(5, 1./20);
sigmaol ~ beta(2, 4);
Phil ~ beta(2, 4);
Laml ~ beta(2, 4);
// SCALE
escm ~ gamma(4, 1./10);
sigmaoe ~ beta(2, 4);
Phie ~ beta(2, 4);
// SHAPE
formplushalf ~ beta(4, 4);
// LOCATION
loct[1] ~ normal(locm, sigmaol * locm / nu);
loct[2:13] ~ normal(Phil * loct[1:12] + (1. - Phil) * locm, sigmasl);
loct[14:N] ~ normal(Phil * loct[13:(N-1)] + Laml * loct[2:(N-12)] - Phil * Laml * loct[1:(N-13)] + (1. - Phil - Laml + Phil * Laml) * locm, sigmal);
// SCALE
esct[1] ~ normal(escm, sigmaoe * escm / nu);
esct[2:N] ~ normal(Phie * esct[1:(N-1)] + (1. - Phie) * escm, sigmae);
// SHAPE
y[idxs] ~ gev(loct[idxs], esct[idxs], fort[idxs]);
// LOCATION
target += -log(sigmaol) + log(sigmal);
// SCALE
target += -log(sigmaoe) + log(sigmae);
// SHAPE
}
I am afraid that I may have severely misunderstood some of the concepts related to bounds, priors and changes of variables, and I cannot wrap my head around the problem, so I would really appreciate any helping hand.
My thinking goes as follows:
When I define a parameter as an AR(1), this implies that I have the following parameters:
(1) the average of the process (\mu_m, locm in the code)
(2) the autocorrelation coefficient (\Phi_l, Phil in the code)
(3) the standard deviation of the noise process (\sigma_l, sigmal in the code).
In the example I have an additional parameter introduced by the seasonal component, but its behavior is similar, so I exclude it from this explanation.
The standard deviation of the AR(1) is \sigma^* = \dfrac{\sigma_l}{\sqrt{1 - \Phi^2}} and assuming my AR(1) process is going to always be within 5 \sigma^* of the mean[^1], I can define \sigma_{ol} = \dfrac{5 \sigma_l}{\mu_{m}\sqrt{1 - \Phi^2}}, bound it to be between 0 and 1, and give it a beta prior distribution.
[^1]: I know this it not always true, but I believe it may be a good enough approximation -I may be wrong though-.
If I give a prior and bounds to \mu_m, \Phi and \sigma_{ol}, does Stan automatically sample \sigma_l in such a way that all bounds are respected? Or as \sigma_l is not upper bounded it will be able to get \infty values despite all the specified priors?
I do not understand why when Stan tries to initialize \sigma_{ol}, it is producing values larger than 1, which in my mind should be impossible as the variable is bounded between 0 and 1, and assigned a beta prior. What have I not understood here? [I know the answer would be _so many things_ , but I would really appreciate a minimal set here :D]
Also, as I am giving a prior to \sigma_{ol} instead of to \sigma_{l} I am including the determinant of the jacobian of the inverse transformation. Is that the correct procedure? Because in some cases, the models seem to initialize if I don’t include the determinant, which seems counterintuitive to me.
I hope to have exposed my case clearly, but if not, please feel free to require as many clarifications as needed.
I really appreciate any help and hint that you can give me.