Hello!

I’m currently trying to build a model with an AR(1) time series in the parameter space, as my simpler model (with simple hierarchical pooling on each of the regression parameters) requires additional regularisation (the chains frequently disagree about the location of the posterior).
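Concretely, each regression parameter is given the same AR(1) structure (same symbols as in the Stan code below):

```latex
\beta_t = \kappa + \phi \, \beta_{t-1} + \eta_t,
\qquad \eta_t \sim \mathrm{Normal}(0, \sigma_\beta),
```

with the model parameterised by the innovations $\eta_t$ rather than by the $\beta_t$ directly.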

I am running into trouble for a number of reasons:

- Based on the advice I have seen on the various Stan discussion forums, I am parameterising this time series by the innovations. However, I have some constraints in place:
  - I have positivity constraints that are motivated by the physical nature of the process / data.
  - I have an ordering constraint on the *midpoints* of the tanh components for identifiability.
- I cannot, ahead of time, constrain the innovations to respect these constraints (at least, I do not think I can?).

As such, it is easy to generate an initial set of innovations that leads to parameter values violating the above constraints, which makes the model difficult to initialise. Specifying my own initial values doesn’t help a great deal, as (I suspect) appropriate values/proposals for the innovations are difficult to generate.
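To illustrate the problem outside Stan, here is a small Python sketch (the constants are made up for illustration, not my real priors) that draws innovations at random and counts how often the implied AR(1) path violates a positivity constraint somewhere along its length:

```python
import random

def ar_path(beta0, kappa, phi, innovations):
    """Unroll the AR(1) recursion beta_t = kappa + phi * beta_{t-1} + eta_t."""
    path = [beta0]
    for eta in innovations:
        path.append(kappa + phi * path[-1] + eta)
    return path

random.seed(1)
n_draws, violations = 200, 0
for _ in range(n_draws):
    # Hypothetical values, only roughly in the spirit of the model below.
    innovations = [random.gauss(0.0, 10.0) for _ in range(50)]
    path = ar_path(beta0=1.0, kappa=0.5, phi=0.9, innovations=innovations)
    if min(path) <= 0.0:  # positivity broken somewhere along the path
        violations += 1

print(f"{violations}/{n_draws} random innovation draws break positivity")
```

With an innovation scale that is large relative to the parameter's level, almost every unconstrained draw produces a path that dips below zero, which is exactly the initialisation failure mode I am seeing.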

The combination of this parameterisation and set of constraints is proving difficult to deal with.

I am beginning to suspect that I am pushing the limits of what this parametric model can fit, and a major reparameterization / new model is on my horizon, but I’d really like to make this one work. Are there any major suggestions for ways to constrain / parameterise the innovations differently so that I can sample this model?

```
functions {
  real ar_next(real prev, int index, vector kappa, vector phi, vector innovation) {
    return kappa[index] + (phi[index] * prev) + innovation[index];
  }
}
data {
  // Number of depths
  int<lower = 0> n_depths;
  // Number of time points
  int<lower = 0> n_times;
  // Vector of depths the measurements were taken at
  vector[n_depths] depths;
  // Matrix of measurements; each row is a set of measurements at one time point.
  // These measurements must be on the ORIGINAL scale in order for the priors
  // to make any kind of physical sense.
  matrix[n_times, n_depths] densities;
}
parameters {
  real<lower = 0> beta_zero_t0;
  real<lower = 0> beta_one_t0;
  real<lower = 0> beta_three_t0;
  real<lower = 0> beta_six_t0;
  // beta_two and beta_five need to be ordered for identifiability.
  positive_ordered[2] beta_midpoint_t0;
  // AR parameters: they are all positive (even though they _could_ be negative)
  // as there is zero empirical chance they are negative.
  vector<lower = 0, upper = 1>[6] phi;
  // AR constant, might require stabilising.
  // These processes all have positive mean, so kappa must be positive.
  vector<lower = 0>[6] kappa;
  // AR error distribution
  vector<lower = 0>[6] beta_sigma;
  // Innovation container.
  // I don't know how to make these innovations satisfy the positive_ordered
  // constraint that we require?
  matrix[n_times - 1, 6] innovations;
  real<lower = 0> sigma_curve;
}
transformed parameters {
  matrix[n_times, n_depths] fitted_values;
  real<lower = 0> beta_zero[n_times];
  real<lower = 0> beta_one[n_times];
  real<lower = 0> beta_three[n_times];
  real<lower = 0> beta_six[n_times];
  positive_ordered[2] beta_midpoint[n_times];

  beta_zero[1] = beta_zero_t0;
  beta_one[1] = beta_one_t0;
  beta_three[1] = beta_three_t0;
  beta_six[1] = beta_six_t0;
  beta_midpoint[1, 1] = beta_midpoint_t0[1];
  beta_midpoint[1, 2] = beta_midpoint_t0[2];

  for (tt in 1:n_times) {
    if (tt >= 2) {
      beta_zero[tt] = ar_next(beta_zero[tt - 1], 1, kappa, phi, innovations[tt - 1, ]');
      beta_one[tt] = ar_next(beta_one[tt - 1], 2, kappa, phi, innovations[tt - 1, ]');
      beta_midpoint[tt, 1] = ar_next(beta_midpoint[tt - 1, 1], 3, kappa, phi, innovations[tt - 1, ]');
      beta_three[tt] = ar_next(beta_three[tt - 1], 4, kappa, phi, innovations[tt - 1, ]');
      beta_midpoint[tt, 2] = ar_next(beta_midpoint[tt - 1, 2], 5, kappa, phi, innovations[tt - 1, ]');
      beta_six[tt] = ar_next(beta_six[tt - 1], 6, kappa, phi, innovations[tt - 1, ]');
    }
    fitted_values[tt, ] = (beta_zero[tt]
      - beta_one[tt] * (tanh((depths + beta_midpoint[tt, 1]) / beta_three[tt])
                        + tanh((depths + beta_midpoint[tt, 2]) / beta_six[tt])))';
  }
}
model {
  for (tt in 1:n_times) {
    // vectorwise over individuals
    densities[tt, ] ~ normal(fitted_values[tt, ], sigma_curve);
    if (tt < n_times) { // otherwise we would index out of bounds on the innovations matrix
      innovations[tt, ] ~ normal(0, beta_sigma);
    }
  }
  // AR priors
  phi ~ beta(3, 1);
  // I really have no idea what on earth this should be.
  kappa ~ normal(100, 500);
  // This might be too tight? We shall see; again, unsure as to the
  // scale of these in this time-series structure.
  beta_sigma ~ normal(0, 10);
  // Initial-value priors
  beta_zero_t0 ~ normal(1023, 2);
  beta_one_t0 ~ normal(1, 2);
  beta_three_t0 ~ normal(42, 3);
  beta_six_t0 ~ normal(45, 3);
  beta_midpoint_t0[1] ~ normal(50, 5);
  beta_midpoint_t0[2] ~ normal(150, 5);
  // It's tiny.
  sigma_curve ~ normal(0, 0.1);
}
```
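As an aside for anyone reproducing this: the transformed-parameters recursion can be mirrored outside Stan to pre-check whether a candidate set of initial innovations keeps every series positive and the two midpoints ordered. A minimal Python sketch (a hypothetical helper, not part of the attached scripts; indices are 0-based here):

```python
def ar_next(prev, index, kappa, phi, innovation):
    # Mirror of the Stan ar_next() function, with 0-based indexing.
    return kappa[index] + phi[index] * prev + innovation[index]

def check_inits(state0, kappa, phi, innovations):
    """state0 is ordered as the series are indexed in the Stan model:
    [beta_zero, beta_one, midpoint[1], beta_three, midpoint[2], beta_six].
    Returns True iff every step keeps all six series positive and the
    midpoints (positions 2 and 4) strictly ordered."""
    state = list(state0)
    for eta in innovations:  # one row of 6 innovations per time step
        state = [ar_next(state[i], i, kappa, phi, eta) for i in range(6)]
        if min(state) <= 0 or state[2] >= state[4]:
            return False
    return True
```

A candidate init could then be rejected and resampled until `check_inits` passes before being handed to Stan, although that only patches initialisation, not the underlying constraint problem.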

Unfortunately I cannot share the original data; however, I have attached an R script that generates very similar data, along with another R script containing some appropriate initial values / model-fitting code.

Some other tidbits:

- I can’t rescale the data, as the parameter estimates have physical interpretations / priors, and I’m not sure how to translate them onto a different scale given the nonlinear regression function.
- I have fitted simpler models & have the typical swath of diagnostics, if you’re interested in how I’ve ended up needing time dependency in the parameter space.

Unrelated: I’ve been using Stan for 3 years now, and 99.9% of the time it’s been great! So I’d just like to say thanks for all the work that’s gone into it!

datagenerator.R (2.0 KB)

fitter.R (1.8 KB)