Hi Stan Users,
I’m attempting to apply Stan to sampling a nonlinear inverse problem (below) and am running into some issues during warmup. Specifically, my stepsize drops to a super low value during warmup (e.g. 1e-5), leading to high autocorrelation and poor mixing. Skipping adaptation and manually setting the stepsize to something a bit bigger (~ 0.01) produces fairly accurate posterior distributions for a synthetic dataset, but creates a number of divergences (somewhere around 15%).
My intuition is that there must be some pathological areas in my posterior which are forcing the warmup to take a small stepsize, but I haven’t been able to figure out which parameters are the main culprits (and how to reparameterize them if needed). I’ve tried non-centered parameterizations for all of the terms in the below model but keep running into the same issue. Does anyone have any ideas for things I can modify to help improve the warmup/sampling? I’ve attached a pairs plot from a model run with a bigger stepsize and included my model below.
Thanks for the help!
Cheers,
Dorran
Problem:
- Basic Tarantola (2005) style inverse problem where we have some observed data, d_o, that is related to our model parameters by some nonlinear function, g, such that g(m) = d. In this case our data is observations of surface uplift rate, and our function g(m) takes the geometry of a geologic fault, m, and predicts the rate at which the surface would be uplifted for that geometry given a known horizontal velocity.
Model:
// Not including the forward model function "predict_uplift", since it's a bit long
data{
int nobs;
int nx;
int n_segments;
vector[nx] x;
vector[nx] z;
vector[nobs] x_obs;
vector[nobs] z_obs;
vector[nobs] u;
vector[1] gamma_guess;
real min_theta;
real max_theta;
real theta_prior_scale;
real theta_prior_location;
real convergence;
real x1u;
real z1u;
}
transformed data {
real x_r_placeholder[1];
int x_i_placeholder[1];
real x_max;
// Model Width
x_max = max(x);
// Placeholders for required inputs for the algebra solver
x_r_placeholder[1] = 0.0;
x_i_placeholder[1] = 1;
}
parameters {
// The model domain for our fault system looks something like:
//
//Data: ^ Rates of Uplift (mm/yr)
// | ^ |
// | ^ | | ^ ^
// X-o------------------------------------------------------------X (x=max(x),d=0) (earth's surface)
// | \ segment 1 w/ inclination theta[1], length[1]
// | \
// | \ theta[2], length[2]
// | o-----o
// | \ theta[3], length[3] <---- motion along fault surface (mm/yr)
// | \
// | \ theta[4], length[4] |
// | o----------------------------------------------o fault
// x
// Our model has three sets of parameters
// 1. theta -> Dip Angles (radians) of each fault segment
// 2. w -> horizontal length of each fault segment (defined as a fraction of our model length)
// 3. Modelization Uncertainty
vector[n_segments] theta;
simplex[n_segments] w_raw;
real<lower=0> sigma_modelization_raw;
}
transformed parameters {
vector[n_segments] w;
vector[n_segments] lengths;
real<lower=0> sigma_modelization;
// Scale widths from fraction to true distances
w = w_raw * x_max;
// Compute fault length from fault dip and widths
lengths = w ./ cos(theta);
// Non-centered parameterization adjustments
sigma_modelization = sigma_modelization_raw*5;
}
model {
vector[nobs] upred;
// Run forward model calculation to convert parameters to our observed data (uplift rates)
upred = predict_uplift(x,z,x_obs,z_obs,nobs,n_segments,x1u,z1u,convergence,
theta,w,gamma_guess,x_r_placeholder,x_i_placeholder);
upred = upred*1000; // m/yr -> mm/yr conversion
// ----------------------- MODEL PRIORS -------------------------------
// Inclination of each segment cannot exceed a range of [-1deg,29deg], otherwise our forward calculation will crash and return -inf log-probability
for (i in 1:n_segments) {
theta[i] ~ normal(theta_prior_location, theta_prior_scale) T[min_theta,max_theta];
}
target += dirichlet_lpdf(w_raw | rep_vector(1,n_segments));
target += normal_lpdf(sigma_modelization_raw | 0,1);
// ----------------------- Log-Likelihood -----------------------------
target += normal_lpdf(u | upred,sigma_modelization);
}
'''