Dear Stan community,
I’m struggling to fit a model in which linear models are hierarchically attached to parameters of a nonlinear model. The issue is that the model sometimes runs smoothly and produces sensible estimates, but sometimes one or two chains get stuck and hit max treedepth and/or produce some divergent transitions. I’ve tried to employ non-centered parameterizations for the linear models and used prior predictive simulations to inform prior choice, but sampling is still unstable. I’m not sure what to do (or what can be done) now, so I wanted to ask for advice/help here.
Here is a slightly simplified version of the model:
```stan
data {
  int<lower=1> n;                  // number of observations
  vector<lower=0, upper=1>[n] w;   // predictor for nonlinear parameters (scaled to 0--1)
  array[n] int<lower=0> x;         // predictor for outcome
  array[n] int<lower=0> y;         // outcome
}
transformed data {
  vector<lower=0>[n] w2 = square(w);
}
parameters {
  // linear model for L
  real intercept_L;
  real beta_L1;
  real beta_L2;
  real<lower=0> sigma_L;
  // linear model for A
  real intercept_A;
  real beta_A;
  real<lower=0> sigma_A;
  // nonlinear model for y
  real<lower=0> phi;
  // z-scores for the non-centered parameterization
  vector[n] z_L;
  vector[n] z_A;
}
model {
  vector[n] mu;
  // priors
  intercept_L ~ normal(4, 2);
  beta_L1 ~ normal(0, 2);
  beta_L2 ~ normal(0, 2);
  sigma_L ~ exponential(1);
  intercept_A ~ normal(0, 1);
  beta_A ~ normal(0, 1);
  sigma_A ~ exponential(1);
  1 / phi ~ exponential(1);
  target += -2 * log(phi);  // Jacobian adjustment for the 1/phi change of variables
  z_L ~ normal(0, 1);
  z_A ~ normal(0, 1);
  // non-centered linear models on the log scale
  vector[n] log_L = intercept_L + beta_L1 * w + beta_L2 * w2 + z_L * sigma_L;
  vector[n] log_A = intercept_A + beta_A * w + z_A * sigma_A;
  // nonlinear model for y; exp() keeps L and A positive
  for (i in 1:n) {
    mu[i] = exp(log_L[i]) / (1 + exp(log_A[i]) * x[i]);
  }
  y ~ neg_binomial_2(mu, phi);
}
```
I suspect that the sampling difficulties partially stem from the `exp()`'s, which are used to constrain L and A to be positive as the nonlinear model requires. Wrapping the linear models in `exp()` has also made it difficult to set sensible priors. I've observed that sampling can get worse after even minor prior changes that don't seem drastically different in terms of their prior predictions (e.g., widening to `intercept_A ~ normal(0, 2)`, `beta_A ~ normal(0, 2)`). Perhaps the model is too complex for the data?
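For context, my prior predictive simulation was essentially the following (sketched here in numpy for compactness rather than my actual R code; the x grid and the single w value are just illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n_draws = 1000
x = np.arange(0, 20)   # illustrative grid for the count predictor
w = 0.5                # one illustrative value of the scaled predictor

# draw from the priors
intercept_L = rng.normal(4, 2, n_draws)
beta_L1 = rng.normal(0, 2, n_draws)
beta_L2 = rng.normal(0, 2, n_draws)
sigma_L = rng.exponential(1, n_draws)
intercept_A = rng.normal(0, 1, n_draws)
beta_A = rng.normal(0, 1, n_draws)
sigma_A = rng.exponential(1, n_draws)

# non-centered log-scale linear models, then exp() to get L, A > 0
log_L = intercept_L + beta_L1 * w + beta_L2 * w**2 + rng.normal(0, 1, n_draws) * sigma_L
log_A = intercept_A + beta_A * w + rng.normal(0, 1, n_draws) * sigma_A
L, A = np.exp(log_L), np.exp(log_A)

# implied prior curves mu(x) = L / (1 + A * x), one row per prior draw
mu = L[:, None] / (1 + A[:, None] * x[None, :])
print(np.median(mu, axis=0))                 # typical prior predictive mean by x
print(np.quantile(mu[:, 0], [0.05, 0.95]))   # spread of mu at x = 0
```

The implied curves looked plausible to me under the priors above, which is why I'm puzzled that small widenings of the A priors destabilize sampling.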
I would appreciate any advice/suggestions on how I might be able to improve this model and possibly resolve my issues. Thank you very much!
System info: macOS 13.4, R v4.4.1, cmdstanr v0.8.1, CmdStan v2.35.0