Persistent divergent transitions in simple model

(This refers to the same model mentioned in “Error codes from rstan::optimzing”.) I have a fairly simple model that has been giving me a lot of divergent transitions, and I can’t seem to make them go away, despite

• cranking up adapt_delta to 0.9999,
• doing a lot of reparameterization,
• modifying priors in various ways.

The model is basically this:

\mu_t = a + b \cdot t
a_t \sim \mbox{AR(1) process}
y_t = \mu_t + a_t \cdot \mu_t ^ x

All of x, a, b, and the AR(1) parameters are being estimated, with x restricted to the interval (0,1) and the autoregressive coefficient \phi restricted to (0,1). The parameterization used is a, b, \mathrm{logit}(\phi), \mathrm{logit}(x), and \log(\sigma_{\epsilon} \cdot \mu_u^x), where u is the midpoint of the time series.
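To make the parameterization concrete, here is a minimal Python sketch of the mapping between the sampled quantities and the natural parameters. It is not taken from linear-ar1.stan; the function names and the `t_mid` argument are hypothetical, and the code only restates the definitions above.

```python
import math

def inv_logit(z):
    """Map a real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Map a probability in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))

def to_natural(a, b, logit_phi, logit_x, log_scaled_sigma, t_mid):
    """Recover (phi, x, sigma_eps) from the sampled quantities.

    log_scaled_sigma is log(sigma_eps * mu_u**x), where mu_u = a + b * t_mid
    is the trend evaluated at the midpoint of the series.
    """
    mu_mid = a + b * t_mid
    if mu_mid <= 0.0:
        # mu_u**x and its logarithm are only defined for a positive trend.
        raise ValueError("mu at the midpoint must be positive")
    phi = inv_logit(logit_phi)
    x = inv_logit(logit_x)
    sigma_eps = math.exp(log_scaled_sigma) / mu_mid ** x
    return phi, x, sigma_eps

def to_unconstrained(a, b, phi, x, sigma_eps, t_mid):
    """Inverse of to_natural for the three transformed parameters."""
    mu_mid = a + b * t_mid
    return logit(phi), logit(x), math.log(sigma_eps * mu_mid ** x)
```

Note that the third quantity depends on a and b through mu_u, so it is only well defined where a + b * t_mid is positive.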

Besides my original data set, I’ve been testing it on synthetic data generated from the intended model.
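For readers who want to reproduce the synthetic-data check, a simulator for the model as written above could look roughly like this. It is a sketch, not the script used in this thread: the function name, the time index t = 1..n, and the choice to start the AR(1) process from its stationary distribution are all assumptions.

```python
import math
import random

def simulate(n, a, b, phi, x, sigma_eps, seed=0):
    """Draw y_1..y_n from y_t = mu_t + a_t * mu_t**x with mu_t = a + b * t."""
    rng = random.Random(seed)
    # Assumption: the AR(1) state starts in its stationary distribution,
    # which has standard deviation sigma_eps / sqrt(1 - phi^2).
    ar = rng.gauss(0.0, sigma_eps / math.sqrt(1.0 - phi ** 2))
    ys = []
    for t in range(1, n + 1):
        if t > 1:
            # One-step AR(1) update with innovation sd sigma_eps.
            ar = phi * ar + rng.gauss(0.0, sigma_eps)
        mu = a + b * t
        if mu <= 0.0:
            raise ValueError("mu_t must be positive for mu_t**x to be defined")
        ys.append(mu + ar * mu ** x)
    return ys

# Hypothetical parameter values, chosen only for illustration.
y = simulate(n=200, a=5.0, b=0.1, phi=0.8, x=0.5, sigma_eps=0.2, seed=1)
```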

Here is a pairs plot from a run that used iter=4000, adapt_delta=0.9999, and max_treedepth=15. I’ve managed to reduce posterior correlations to moderate levels. There is a bit of funnel behavior in the interaction between b and logit_phi, but it’s not all that strong, and there are plenty of divergent transitions out in the broad part. Any suggestions?

Dear Kevin,

could you post the Stan model, data and calling instructions? That would be so helpful in analyzing the problem.

Here they are.
linear-ar1.stan is the stan model file.
example-data.csv is the data.
example.R is a (simplified) script to run the model.
linear-ar1.stan (2.3 KB)
example.R (1.1 KB)
example-data.csv (4.9 KB)

wouldn’t need that. I think it’s the preferred way; see the AR(1) section in Stan’s user manual for details.

One thing that really needs improvement is the declaration

real a;
real b;

given the following statement:

real log_mu_mid = log(a + tmid * b);

The argument of log must be positive, so you have to specify:

real<lower=0> a;
real<lower=0> b;

Due to time restrictions, I’m unable to fully check your Jacobian adjustments. I believe the model needs to be slightly modified to follow the approach in the Stan user manual.

Thanks for catching the missing restrictions on a and b. Once I put those in, the divergent transitions went away.

BTW, I think what you’re seeing as unusual in the model is what I had to do to allow for missing data. The zero-centered AR(1) equation to use when there are no data points between t and t+n is

a_{t+n} \sim \mathrm{normal}(\phi^n a_t, \sigma_n),
\sigma_n = \sigma_{\epsilon} \cdot \left(\sum_{i=0}^{n-1} \phi^{2i}\right)^{1/2}.
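As a sanity check on the gap formula, the following Python sketch computes the conditional mean and standard deviation from the expressions above and compares them with the result of applying the one-step AR(1) update n times. The function names are hypothetical and the numbers are arbitrary.

```python
import math

def gap_mean_sd(a_t, n, phi, sigma_eps):
    """Conditional mean and sd of a_{t+n} given a_t for a zero-centered AR(1)."""
    mean = phi ** n * a_t
    sd = sigma_eps * math.sqrt(sum(phi ** (2 * i) for i in range(n)))
    return mean, sd

def gap_mean_sd_iterated(a_t, n, phi, sigma_eps):
    """Same quantities, obtained by applying the one-step update n times."""
    mean, var = a_t, 0.0
    for _ in range(n):
        mean = phi * mean
        # Each step shrinks the accumulated variance by phi^2
        # and adds one fresh innovation.
        var = phi ** 2 * var + sigma_eps ** 2
    return mean, math.sqrt(var)

# The two computations should agree up to floating-point error.
print(gap_mean_sd(1.5, 4, 0.8, 0.2))
print(gap_mean_sd_iterated(1.5, 4, 0.8, 0.2))
```

For |\phi| < 1 the sum is geometric, so it can also be written as (1 - \phi^{2n}) / (1 - \phi^2), which avoids the loop when n is large.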

Are you trying to fit a stochastic-trend AR(1) process?

I ask because the x parameter might affect your model fit.