Hi all,

I’m trying to fit a simple hierarchical model but keep getting messages warning about a small number of divergent transitions (~ 10) and recommending that I increase `adapt_delta`

. These divergent transitions appear below the pairs() diagonal - which the manual suggests may indicate a need for reparameterization. In addition, with just a small sample I’m getting very high run-times. I know this is a small number of divergent transition, but combined with a long runtime my hunch is that I’m just not writing the model correctly and wondering if anyone has better suggestions.

I’ve attached a reproducible example. The model is as follows:

I have i=1, \dots, n subjects over time t=1, \dots, 24. For each subject, i record measurements on 5 outcomes, \{y_{1it}, \dots, y_{5it} \}. I also have a subject-level treatment indicator, A_i. The goal is to estimate an overall treatment effect, \delta^*, as well as outcome-specific treatment effects, \{\delta_k\}_{k\in1:5}, while building in the prior belief that the outcome-level treatment effects are distributed around a common overall treatment effect.

y_{itk} \sim N \Big( \alpha_i + \beta + \delta_k A_i, \phi_k \Big) for k=1, \dots, 5

The \alpha_i are subject-level random effects. I have a common intercept \beta for now - but will let this vary with k eventually.

For priors, I have that

\delta_k \sim N( \delta^*, \phi_\delta )

With a Normal hyper-prior on \delta^* and half-Cauchy on \phi_\delta. The full Stan model is:

```
data {
int<lower=1> Npats;
int<lower=1> N;
int<lower=1> K;
vector<lower=0, upper=1>[N] A;
int<lower=0> id[N];
real y1[N];
real y2[N];
real y3[N];
real y4[N];
real y5[N];
}
parameters {
vector[K] delta_raw;
real delta_star;
real beta ;
vector[Npats] alpha;
vector<lower=0>[K] phi_y;
real<lower=0>phi_alpha;
real<lower=0>phi_delta;
}
transformed parameters{
vector[N] mu1;
vector[N] mu2;
vector[N] mu3;
vector[N] mu4;
vector[N] mu5;
vector[K] delta;
// non-centered parameterization
delta = delta_star + delta_raw*phi_delta;
// compute linear combination for each outcome model
mu1 = alpha[id] + delta[1]*A + beta;
mu2 = alpha[id] + delta[2]*A + beta;
mu3 = alpha[id] + delta[3]*A + beta;
mu4 = alpha[id] + delta[4]*A + beta;
mu5 = alpha[id] + delta[5]*A + beta;
}
model {
phi_delta ~ cauchy(0, 10000);
phi_alpha ~ cauchy(0, 10000);
phi_y ~ cauchy(0, 10000);
alpha ~ normal(0, phi_alpha);
delta_star ~ normal(0, 2000);
delta_raw ~ std_normal();
beta ~ normal(0, 10000);
y1 ~ normal(mu1, phi_y[1]);
y2 ~ normal(mu2, phi_y[2]);
y3 ~ normal(mu3, phi_y[3]);
y4 ~ normal(mu4, phi_y[4]);
y5 ~ normal(mu5, phi_y[5]);
}
```

Even if I simulate a small dataset with n=100 and t=1, \dots, 5, it takes several minutes to run. In reality I’ll have a few thousand observations and dozens of time points. It takes hours to get just 5,000 posterior samples (after 5,000 warmup). If this is just Stan being Stan, that’s fine. But if it’s due to my parameterization, I’d love to my the necessary edits to speed things up.

test_forum.r (1.4 KB) test.stan (1.1 KB)