Hi everyone,
I am currently running a simulation that involves fitting a hierarchical model (as part of a multilevel regression and poststratification [MRP] protocol) to a sample drawn from a simulated population. This is the relevant workflow:
- Generate the population data
- For i in 1:100
a. Draw a sample from the population data
b. Fit a hierarchical model to the sample
c. Extract estimates from model
d. Do more non-Stan related things; delete fitted Stan model - Summarize results
I am running rstan with 4 chains, each with 2000 iterations (1900 warmup, 100 sampling).
The issue is that the amount of time required for rstan to complete its sampling monotonically increases with i. Specifically, rstan’s runtime is about 10 seconds per chain when i=1, and steadily increases to 70 seconds per chain when i = 4. See the graph below, where I plotted the runtime of rstan against i. (dropped the first observation since it didn’t record the time correctly).
This is surprising, as I would expect the runtime to be independent of i. It may be relevant that I am using stratified cluster sampling in this simulation with 5 strata, 80 cluster per strata, and 400 clusters total with around 8 clusters per strata sampled; rstan did not give me issues when I ran a very similar model-fitting procedure on samples drawn from a non-complex sampling design from a separate experiment.
Why might this be happening? I can provide more details and/or try to write a reproducible example if needed. Rstan code included at the end of this post. Thank you in advance!
The data generating process depends on a continuous X_1, binary X_2, their interaction, and a cluster-specific effect. For cluster i and subject j:
\begin{align*}Y_{ij} &= \alpha_0 + \alpha_1 X_{1ij} + \alpha_{X2[ij]} + \alpha_{X1*X2[ij]}X_{1ij} + \alpha_{clus[i]} + \epsilon_{ij}\\
&X_{1ij} \sim N(0,1)\\
&X_{2ij} \sim Bernoulli(expit(0.6X_{1ij}))\\
&\alpha_{clus[i]} \sim N(0,1), \quad i \in \{1, \ldots, N\}\\
&\epsilon_{ij}\sim N(0,1)\end{align*}
Because my endgoal is to use MRP, I use a categorical version of X_1 (X1_{cat}) in the outcome model fitted by rstan:
\begin{align*}
\theta_u &= \beta_0 + \beta_{X1cat[u]} + \beta_{X2[u]} + \eta_{X12[u]} + \eta_{clus[u]}\\
Y_{ij} &\sim N(\theta_{u[ij]}, \sigma_y^2)\\
\beta_{var} &\sim Uniform(-\infty, \infty)\\
\eta_{var} &\sim N(0, \sigma^2_{var})\\
\sigma^2_{var}, \sigma^2_y &\sim N(0,1)
\end{align*}
data{
int<lower=0> Max2; // number of X1:X2 groups (2x2 = 4)
int<lower=0> Mu; // number of sampled clusters
int<lower=0> J_stan; // number of cellmeans
int<lower=0> n; // total sample size
real y[n]; // continuous outcome
vector<lower=0, upper=1>[J_stan] agroup; // X1 index at cell level
vector<lower=0, upper=1>[J_stan] x2group; // X2 index at cell level
int<lower=1, upper=Max2> ax2group[J_stan]; //A:X2 index at cell level
int<lower=1> ugroup[J_stan]; // cluster effect index at cell level
int<lower=1, upper= J_stan> cell_label[n];
}
parameters{
vector[3] beta; //global intercept, X1a, X2
vector[Max2] ax2z;
vector[Mu] u;
real<lower=0> sigmay; // sd of the observations
real<lower=0> sigma_ax2; //sd of X1:X2 interaction
real<lower=0> sigma_u; // sd of cluster effects
}
transformed parameters{
vector[Max2] alphaax2 = ax2z * sigma_ax2; // X1:X2 interaction
vector[Mu] alphau = u * sigma_u;
vector[J_stan] cellmean = beta[1] + beta[2]*agroup + beta[3]*x2group + alphaax2[ax2group]+alphau[ugroup];
}
model{
ax2z ~std_normal();
u ~ std_normal();
sigma_ax2 ~std_normal();
sigma_u ~std_normal();
sigmay ~ std_normal();
y~ normal(cellmean[cell_label], sigmay);
}