I have a logisitic (binary) model with two levels of clustering - group and individual. The group level random effects are actually time-specific group effects, which are correlated within group. When I run a model with these group effects alone, it takes a while - maybe 20 minutes on my 2017 Macbook Pro. (That is with 24 groups and 12 time periods, so 12 effects per group.)
When I include an individual level effect (forget about introducing any correlation at the individual level), the estimation time is on the order of hours - maybe it will take over night, perhaps longer. Some individuals have data for a single time period, whereas others might have up to 20 periods. There are 5000+ individuals, I imagine that is a problem. I also saw from the diagnostics that the sampling wasn’t particularly stable for the variance of the individual effects.
Here’s my code - and if anyone can see something obvious, let me know. Or if it is what it is, that would be good to know as well. (And if you need more information, let me know.)
data {
int<lower=1> I; // number of unique individuals
int<lower=1> N; // number of records
int<lower=1> K; // number of predictors
int<lower=1> J; // number of clusters
int<lower=0> T; // number of periods
int<lower=1,upper=I> ii[N]; // id for individual
int<lower=1,upper=J> jj[N]; // group for individual
int<lower=1,upper=T> tt[N]; // period of indidvidual
matrix[N, K] x; // matrix of predictors
int<lower=0,upper=1> y[N]; // vector of outcomes
}
parameters {
vector[K] beta; // model fixed effects
real<lower=0> sigma_S; // site variance (sd)
real<lower=-1,upper=1> rho; // correlation
real<lower=0> sigma_I; // individual level varianc (sd)
vector[T] ran_S[J]; // site level random effects (by period)
vector[I] ran_I; // individual level random effects
}
transformed parameters{
cov_matrix[T] Sigma;
vector[N] yloghat;
vector[T] mu_S = rep_vector(0, T); // mod
// Random effects with exchangeable correlation
real sigma_S2 = sigma_S^2;
real rho_sigma_S2 = rho * (sigma_S^2);
for (j in 1:(T-1))
for (k in (j+1):T) {
Sigma[j,k] = rho_sigma_S2;
Sigma[k,j] = Sigma[j, k];
}
for (i in 1:T)
Sigma[i,i] = sigma_S2;
//
for (i in 1:N)
yloghat[i] = x[i]*beta + ran_S[jj[i], tt[i]] + ran_I[ii[i]];
}
model {
sigma_I ~ exponential(0.25);
sigma_S ~ exponential(0.25); // mod
rho ~ uniform(-1, 1);
ran_S ~ multi_normal(mu_S, Sigma); // mod: vectorized
ran_I ~ normal(0, sigma_I);
y ~ bernoulli_logit(yloghat);
}
generated quantities {
real sigma_site2;
real sigma_ind2;
sigma_site2 = sigma_S^2;
sigma_ind2 = sigma_I^2;
}