Optimising stan fit for fixed effects and interactions

Hi Stan community!

I have a multivariate model that I am trying to optimise. I have a model without fixed effect strucuture with 4 random effects per model equation that takes ~40 minutes to fit with 4 chains, 2000 warmup and 3000 sampling iterations.

When I extend this model with 4 fixed effects, and 3 two-way and one three-way interaction the model fit becomes a lot slower and takes up to 5 hours. The dataset I use has ~5000 observations, and three of the fixed effects are binary that are coded as -0,5 and 0.5 and a covariate that is centred. The response variables are scaled for efficiency.

I use the following code for the fixed effects and the interactions, is there a way to optimise the code to make the fit faster? Or is this simply taking a lot of time because the model has difficulty estimating the interactions?

"data {
  // Predictors
  vector[N] fixef1; 
  vector[N] fixef2; 
  vector[N] fixef3; 
  vector[N] fixef4; 
  // Response variable
  array[N] vector[2] zs; 
}
parameters {
  // Fixed effects
  // eq1
  real mu;
  real B_fixef1;
  real B_fixef2;
  real B_fixef3;
  real B_fixef4;
  // eq2
  real mu_2; 
  real B_fixef1_2;
  real B_fixef2_2;
  real B_fixef3_2;
  real B_fixef4_2;
  // Interactions
  // eq1
  real int1;
  real int2; 
  real int3; 
  real int4;
  // eq2
  real int1_2;
  real int2_2; 
  real int3_2; 
  real int4_2;
}
transformed parameters {
  // Expected trait values for each observation
  vector[N] z_exp; 
  vector[N] z2_exp; 
  //Model equations:
  // Equation 1:
 z_exp = mu + B_fixef1*fixef1 + B_fixef3*fixef3 + 
               B_fixef2*fixef2 + B_fixef4*fixef4 + 
               int1*(fixef3.*fixef2) + 
               int2*(fixef3.*fixef4) + 
               int3*(fixef4.*fixef2) +
               int4*(fixef3.*fixef4.*fixef2); 
  // Equation 2: 
  z2_exp = mu_2 + B_fixef1_2*fixef1 + B_fixef3_2*fixef3 + 
                B_fixef2_2*fixef2 + B_fixef4_2*fixef4 + 
                int1_2*(fixef3.*fixef2) + 
                int2_2*(fixef3.*fixef4) +
                int3_2*(fixef4.*fixef2) + 
                int4_2*(fixef3.*fixef4.*fixef2);
}
model {
  // Priors
  // Fixed effects
  // eq1
  mu ~ normal(0, 1); 
  B_fixef1 ~ normal(0,1); 
  B_fixef3 ~ normal(0,1); 
  B_fixef2 ~ normal(0,1); 
  B_fixef4 ~ normal(0,1); 
  int1 ~ normal(0,1); 
  int2 ~ normal(0,1); 
  int3 ~ normal(0,1); 
  int4 ~ normal(0,1);
  // eq2
  mu_2 ~ normal(0, 1); 
  B_fixef1_2 ~ normal(0,1); 
  B_fixef3_2 ~ normal(0,1); 
  B_fixef2_2 ~ normal(0,1); 
  B_fixef4_2 ~ normal(0,1); 
  int1_2 ~ normal(0,1); 
  int2_2 ~ normal(0,1); 
  int3_2 ~ normal(0,1); 
  int4_2 ~ normal(0,1); 

  // Cholesky multinormal likelihood function
  array[N] vector[2] mus;
  for (o in 1:N) 
  mus[o] = [z_exp[o], z2_exp[o]]';
  
  zs ~ multi_normal_cholesky(mus, L_sigma);
}

HI, @CdeGroot and sorry for not getting to this sooner. It’s taken me forever to catch up on the forums after a week of vacation.

This looks incomplete—I don’t see where L_Sigma was defined.

The problems you’re running into are likely because of non-identifiability in your likelihood. I’m guessing you’ll find high posterior correlations among many of your parameters if you do a pairs plot.

What is the scale of your posteriors? Your priors are all unit scale and if the posteriors aren’t compatible with the priors, it causes the sampler to struggle (in general, the better specified a model is for the data, the faster it runs).

I would strongly urge you not to name variables int1_2—it’s just so confusable with our built-in int data type. Especially when it’s a real value! I’d also urge you to put things into arrays; rather than

  real int1;
  real int2; 
  real int3; 
  real int4;

you can use

vector[4] interval;

(Stan won’t let you name a variable int since it’s a built-in data type.)

In the definition software z_exp and z2_exp you can use a single line, as in:

vector[N] z_exp
  = mu
  + B_fixef1 * fixef1 
  + B_fixef3 * fixef3
  + ...;

It helps immensely with readability to include space around operators.