Modelling individual specific error-terms in three-level multilevel model


Currently, I am constructing a three-level model which looks like this: y_{ijt} = \beta_{0ij} + x_{ijt}'\beta_{fixed} + \epsilon_{ijt}. I constructed the model with the assumption that \varepsilon_{ijt} \sim N(0,\sigma^2), but now I want to model the variance \sigma^2 of the error-term \varepsilon_{ijt} to be i and j specific, so \sigma_{ij}^2. I do this by using the following model:

data {
  // Define variables in data
  int<lower=1> error_index[Ni];
  // Continuous outcome
  int y[Ni];

parameters {
  // Define parameters to estimate
  // Population intercept (a real number)
  real beta_0;
  // Fixed effects
  vector[p] beta;
  // Level-1 errors
  real<lower=0> sigma_e0[Npars];
  // Level-2 random effect
  real u_0jk[Npars];
  real<lower=0> sigma_u0jk;
  // Level-3 random effect
  real u_0k[Nk];
  real<lower=0> sigma_u0k;

transformed parameters  {
  // Varying intercepts
  real beta_0jk[Npars];
  real beta_0k[Nk];
  // Individual mean
  real mu[Ni];
  // Varying intercepts definition
  // Level-3 (10 level-3 random intercepts)
  for (k in 1:Nk) {
    beta_0k[k] = beta_0 + u_0k[k];
  // Level-2 (100 level-2 random intercepts)
  for (j in 1:Npars) {
    beta_0jk[j] = beta_0k[countryLookup[j]] + u_0jk[j];
  // Individual mean
  for (i in 1:Ni) {
    mu[i] = beta_0jk[countryMonthLookup[i]] + fixedEffectVars[i,]*beta;

model {
  // Prior part of Bayesian inference
  // Random effects distribution
  u_0k  ~ normal(0, sigma_u0k);
  u_0jk ~ normal(0, sigma_u0jk);
  sigma_u0jk ~ student_t(3, 0, 10);
  sigma_u0k ~ student_t(3, 0, 10);

  sigma_e0 ~ student_t(3, 0, 141);
    // Likelihood part of Bayesian inference
  for (i in 1:Ni) {
    y[i] ~ normal(mu[i], sigma_e0[error_index[i]]);

However, my model now takes hours more of computation time than when modelling only a constant variance \sigma^2 for the error-terms (replacing sigma_e0[error_index[i]] by just sigma_e0). Next to that, I get the message that all transitions after warm-up exceeded the maximum treedepth (10). Does anyone know how to solve these problems. Next to that I do not set priors on the covariance of the error-terms but that should not matter much, right?