Changing parameters to local parameters causes initialisation errors

Hi all,
I’m trying to speed up a model that I am writing (I’m aware this is a model that can be fitted already using existing code, however I’m writing this as a sub-model that will later be used as a block in the full model).

I want to move the normally distributed zero mean random effects (labelled b2 and b3) from the parameters block to be local parameters in the model block - so that they are not recorded at each iteration which might speed up model fit?

However, when I move the random effects, the model cannot initialise (an error message is printed - “Chain 1: Rejecting initial value:
Chain 1: Error evaluating the log probability at the initial value.
Chain 1: Exception: normal_lpdf: Location parameter[1] is nan, but must be finite! (in ‘string’, line 71, column 2 to column 119)”

I’ve tried a few different specifications but get a similar message to this each time.
The model runs (slowly) if b2 and b3 are in the parameter block

Any tips on how to correctly move these matrices of random effects (b2 and b3) to be local only, so that they are not returned would be really helpful.

Aside - I have tried editing brms code to edit fast running code rather than writing from scratch, however I need the code to remain flexible in the number of possible random effects at each level, and the brms code seems tailored to a specific number of random effects?

Model below, moved lines have been commented out. I’m running the model using rstan.

functions{
}
data {
  //dimensions of data
  int<lower=1> M;      //number of observations
  int<lower=1> N;      //number of individuals
  int<lower=1> K;      //number of studies
  int<lower=1> pl;     //number long fixed
  int<lower=1> ql;     //number long indrand
  int<lower=0> rl;     //number long studrand

  //id variables
  int<lower=1> idnum[M];   //long ind id variable
  int<lower=1> studnum[M]; //long stud id variable

  //longitudinal outcome information
  vector[M] y1;         //longitudinal outcome

  //longitudinal fixed information
  matrix[M,pl] X_l;            //design matrix long fixed
  real<lower=0> beta_l_sig;    //sd for fixed effects prior
  real<lower=0> sigma_e_sig;   //involved in prior for sigma_e - sig for longerror
  real<lower=0> sigma_e_nu;    //involved in prior for sigma_e - sig for longerror

  //longitudinal indrand information
  matrix[M,ql] Z2_l;          //design matrix long indrand
  vector[ql] b2_ones;         //vector of ones for use indrand
  real<lower=0> tau_Dl_sig;   //involved in prior for tau_Dl - sd for indrand
  real<lower=0> tau_Dl_nu;    //involved in prior for tau_Dl - sd for indrand
  real Omega_Dl_eta;          //involved in prior for Omega_Dl - corr mat for indrand

  //longitudinal studrand information
  matrix[M,rl] Z3_l;          //design matrix long studrand
  vector[rl] b3_ones;         //vector of ones for use studrand
  real<lower=0> tau_Al_sig;   //involved in prior for tau_Al - sd for studrand
  real<lower=0> tau_Al_nu;    //involved in prior for tau_Al - sd for studrand
  real Omega_Al_eta;          //involved in prior for Omega_Al - corr mat for indrand
}
transformed data{
  vector[ql] mu_b2;     //mean vector for longindrand
  vector[rl] mu_b3;     //mean vector for longstudrand

  mu_b2 = rep_vector(0,ql);  //used for mean vector for long indrand
  mu_b3 = rep_vector(0,rl);  //used for mean vector for long studrand
}
parameters{
  vector[pl] beta_l;      //fixed effects for longitudinal component
  real<lower=0> sigma_e;  //longitudinal error term (sd)

  //matrix[N,ql] b2;                      //longitudinal indrand effects
  cholesky_factor_corr[ql] Omega_Dl;    //Correlation Matrix long indrand
  vector<lower=0>[ql] tau_Dl;           //variance terms long indrand

  //matrix[K,rl] b3;                     //longitudinal studrand effects
  cholesky_factor_corr[rl] Omega_Al;   //Correlation Matrix long indrand
  vector<lower=0>[rl] tau_Al;          //variance terms long indrand
}
transformed parameters{
  //vector[M] linpred_y1;  //longitudinal linear predictor
  //matrix[ql,ql] Dl;     //covariance matrix for long indrand
  //matrix[rl,rl] Al;     //covariance matrix for long studrand

  //linpred_y1 = X_l*beta_l + (Z2_l .* to_matrix(b2[idnum,]))*b2_ones+ (Z3_l .* to_matrix(b3[studnum,]))*b3_ones;
  //Dl = diag_pre_multiply(tau_Dl, Omega_Dl);
  //Al = diag_pre_multiply(tau_Al, Omega_Al);
}
model{
  matrix[ql,ql] Dl = diag_pre_multiply(tau_Dl, Omega_Dl);
  matrix[rl,rl] Al = diag_pre_multiply(tau_Al, Omega_Al);
  matrix[N,ql] b2;
  matrix[K,rl] b3;

  //Priors
  tau_Dl ~ student_t(tau_Dl_nu,0,tau_Dl_sig);     //prior for sig for long indrand
  Omega_Dl~ lkj_corr_cholesky(Omega_Dl_eta);               //prior for corr matrix for long indrand

  tau_Al ~ student_t(tau_Al_nu,0,tau_Al_sig);   //prior for sig for long studrand
  Omega_Al~ lkj_corr_cholesky(Omega_Al_eta);       //prior for corr matrix for long studrand

  beta_l  ~ normal(0,beta_l_sig);                    //prior for fixed effects
  sigma_e ~ student_t(sigma_e_nu,0,sigma_e_sig);     //prior for longitudinal error term

  //Likelihood
  y1 ~ normal(X_l*beta_l + (Z2_l .* to_matrix(b2[idnum,]))*b2_ones+ (Z3_l .* to_matrix(b3[studnum,]))*b3_ones,sigma_e);     //update longitudinal outcome
  for(i in 1:N){
   b2[i] ~ multi_normal_cholesky(mu_b2, Dl); //update long indrand
  }
  for(j in 1:K){
    b3[i] ~ multi_normal_cholesky(mu_b3, Al); //update long studrand
  }
}
//generated quantities{
  //}

Any advice greatly appreciated

I don’t think that what you desire is possible. You can declare local variables in the model block, but not when these variables are the variables by which your model is parameterized. That is, the only variables that you can declare locally are those that occur on the left-hand-side of an equals sign. Stan needs to know that you wish to construct a probability density parameterized by b2 and b3, and so these need to be declared as parameters.

FWIW, if writing these parameters to disk is substantially slowing down your model fit, then I think you must already be in a regime where your fit is pretty fast.

Also FWIW, if you’re able to use rstan rather than cmdstan then you can avoid writing unwanted parameters to disk via the pars argument. I think (?) that the development branch of rstan matches most or all of cmdstan’s features at this point.