Multivariate probit model: Cannot compute ELBO using the initial variational distribution

Hello, I am currently implementing a multivariate probit model, where I take the Stan user’s guide as the main source:

The difference within my model is that the regressors can have different values per equation, resulting in a 2-dimensional matrix of regressors, where the variables are stacked horizontally. Such that the first J columns represent the entries for the first variable for all groups, and so on. This allows for some adjustments, which I’m unsure if done correctly/efficiently.

I’m currently getting the error message that it cannot start sampling with the current initialization values. After this, I changed the init parameter to a lower value. Whereafter I obtain the error that the ELBO cannot be computed.

I suppose the following code blocks are mostly causing the problem. But to be sure, the whole .stan file is attached! Also, the z variable is defined exactly as in the source provided above. I have the feeling this is causing the problem, but as I replicated the code (to some extent), I do not know if this can be the exact cause.

parameters {
  matrix[J, K] beta;                                //Model coefficients
  cholesky_factor_corr[J] L_Omega;                  //Cholesky factor of correlation matrix (no scaling necessary for identification)
  vector<lower=0>[N_pos] z_pos;                     //z-variables with positive value --> y=1
  vector<upper=0>[N_neg] z_neg;                     //z-variables with negative value --> y=0
}

transformed parameters {
  matrix[N,J] z;                                    //Initialization of the z variable
  for (n in 1:N_pos) {
    z[n_pos[n], j_pos[n]] = z_pos[n];               //Restricting z variables corresponding to y=1 to be positive
  }
  for (n in 1:N_neg) {                              //Restricting z variables corresponding to y=0 to be negative
    z[n_neg[n], j_neg[n]] = z_neg[n];
  }
}

model {
  //Priors:
  L_Omega ~ lkj_corr_cholesky(2);
  for (j in 1:J){
    beta[j] ~ normal(0,1);
  }
  
  //likelihood:
  for(n in 1:N){
    array[J] vector[K] x_j;
    for(j in 1:J){
      int start_col = (j-1)*K+1;
      int end_col = j*K;
      x_j[j] = x[n, start_col:end_col]';
    }
    vector[J] z_n;
    for(j in 1:J){
      z_n[j] = z[n,j];
    }
    vector[J] mu_n;
    for(j in 1:J){
      mu_n[j] = beta[j] * x_j[j];
    }
    z_n ~ multi_normal_cholesky(mu_n, L_Omega);
  }
}

Help would be much appreciated, thanks in advance!

model1.stan (3.0 KB)

1 Like