Error code 70 in cmdstanpy model

Hello. I am getting an error code 70 error in a reduce_sum implementation of a multinomial regression model.

The inputs are large-ish: 4871 observations and 1328 categories. If I downselect the data to 100 obs and 10 categories, the model samples.

If I up to 100 obs and 100 categories, I receive the error (as do I with the original data).

Is there a process to understand what the error code means?

Thanks

Hi, do you get the error at sampling stage?

Can you do this and see the corresponding stdout files. What do they say?

import os
import platform
import subprocess

def open_file(path):
    if platform.system() == "Windows":
        os.startfile(path)
    elif platform.system() == "Darwin":
        subprocess.Popen(["open", path])
    else:
        subprocess.Popen(["xdg-open", path])

open_file(cmdstanpy._TMPDIR)

So I run that function (I am on a macbook), and then run the .sample() method. A JSON file appears, then disappears when the error occurs

But the intermediate file (if I open it before it disappears) is just the data block values.

For posterity, The full model is below (it is a wallenius multivariate non-central hypergeometric likelihood)

functions {

  real compute_optimal_r(vector omega, vector m, vector y, int colors)
  {
   // find r to center peak of integrand at 0.5
   int flag = 1;
   int num_iterations = 0;
   real LN2 = 0.693147180559945309417;

   // find highest omega
   real a;
   real b;
   real lastr;
   real rrc;
   real z;
   real zd;
   real rt;
   real r2;
   real r21;
   real omax = max(omega);
   real omaxr = 1. / omax;
   vector[colors] omeg = omaxr*omega;
   real dd = sum(omeg .* (m-y));
   real dr = 1. / dd;

   real rr = omax;

   if (rr <= dr)
   {
    rr = 1.2 * dr;  // initial guess
   }
   
   // Newton-Raphson iteration to find r
   while (flag || fabs(rr-lastr) > rr * 1.E-5 )
   {
      flag = 0;
      lastr = rr;
      rrc = 1. / rr;
      z = dd - rrc;                    // z(r)
      zd = rrc * rrc;                  // z'(r)
      for (i in 1:colors) 
      {
         rt = rr * omeg[i];
         if (rt < 100. && rt > 0.) 
         {   // avoid overflow and division by 0
           if (fabs(rt) > 0.1)
           {
              r2 = exp(rt*LN2);
              r21 = 1.0 - r2;             
           }
           else 
           { // Use expm1
              r21 = expm1(rt*LN2);
              r2 = r21 + 1;                    
              r21 = -r21;                        
           }
            //r21 = pow2_1(rt, &r2);     // r2=2^r, r21=1.-2^r
            a = omeg[i] / r21;         // omegai/(1.-2^r)
            b = y[i] * a;              // x*omegai/(1.-2^r)
            z  += b;
            zd += b * a * r2 * LN2;
            //printf("z,zd,a,b,r21,r2 %lf %lf %lf %lf %lf %lf\n",z,zd,a,b,r21,r2);
         }
      }
      if (zd == 0) {
        print("can't find r in function compute_optimal_r");
        return 1.2* dr;
      }

      rr -= z / zd;                    // next r
      if (rr <= dr)
      {
        rr = lastr * 0.125 + dr * 0.875;
      }
      if (++num_iterations == 70)
      {
        print("convergence problem searching for r in function compute_optimal_r");
        return 1.2* dr;
        }

   }
   
   return rr * omaxr;

}

  real approximate_wallenus_integral(vector omega, int[] y, int[] m) 
  {
    int num_categories = size(y);
    int num_points = 250;
    vector[num_categories] vec_y = to_vector(y);  // number of balls of each category that were drawn
    vector[num_categories] vec_m = to_vector(m); // number of balls of each category in the urn

    real d = sum(omega .* (vec_m-vec_y));
    //real r = 33.0; // TODO: need to do this optimally
    real r = compute_optimal_r(omega, vec_m, vec_y, num_categories);
    real dt = 1./num_points;

    vector[num_points-1] ret_val;

    for (i in 1:num_points-1) 
    {
      ret_val[i] = (r*d-1.0)*log(i*dt) + sum(vec_y .* log1m_exp((r*omega)*log(i*dt))); 
      // should do trapz?
    }
    return (log(r) + log(d) + log_sum_exp(ret_val));
  }

  real MWNCHG_lpmf(int[] y, vector omega, int[] m) {
    // Multivariate Wallenius Non-Central Hypergeometric pmf
    // y = items drawn in each category
    // m = total number of items in each category
    // omega = weights of each category

    int num_categories = size(y);

    real theta[num_categories] = to_array_1d(omega);
    int one_array[num_categories] = rep_array(1,num_categories);

    real first_part;
    real second_part;

    first_part = sum(
      lgamma(to_vector(m)+to_vector(one_array)
        ) - lgamma(to_vector(y)+to_vector(one_array)
        ) - lgamma(to_vector(m)-to_vector(y)+to_vector(one_array))
        );
    second_part = approximate_wallenus_integral(omega,y,m);
    return  first_part + second_part;

  }

  real partial_sum(
    int[ , ] selected_slice,
    int start, 
    int end,
    data int[ , ] incount,
    vector omega
    ) 
  {
    real ret_sum = 0;
    for (n in start:end)
    {
      ret_sum += MWNCHG_lpmf(selected_slice[n] | omega, incount[n]);
    }
    return ret_sum;
    
  }
  
}

data{
  int<lower=1> num_obs; // number of observations
  int<lower=1> num_categories; // number of categories 
  int<lower=0> selected[num_obs,num_categories]; // number selected  data
  int<lower=0> incount[num_obs,num_categories]; // how many of each color
  
  int<lower=0, upper=1> run_estimation;
}

parameters {
  simplex[num_categories] omega;  // the inferred weights of the Multivariate Wallenius Non-Central Hypergeometric Distribution
}

model {
  int grainsize = 1;
  omega ~ dirichlet(rep_vector(1.0, num_categories));

  if (run_estimation==1){
    target += reduce_sum_static(partial_sum, selected, grainsize, incount, omega);
  }
}

Okay, stdout files show an initialization error. I will submit better initial values.

1 Like

error code 70 means that the inference algorithm ran into insurmountable problems and is quitting as gracefully as possible.
usually there’s an accompanying error message that is written to one of the output streams, and CmdStanPy needs to do a better job of getting this back to the user.
there are issues filed on CmdStanPy - investigate granularity of exceptions required for subprocesses as well as CmdStan return codes - are they useful? · Issue #22 · stan-dev/cmdstanpy · GitHub, chain 1 returned error code -11 · Issue #246 · stan-dev/cmdstanpy · GitHub

there’s also a docs issue to explain these error codes return codes defined and documented · Issue #228 · stan-dev/docs · GitHub - now that the CmdStan User’s Guide is online, an appendix section on error codes would be good. the full set of error codes that CmdStan could return are defined here: https://github.com/stan-dev/stan/blob/develop/src/stan/services/error_codes.hpp

how big is the stdout file? would printing the whole thing to the console be useful?

Oh I think so yes!