Debugging `gp_matern32_cov`

As part of using Gaussian processes on graphs (cf. Scalable Gaussian process inference), I would like to evaluate the Matern 3/2 covariance function. Unfortunately, I receive an error message gp_matern32_cov: length scale is nan, but must be not nan!, but the length scale is definitely not nan as I have hard-coded it to be 0.7.

I’ve tried to produce a minimal reproducible example here, but it is not very minimal because even slight changes to the code result in the error message disappearing. Unfortunately, in my actual code, I need to use the form that results in an error. The code is shown below. In short, it iterates over nodes in a directed acyclic graph and evaluates the covariance between parents or predecessors of each node.

transformed data {
  // Dimensions of the covariate space.
  int p = 1;
  // Number of nodes in the graph.
  int num_nodes = 10;

  // Dummy locations for each node.
  array[num_nodes] vector[p] x = rep_array(zeros_vector(p), num_nodes);

  
  // Number of edges in the graph. 
  int num_edges_ = (num_nodes * (num_nodes - 1)) %/% 2;
  // Edge list of {{parents ...}, {children ...}}.
  array[2, num_edges_] int edges;
  // Number of outgoing edges towards parents for each node.
  array [num_nodes] int degrees;

  // Create a complete directed acyclic graph and store the degrees.
  int edge = 1;
  for (i in 1 : num_nodes) {
    degrees[i] = i - 1;
    for (j in 1 : i - 1) {
      edges[1, edge] = j;
      edges[2, edge] = i;
      edge += 1;
    }
  }
}

generated quantities {
  int offset_ = 1;
  // Iterate over all nodes.
  for (i in 1 : size(x)) {
    // Get the parents of the node.
    int k = degrees[i];
    array [k] int predecessors = segment(edges[1], offset_, k);
    // Nothing to be done here if there are no parents.
    if (k == 0) {
        continue;
    }
    // Print out the variables for debugging.
    print(
      "k=", k,
      "; x=", x,
      "; predecessors=", predecessors,
      "; x[predecessors]=", x[predecessors]
    );
    // Evaluate the covariance.
    matrix[k, k] cov22 = gp_matern32_cov(x[predecessors], 1.3, {0.7});
    // Print that we've made it past the evaluation.
    print("ok");
    offset_ += k;
  }
  real success = 1;
}

The model manages to get to the end of the loop but then fails on the last iteration. Here’s the output.

method = sample (Default)
  sample
    num_samples = 1
    num_warmup = 1
    save_warmup = 0 (Default)
    thin = 1 (Default)
    adapt
      engaged = 1 (Default)
      gamma = 0.05 (Default)
      delta = 0.8 (Default)
      kappa = 0.75 (Default)
      t0 = 10 (Default)
      init_buffer = 75 (Default)
      term_buffer = 50 (Default)
      window = 25 (Default)
      save_metric = 0 (Default)
    algorithm = fixed_param
    num_chains = 1 (Default)
id = 1 (Default)
data
  file = /tmp/tmppmvkzr3_/fyxjwv2b.json
init = 2 (Default)
random
  seed = 15348
output
  file = /tmp/tmppmvkzr3_/matern-error-directnqpdttga/matern-error-direct-20240614200649.csv
  diagnostic_file =  (Default)
  refresh = 100 (Default)
  sig_figs = -1 (Default)
  profile_file = profile.csv (Default)
  save_cmdstan_config = 0 (Default)
num_threads = 1 (Default)
Iteration: 1 / 1 [100%]  (Sampling)
k=1; x=[[0],[0],[0],[0],[0],[0],[0],[0],[0],[0]]; predecessors=[1]; x[predecessors]=[[0]]
ok
k=2; x=[[0],[0],[0],[0],[0],[0],[0],[0],[0],[0]]; predecessors=[1,2]; x[predecessors]=[[0],[0]]
ok
k=3; x=[[0],[0],[0],[0],[0],[0],[0],[0],[0],[0]]; predecessors=[1,2,3]; x[predecessors]=[[0],[0],[0]]
ok
k=4; x=[[0],[0],[0],[0],[0],[0],[0],[0],[0],[0]]; predecessors=[1,2,3,4]; x[predecessors]=[[0],[0],[0],[0]]
ok
k=5; x=[[0],[0],[0],[0],[0],[0],[0],[0],[0],[0]]; predecessors=[1,2,3,4,5]; x[predecessors]=[[0],[0],[0],[0],[0]]
ok
k=6; x=[[0],[0],[0],[0],[0],[0],[0],[0],[0],[0]]; predecessors=[1,2,3,4,5,6]; x[predecessors]=[[0],[0],[0],[0],[0],[0]]
ok
k=7; x=[[0],[0],[0],[0],[0],[0],[0],[0],[0],[0]]; predecessors=[1,2,3,4,5,6,7]; x[predecessors]=[[0],[0],[0],[0],[0],[0],[0]]
ok
k=8; x=[[0],[0],[0],[0],[0],[0],[0],[0],[0],[0]]; predecessors=[1,2,3,4,5,6,7,8]; x[predecessors]=[[0],[0],[0],[0],[0],[0],[0],[0]]
ok
k=9; x=[[0],[0],[0],[0],[0],[0],[0],[0],[0],[0]]; predecessors=[1,2,3,4,5,6,7,8,9]; x[predecessors]=[[0],[0],[0],[0],[0],[0],[0],[0],[0]]
Exception: gp_matern32_cov: length scale is nan, but must be not nan! (in 'matern-error-direct.stan', line 27, column 4 to column 70)
 Elapsed Time: 0 seconds (Warm-up)
               0 seconds (Sampling)
               0 seconds (Total)

Surprisingly, making minor changes to the code results in it succeeding. E.g.,

  • gp_matern52_cov works if used instead of gp_matern32_cov (see here).
  • using a scalar length scale 0.7 instead of {0.7} works (see here).
  • increasing the dimensions to p = 2 works on my laptop (macOS 14.4.1, M1 silicon) (see here), but it doesn’t work on Linux.

A few runs of the models are available here.

Do you know what might be happening here? Thank you for the time!

It looks like there might be a typo here: math/stan/math/prim/fun/gp_matern32_cov.hpp at develop · stan-dev/math · GitHub

I think that needs to be l_size, not x_size? Not sure if that is the cause here

That’s the bug, thank you for the fast reply! I’ve verified it disappears by patching my local copy of the code. I was about to send a PR but then realized that my knowledge of C++ testing is rather limited.

2 Likes

I’ll patch it come Monday. Nice catch, thanks for the easy example

2 Likes