Gaussian process predictive inference memory pressure

I recently ran into something interesting. I had coded up predictive inference for a Gaussian process using the latent variable formulation, following the Stan User's Guide:

transformed parameters {
  vector[N] f;
  {
    matrix[N, N] L_K;
    matrix[N, N] K = cov_exp_quad(x, alpha, rho);

    // add jitter delta to the diagonal for numerical stability
    for (n in 1:N) {
      K[n, n] = K[n, n] + delta;
    }

    L_K = cholesky_decompose(K);
    f = L_K * eta;
  }
}
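
For reference, the surrounding blocks follow the latent variable GP example in the user's guide; roughly:

data {
  int<lower=1> N;
  real x[N];
  vector[N] y;
}
transformed data {
  real delta = 1e-9;  // jitter added to the covariance diagonal
}
parameters {
  real<lower=0> rho;
  real<lower=0> alpha;
  real<lower=0> sigma;
  vector[N] eta;
}
model {
  rho ~ inv_gamma(5, 5);
  alpha ~ std_normal();
  sigma ~ std_normal();
  eta ~ std_normal();
  y ~ normal(f, sigma);
}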

But I made what I thought was a fairly innocent change by deleting what looked like redundant braces:

transformed parameters {
  vector[N] f;
  matrix[N, N] L_K;
  matrix[N, N] K = cov_exp_quad(x, alpha, rho);
  
  // add jitter delta to the diagonal for numerical stability
  for (n in 1:N) {
    K[n, n] = K[n, n] + delta;
  }
  
  L_K = cholesky_decompose(K);
  f = L_K * eta;
}

The two versions sample in about the same amount of time, but my altered version would get stuck after sampling completed (when, I assume, cmdstanr is writing the .csv files to disk), run out of memory, and crash. The version from the user's guide runs fine, and I notice that L_K and K are not retrievable as draws (not that I want them to be). Can someone explain what is going on here? I assume it has something to do with L_K and K, but I don't understand exactly what causes the difference.

The curly braces you deleted create a local scope: any variables declared inside them are visible only within that block and are not saved as transformed parameters. Your second version is therefore writing out L_K and K as transformed parameters, which adds 2 × N² values (N² each for K and L_K) to every draw. CmdStan streams those to disk as sampling proceeds, which is why the sampling time barely changes, but you get stuck afterwards when cmdstanr tries to read everything back from disk, because there is a huge number of quantities to read (and the CSV files will be much larger).
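
If you want to avoid the extra brace level, one alternative (a sketch, not from the manual; gp_f is just an illustrative name) is to wrap the computation in a user-defined function, since function-local variables are never written to the output either:

functions {
  vector gp_f(real[] x, real alpha, real rho, real delta, vector eta) {
    int N = num_elements(x);
    matrix[N, N] K = cov_exp_quad(x, alpha, rho);
    for (n in 1:N) {
      K[n, n] = K[n, n] + delta;  // jitter for numerical stability
    }
    // K and the Cholesky factor are discarded when the function returns
    return cholesky_decompose(K) * eta;
  }
}
transformed parameters {
  vector[N] f = gp_f(x, alpha, rho, delta, eta);  // only f is saved
}

Either way, the rule of thumb is that only variables declared at the top level of transformed parameters (and generated quantities) become columns in the output CSV.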


Great, super useful to know!