We never ended up talking about this, but I only came up with two kinda rudimentary designs:
- Pass in a “key” value to every optimization in the program. This key value is used internally for iteration-to-iteration storage.
So something like:
model {
  for (n in 1:N) {
    y[n] = optimize(..., n); // The model specifies an identifier for each optimization
  }
  ...
}
So your optimize function might look like:
auto optimize(auto a, ..., int i) {
  Eigen::VectorXd guess;
  load_guess(i, guess); // Look up the last iteration's solution for this id
  ...
  save_guess(i, guess); // Stash this solution for the next iteration
}
Where that map from ids -> initial guesses would live is up in the air, but I figure it wouldn’t be too difficult to nail it down.
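For what it’s worth, a minimal sketch of that storage, assuming it’s just a static map keyed by the id (guess_store, save_guess, and load_guess are made-up names, not anything that exists):

#include <map>
#include <Eigen/Dense>

// Hypothetical storage: optimization id -> the last iteration's solution.
static std::map<int, Eigen::VectorXd> guess_store;

void save_guess(int id, const Eigen::VectorXd& guess) {
  guess_store[id] = guess;
}

// Returns true if a stored guess of the right size exists for this id.
// On a miss, `guess` is left untouched so the caller can fall back to
// the regular initial conditions.
bool load_guess(int id, Eigen::VectorXd& guess) {
  auto it = guess_store.find(id);
  if (it == guess_store.end() || it->second.size() != guess.size())
    return false;
  guess = it->second;
  return true;
}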
The downside to this is that it’d be hard to get these indexes right if your optimizations were buried in weird places:
functions {
  vector myfunction(real a, real b, int n) { // Ugh -- have to carry index around manually
    return optimize(..., n);
  }
}
...
model {
  ...
  for (n in 1:N) {
    // What if myfunction sometimes used 2 optimizations and sometimes 1?
    y[n] = myfunction(a, b, n);
  }
  ...
}
So that’s not great (if myfunction sometimes ran two optimizations and sometimes one, a single index n couldn’t uniquely key them). But in any case, if a wrong index got passed in and the initial guess didn’t exist or wasn’t the right size, the regular initial conditions could be used.
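Concretely, with a load_guess like the sketch above, the fallback is just the default path (default_initial_conditions is an assumed helper):

Eigen::VectorXd guess = default_initial_conditions(); // assumed helper
// On a miss (unknown id or wrong-size guess), load_guess leaves the
// fallback initial conditions in place.
load_guess(i, guess);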
- The other easy option is to maintain two queues across each leapfrog step: one that gets written to (which will be read by the next leapfrog step), and one that gets read from (which was filled by the previous leapfrog step). So inside the C++ maybe you have something like:
auto optimize(auto a, ...) { // No id needed; evaluation order does the bookkeeping
  Eigen::VectorXd guess;
  pop_from_front_guess(guess); // Read the guess saved by the previous leapfrog step
  ...
  push_to_back_guess(guess); // Save this step's solution for the next one
}
In this way you assume the order and number of your Stan model function evaluations don’t change.
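A rough sketch of that bookkeeping, assuming two std::deques that get swapped between leapfrog steps (again, every name here is invented):

#include <deque>
#include <Eigen/Dense>

// Guesses produced by the previous leapfrog step, consumed in call order.
static std::deque<Eigen::VectorXd> read_guesses;
// Guesses produced by the current leapfrog step.
static std::deque<Eigen::VectorXd> write_guesses;

bool pop_from_front_guess(Eigen::VectorXd& guess) {
  if (read_guesses.empty())
    return false; // fall back to the regular initial conditions
  guess = read_guesses.front();
  read_guesses.pop_front();
  return true;
}

void push_to_back_guess(const Eigen::VectorXd& guess) {
  write_guesses.push_back(guess);
}

// Called once between leapfrog steps: this step's writes become the
// next step's reads.
void swap_guess_queues() {
  read_guesses = std::move(write_guesses);
  write_guesses.clear();
}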
Neither idea is terribly inspired, but either would probably cover the bulk of the use cases. I don’t think the second one would work with threading (the first should be doable there, just annoying), and the first probably wouldn’t interact well with MPI.
In either case the initial conditions to the model would be a bit weird. Maybe in terms of nomenclature they’d be more like fallback initial conditions.
In either case we could use the parameter packs for (de)serialization stuff to make the optimization internals look nice.
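Just to picture it, a bare-bones version of that idea with a C++17 fold expression (none of this is the actual design; all names are placeholders):

#include <vector>
#include <Eigen/Dense>

inline void serialize_one(std::vector<double>& buf, double x) {
  buf.push_back(x);
}

inline void serialize_one(std::vector<double>& buf, const Eigen::VectorXd& v) {
  buf.insert(buf.end(), v.data(), v.data() + v.size());
}

// Flatten an arbitrary mix of scalars and vectors into one flat buffer,
// e.g. for stashing an optimization's internal state between iterations.
template <typename... Args>
std::vector<double> serialize(const Args&... args) {
  std::vector<double> buf;
  (serialize_one(buf, args), ...);
  return buf;
}

// Read a fixed-size vector back out of the buffer starting at `pos`.
inline Eigen::VectorXd deserialize_vector(const std::vector<double>& buf,
                                          size_t& pos, int size) {
  Eigen::VectorXd v = Eigen::Map<const Eigen::VectorXd>(buf.data() + pos, size);
  pos += size;
  return v;
}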