Testing MPI code

wds15 · November 30, 2017, 5:14pm

Hi!

As proposed I am opening a discussion on setting up the google test framework for the MPI code. The goal is to be able to test the MPI code in action. My current way of testing requires a dedicated main function which sets up and tears down the MPI environment as I need it. Steps are:

Setting up the MPI cluster (root + workers)
Deactivating output collection from the workers
Sending the workers (rank != 0 nodes) into listen mode so that they can receive and execute work
Start the RUN_ALL_TESTS macro only on the rank = 0 root node
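
The control flow of such a dedicated main can be sketched without MPI or googletest. This is only an illustration under assumptions: `mpi_test_main` and its callbacks are hypothetical names. In a real test binary the rank would come from the MPI communicator, `run_tests` would wrap RUN_ALL_TESTS(), and `listen` would be the worker loop.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-ins: `rank` would come from the MPI communicator,
// `run_tests` would wrap RUN_ALL_TESTS(), and `listen` would block in the
// worker loop until the root tells the worker to shut down.
inline int mpi_test_main(int rank,
                         const std::function<int()>& run_tests,
                         const std::function<void()>& listen) {
  if (rank != 0) {
    listen();  // workers only receive and execute work
    return 0;  // workers never report test results themselves
  }
  return run_tests();  // only the root drives the test suite
}

// Usage example with dummy callbacks in place of MPI and googletest.
inline void mpi_test_main_example() {
  int tests_run = 0;
  int listens = 0;
  auto run_tests = [&tests_run]() { ++tests_run; return 0; };
  auto listen = [&listens]() { ++listens; };

  // Simulate the four processes that "mpirun -np 4" would start.
  for (int rank = 0; rank < 4; ++rank) {
    mpi_test_main(rank, run_tests, listen);
  }
  assert(tests_run == 1);  // the suite ran on the root only
  assert(listens == 3);    // ranks 1, 2 and 3 went into listen mode
}
```

The point of the split is that the test result is decided on the root alone, while the workers just serve requests.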

I don’t think this can be achieved with the fixture concept.

A good example is the test here:

github.com

stan-dev/math/blob/feature/concept-mpi-2/test/unit/math/prim/mat/functor/map_rect_mpi_test.cpp

#include <stan/math.hpp>
#include <gtest/gtest.h>

#include <test/unit/math/prim/mat/functor/mpi_test_env.hpp>

#include <test/unit/math/prim/mat/functor/hard_work.hpp>
#include <test/unit/math/prim/mat/functor/faulty_functor.hpp>

#include <iostream>

STAN_REGISTER_MAP_RECT(1, faulty_functor)
STAN_REGISTER_MAP_RECT(2, faulty_functor)

STAN_REGISTER_MAP_RECT(0, hard_work)

struct MpiJob : public ::testing::Test {
  Eigen::VectorXd shared_params_d;
  std::vector<Eigen::VectorXd> job_params_d;
  const std::size_t N = 10;
  std::vector<std::vector<double> > x_r = std::vector<std::vector<double>>(N, std::vector<double>(1,1.0));

This file has been truncated.

To get this test running, check out that branch and follow the notes in MPI_NOTES, which sits at the top level of the branch. You also need a make/local file; mine contains

O=0
BOOST=~/local/boost_1_65_1
GTEST_CXXFLAGS+=-g -DSTAN_HAS_MPI
LDFLAGS=-L$(HOME)/local/lib -lboost_serialization -lboost_mpi
CXX=mpic++
CC=mpic++

which simply makes sure that the MPI libraries are used and linked in. The test can then be built with make as usual, but it has to be started through mpirun. So a

mpirun -np 4 ..path...to...test

will start 4 processes.

So unless we figure out a better way, my suggestion is to do the following:

introduce an additional MPI specific google test file containing the special main to be used for MPI tests
MPI tests would be called ..._mpitest.cpp which the make system recognizes and then does the right thing (special main + start tests with mpirun).

Best,
Sebastian

syclik · November 30, 2017, 5:15pm

I’ll take a look in a week. I know this is for the mechanics of including MPI code into our test framework, but mind letting me know exactly what you’d be testing with this framework?

wds15 · November 30, 2017, 5:23pm

Well, I would like to see that the mpi stuff works as I want. Example tests in that branch are:

test/unit/math/prim/mat/functor/map_rect_mpi_test.cpp
test/unit/math/rev/mat/functor/map_rect_mpi_test.cpp

There are a number of things special when we are in MPI mode:

static data is cached
exceptions are handled with much greater care
after the first successful call, all function output sizes are assumed to stay the same for every future call => I want to test that an error is raised if this does not hold
above all, I want to see that data is transferred correctly
special flags must be transferred correctly
prevent nested calls to map_rect
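
To make the output-size item concrete, here is a minimal sketch of the kind of check a test would exercise. `OutputSizeCache` is a hypothetical helper for illustration only, not the code in the branch, and the exception type is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not the actual implementation: remembers the output
// size of each job from the first call and rejects any later call whose
// sizes differ.
class OutputSizeCache {
 public:
  void check(const std::vector<std::size_t>& sizes) {
    if (!cached_) {
      sizes_ = sizes;  // first call: store the sizes
      cached_ = true;
      return;
    }
    if (sizes != sizes_) {
      throw std::domain_error("job output sizes changed between calls");
    }
  }

 private:
  bool cached_ = false;
  std::vector<std::size_t> sizes_;
};

// Usage example: identical sizes pass, a changed size raises an error.
inline void output_size_cache_example() {
  OutputSizeCache cache;
  cache.check({2, 2, 3});  // first call fills the cache
  cache.check({2, 2, 3});  // same sizes are accepted
  bool threw = false;
  try {
    cache.check({2, 1, 3});  // second job now returns a different size
  } catch (const std::domain_error&) {
    threw = true;
  }
  assert(threw);
}
```

A test for this behaviour would call the functor once with well-behaved output and then once with a changed output size, expecting the error.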

So a number of things make MPI tests certainly necessary. Not all of the above is tested yet, but this is what is on my mind.

wds15 · December 2, 2017, 12:00pm

Some progress here. I found a way to live with the current test system, provided we are willing to accept a few hacks. I introduced the file

github.com

stan-dev/math/blob/feature/concept-mpi-2/test/unit/math/prim/mat/functor/mpi_test_env.hpp

#pragma once

#include <gtest/gtest.h>

// sets up the MPI environment. All tests have to be skipped on
// non-root nodes with a
// if(rank != 0) return
// line at the test start

// moreover, the MPI tests have to be setup with the MPI_TEST_F macro
// which will disable all MPI tests whenever no STAN_HAS_MPI is
// defined

#ifdef STAN_HAS_MPI

#define MPI_TEST_F(test_fixture, test_name)     \
  TEST_F(test_fixture, test_name)

#include <stan/math/prim/arr/functor/mpi_cluster.hpp>

This file has been truncated.

which sets up a number of globals that make things work. Note that the googletest documentation recommends creating a dedicated main over defining globals. This is compromise 1. The second pill to swallow is that it is not possible to prevent tests from being executed on the non-root processes. Hence, I have to start every test with the line

if(rank != 0) return;

which disables the test on non-root processes (the non-root process is just used by the root process and tests are executed in the context of the root).

What remains to solve is:

Compiling the MPI tests still requires additional dependencies: linking against Boost MPI, Boost serialization and the base MPI libraries, and defining the STAN_HAS_MPI macro.
The test binaries are called without the mpirun wrapper, so only a single process is started. This executes the test, but no actual transfers take place (the code is run, but not in a true multi-process mode).

A solution to the above is to use the STAN_HAS_MPI macro to disable all MPI tests in case the macro is not defined. To run the MPI tests one would have to configure compiler & libraries accordingly and then just rerun the tests.

It comes down to the question of when, how often, and where the MPI tests should be run.

wds15 · December 16, 2017, 11:15am

Some progress: I figured out how we can have tests of map_rect_mpi in our test system in a smooth way. I am using compile-time switches (based on whether STAN_HAS_MPI is defined) which set up the tests slightly differently, see here.

For tests with MPI, the mpi_test_env.hpp referred to above sets up a global test environment which initializes the MPI system. Moreover, it defines an MPI_TEST_F macro which is just an alias for TEST_F, so that the tests execute normally.

On the other hand, the header acts differently when STAN_HAS_MPI is not defined. In that case, no global test environment is created and MPI_TEST_F is defined such that the test name is prefixed with DISABLED_. Doing so tells googletest not to run the given test.
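
The renaming trick can be shown in isolation. The sketch below leaves googletest out so that it compiles on its own; `MPI_TEST_NAME`, the stringify helpers, and the test name `size_mismatch` are hypothetical and only there to make the effect visible. The comment shows how the two variants of MPI_TEST_F could look with googletest included.

```cpp
#include <cassert>
#include <string>

// With googletest included, the two variants of the macro could read:
//   #define MPI_TEST_F(fixture, name) TEST_F(fixture, name)             // STAN_HAS_MPI defined
//   #define MPI_TEST_F(fixture, name) TEST_F(fixture, DISABLED_##name)  // otherwise
// The hypothetical MPI_TEST_NAME macro below isolates just the renaming.
#ifdef STAN_HAS_MPI
#define MPI_TEST_NAME(test_name) test_name
#else
#define MPI_TEST_NAME(test_name) DISABLED_##test_name
#endif

// Two-level stringification so that the argument is expanded first.
#define MPI_STRINGIFY_IMPL(x) #x
#define MPI_STRINGIFY(x) MPI_STRINGIFY_IMPL(x)

// Returns the name under which a test called size_mismatch would be registered.
inline std::string registered_name_example() {
  return MPI_STRINGIFY(MPI_TEST_NAME(size_mismatch));
}

// Usage example: the expected name depends on the compile-time switch.
inline void mpi_test_name_example() {
#ifdef STAN_HAS_MPI
  assert(registered_name_example() == "size_mismatch");
#else
  assert(registered_name_example() == "DISABLED_size_mismatch");
#endif
}
```

Since googletest skips tests whose names start with DISABLED_, a build without STAN_HAS_MPI still compiles the MPI tests but does not run them.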

I think that should work OK. With all these tricks there are no special requirements to the current test system. Should MPI tests need to be run, then one has simply to configure the build system accordingly.

Bob_Carpenter · December 22, 2017, 1:53am

I’ve been out of touch on Discourse for a couple weeks and I missed that there were a bunch of other MPI threads. Great to come back and find problems followed by solutions.
