CmdStan 2.18 MPI

Thanks for the suggestions.

I still have some questions.

  1. Is the right way to compile CmdStan to put CXXFLAGS += -DSTAN_THREADS -pthread in make/local?

I probably didn’t find the right manual, because most of what I know comes from the “Linear, parallell regression” thread.

I compiled cmdstan with CXXFLAGS += -DSTAN_THREADS -pthread in make/local as was suggested in the thread.

Yes, I used map_rect and it works. I was able to run with 10, 15, and 20 shards, and I checked with htop that the corresponding number of threads was running. I didn’t use mpirun, though. The command line I used was:
export STAN_NUM_THREADS=15
time ./NNsigma.2.18.mpi sample …
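
In case it helps anyone reproduce the thread check without htop, here is a minimal sketch of counting a process's threads from the shell. It assumes Linux with /proc mounted, and count_threads is a helper name made up for illustration, not part of CmdStan.

```shell
# Count the threads of a running process via /proc (Linux only).
# count_threads is a hypothetical helper, not a CmdStan tool.
count_threads() {
    pid="$1"
    # Each thread of the process appears as one entry under /proc/<pid>/task.
    ls "/proc/${pid}/task" | wc -l
}

# Example: thread count of the current shell.
count_threads "$$"
```

While the sampler is running, something like `count_threads "$(pgrep -n NNsigma)"` should report roughly STAN_NUM_THREADS plus the main thread, assuming pgrep is available and matches only that one process.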

However, when I tried 100 shards, I found that only the first node was active. All the others were idle.

I also tried mpirun with 20 shards. In that case the progress output was repeated 20 times.