RStan (PyStan) & MPI / GPU

I’m pretty sure you really want emplace_back

Otherwise you’re constructing twice as many threads as you want. I thought there was a bigger reason…

Good point, if that does a copy and the thread’s actually called in the copy. What happens in the original code? Isn’t that also making a copy into a new memory location and thus calling the constructor, or do arrays behave differently?

You initialise twice the threads you want: half of them are default-constructed, but I’m not sure if they’re deleted. … It might work fine but be wasteful.
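For anyone following along, here is a minimal sketch of the doubling being described. The original code is not quoted in this thread, so this assumes it pre-sized a `std::vector<std::thread>` with `n` elements and then appended `n` more. The names `Counts`, `presized_then_append` and `reserve_then_emplace` are made up for illustration. As far as I know, a default-constructed `std::thread` does not start a thread of execution and is not joinable, so the extra elements are empty placeholders rather than running threads.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical result type: how many std::thread objects the vector holds,
// and how many of them actually represent a running thread.
struct Counts {
    std::size_t total;
    std::size_t joinable;
};

// Joins every joinable thread and reports the counts.
static Counts join_all(std::vector<std::thread>& threads) {
    Counts c{threads.size(), 0};
    for (auto& t : threads) {
        if (t.joinable()) {  // default-constructed threads are not joinable
            ++c.joinable;
            t.join();
        }
    }
    return c;
}

// Assumed shape of the problem: vector pre-sized with n default-constructed
// std::thread objects, then n real threads appended on top.
Counts presized_then_append(std::size_t n) {
    std::vector<std::thread> threads(n);  // n empty placeholders
    for (std::size_t i = 0; i < n; ++i) {
        threads.emplace_back([] {});      // n threads that actually run
    }
    return join_all(threads);
}

// Alternative: reserve capacity only, then construct each thread in place.
Counts reserve_then_emplace(std::size_t n) {
    std::vector<std::thread> threads;
    threads.reserve(n);                   // allocates, constructs nothing
    for (std::size_t i = 0; i < n; ++i) {
        threads.emplace_back([] {});
    }
    return join_all(threads);
}
```

With `n = 4`, the first function ends up with 8 `std::thread` objects of which 4 are joinable, and the second with 4 of which all 4 are joinable. On the copy question: `std::thread` is move-only, so `push_back` of a temporary moves it into the vector rather than copying it, and `emplace_back` skips the temporary and constructs in place.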