DevOps

@seantalts, I know someone is helping with devops. Mind introing the person?

I’d also like to start a discussion about how to speed up testing on Math, and it’d be good to have that discussion on the forums.

I’ve hired a contractor named Nicusor Serban, who isn’t on the forums. This is his resume: https://www.toptal.com/resume/serban-nicusor

His purview is really the Jenkins side of things, and I suspect most of the wins for Math testing will be in C++ land (other than what he’s already working on with the EC2 Spot Fleet), so it probably doesn’t make sense to add him to the discussion?

Thanks! That’s helpful.

If it crosses into Jenkins-land, you can loop him in if you think a discussion there would be useful. Btw, I think a lot of the test time has to do with Jenkins and caching; is he managing the resources on Jenkins?

He is managing the resources on Jenkins. It’s hard to estimate his workload, but I think he probably has another few weeks of critical work to do on the infrastructure before we have time to experiment with new features like that. Maybe he could provide quick input on what’s possible, though. For example, relying on a cache will be complicated because 1) the physical machines we own don’t have much disk space and 2) the spot instances we lease are spun up and down as needed, so we’d need to build some kind of external cache server.

Got it… I guess it’d help to just know what resources we have available and then I (or the Math devs) can think about how to use those resources.

I don’t think I have a good idea of what is and isn’t available on the resources Columbia owns and I really don’t know how the AWS stuff is spun up.

Yeah, makes sense. I’ll ask him if he wouldn’t mind making an account here and being involved in the thread w.r.t. Jenkins and compute resources once the discussion gets to that point.

Just a question, but do you use docker for testing?

Not really, why? It’s on the list, but given how slow it’s reported to be, it might not make sense with our 7+ hour test suites.

Slow?

I was just thinking about AWS and having a clearly defined testing environment.

With Docker Hub, everything else could be precompiled, and then only the Stan-related compilation would be done on the image. That said, I think it would need some clever file handling (e.g. some specific Stan version + update with git).

Edit: I think with Docker you could split the tests into n chunks.

This is now done at the AMI level instead, which will probably be checked in once we go monorepo.

I have heard anecdotally that Docker slows down computation done within the container. This makes sense to me as it’s an intermediate layer, but I’m not sure what the penalty is for C++ compilation specifically.

What would splitting the tests help with? And why couldn’t we do that without Docker?

I think this depends on the hardware (not sure). A Docker container should have “real” access to the CPU.

You could probably do that without Docker. Docker just makes it easy to make n copies and run the tests in identical environments (assuming your hardware has the resources for it).
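To make the chunking idea concrete, here is a minimal sketch of splitting a list of test files into n chunks, one per worker (container, spot instance, or plain process). It is not how the Stan test runner works: the function name `split_into_chunks` and the file paths are hypothetical, and round-robin assignment is just one simple choice that ignores how long each test takes.

```python
from typing import List, Sequence


def split_into_chunks(test_files: Sequence[str], n: int) -> List[List[str]]:
    """Distribute test files round-robin across n chunks.

    Hypothetical helper for illustration; it does not balance by test runtime.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # Sort first so every worker computes the same assignment independently.
    ordered = sorted(test_files)
    # Chunk i takes every n-th file starting at position i.
    return [list(ordered[i::n]) for i in range(n)]


# Hypothetical file names, for illustration only.
tests = [
    "test/unit/a_test.cpp",
    "test/unit/b_test.cpp",
    "test/unit/c_test.cpp",
    "test/unit/d_test.cpp",
    "test/unit/e_test.cpp",
]
chunks = split_into_chunks(tests, 2)
for index, chunk in enumerate(chunks):
    print(index, chunk)
```

Each worker would then build and run only its own chunk, so wall-clock time is bounded by the slowest chunk rather than the whole suite. Nothing here depends on Docker; the same split works for any set of identically configured machines.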

Wait, we are going to monorepo? More details, please.

Thanks. I remember this but didn’t know it was about to be deployed.

Well, it’s been an ongoing project for the past 4 months or so - we have someone working on it part time. I can ask him to post updates to that thread if that would help?