DevOps

syclik · April 23, 2019, 4:34am

@seantalts, I know someone is helping with devops. Mind introing the person?

I’d also like to start a discussion about how to make the testing go quicker on Math and it’d be good to have that discussion on the forums.

seantalts · April 23, 2019, 1:24pm

I’ve hired a contractor who isn’t on the forums named Nicusor Serban. This is his resume: https://www.toptal.com/resume/serban-nicusor

His purview is really the Jenkins side of things, and I suspect most of the wins for Math testing will be in C++ land (other than what he’s already working on with the EC2 Spot Fleet), so it probably doesn’t make sense to add him to the discussion?

syclik · April 23, 2019, 1:28pm

Thanks! That’s helpful.

If it crosses into Jenkins-land, you can loop him in if you think it’s useful for there to be a discussion. Btw, I think a lot of it has to do with Jenkins and caching; is he managing the resources on Jenkins?

seantalts · April 23, 2019, 1:35pm

He is managing the resources on Jenkins. It’s hard to estimate workload, but I think he has another few weeks of critical infrastructure work to do before we have time to experiment with new features like that. Maybe he could give quick input on what’s possible or not, though. For example, relying on a cache will be complicated because 1) the physical machines we own don’t have much disk space and 2) the spot instances we lease are spun up and down as needed, so we’d need to build some kind of external cache server.

syclik · April 23, 2019, 1:47pm

Got it… I guess it’d help to just know what resources we have available and then I (or the Math devs) can think about how to use those resources.

I don’t think I have a good idea of what is and isn’t available on the resources Columbia owns and I really don’t know how the AWS stuff is spun up.

seantalts · April 23, 2019, 2:30pm

Yeah, makes sense. I’ll ask him if he wouldn’t mind making an account here and being involved in the thread w.r.t. Jenkins and compute resources once the discussion gets to that point.

ahartikainen · April 23, 2019, 4:22pm

Just a question, but do you use Docker for testing?

seantalts · April 23, 2019, 7:15pm

Not really, why? It’s on the list, but given how slow it’s reported to be, it might not make sense with our 7+ hour test suites.

ahartikainen · April 23, 2019, 7:31pm

Slow?

I was just thinking about AWS and having clearly defined testing env.

With DockerHub, everything else could be precompiled, and then only the Stan-related compilation would be done on the image. That said, I think it would need some clever file handling (e.g. some specific Stan version + update with git).

Edit: I think with Docker you could split the tests into n chunks.

seantalts · April 23, 2019, 7:38pm

This is now done at the AMI level instead, which will probably be checked in once we go monorepo.

I have heard anecdotally that Docker slows down computation done within the container. This makes sense to me as it’s an intermediate layer, but I’m not sure what the penalty is for C++ compilation specifically.

What would splitting the tests help with? And why couldn’t we do that without Docker?

ahartikainen · April 23, 2019, 7:46pm

I think this depends on the hardware (not sure). A Docker container should have “real” access to the CPU.

You could probably do that without Docker. Docker just makes it easy to copy the setup n times and run the tests in an identical environment (assuming your hardware has the resources for it).
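
To make the "split the tests in n chunks" idea concrete, here is a minimal sketch of one way it could be done, with or without Docker. It is not how the Stan Math test runner actually works: the function name `split_into_chunks` and the file names in the usage example are hypothetical, and it assumes the tests can be listed as independent files. Sorting first keeps the assignment deterministic, so every worker computes the same split.

```python
from typing import List, Sequence


def split_into_chunks(test_files: Sequence[str], n_chunks: int) -> List[List[str]]:
    """Deal the sorted test files round-robin into n_chunks lists.

    Hypothetical helper for illustration; not part of any Stan tooling.
    """
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    # Sort so that every worker derives the same assignment independently.
    ordered = sorted(test_files)
    # Chunk i receives items i, i + n_chunks, i + 2 * n_chunks, ...
    return [ordered[i::n_chunks] for i in range(n_chunks)]


if __name__ == "__main__":
    # Made-up file names, used only to show the shape of the result.
    files = [
        "test/unit/c_test.cpp",
        "test/unit/a_test.cpp",
        "test/unit/b_test.cpp",
    ]
    for index, chunk in enumerate(split_into_chunks(files, 2)):
        print(index, chunk)
```

Each worker (a container, a spot instance, or a Jenkins node) would then run only its own chunk. Round-robin balances the number of files per worker, not run time, so a real setup would probably want to weight by measured test duration.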

yizhang · April 23, 2019, 8:06pm

Wait, we are going to monorepo? More details, please.

seantalts · April 24, 2019, 6:03pm

yizhang · April 24, 2019, 6:13pm

Thanks. I remember this but didn’t know it’s about to be deployed.

seantalts · April 24, 2019, 6:46pm

Well, it’s been an ongoing project for the past 4 months or so - we have someone working on it part time. I can ask him to post updates to that thread if that would help?
