How we reduced our build times by running tasks in parallel

August 27, 2017
Continuous Integration Software Engineering

Introduction

At Enhancv, we have been using continuous integration systems in our development process since the very beginning. Initially we picked Codeship because of their free plan. Since we were using it only to do a single deploy to Heroku, it fit our needs perfectly, and the time between committing to our git repository and seeing the changes in production was less than 1-2 minutes. At that time we were rushing our MVP, so we didn’t bother writing tests. As time passed, our build process became much more complex: we added client and server unit tests as well as integration tests for most of our features (we have over 1500 JSX and SCSS files). After we added 2 more AWS Lambdas, the deployment time exceeded 15 minutes. Considering that each branch has its own staging environment (automatically created and deleted as you push/close your branch) and that on average each developer commits 5 times a day, we were wasting more than 1 hour of our time each day waiting for the builds to complete. Our build looked like this:

Enhancv old deployment process

Everything was running in series, so we thought we would probably save a lot of time if we ran the tasks in parallel. Codeship supports parallel pipelines, but they are pretty complicated, so we switched to Travis CI: they have a beta feature called “Stages” that was going to allow us to run many tasks in parallel.

Travis CI build stages

Unfortunately, it looked much better on paper than in reality. The first thing we noticed was that Travis containers were much slower than Codeship’s. We weren’t really happy, but we gave it a try and prepared the parallel configuration. The desired deployment process was something like this:

- Stage 1: Install
  - [parallel] client install
  - [parallel] server install
  - [parallel] integration install

- Stage 2: Test
  - [parallel] client testing
  - [parallel] server testing
  - [parallel] integration testing

- Stage 3: Deploy
  - [parallel] heroku deploy
  - [parallel] lambda 1 deploy
  - [parallel] lambda 2 deploy

- Stage 4: Apply changes
  - Update heroku to point to the freshly deployed lambdas
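
The staged layout above can be sketched in plain shell, independent of any CI provider: tasks inside a stage start as background jobs, and `wait` acts as the barrier between stages. The `run_stage` and `run_pipeline` helpers and the `echo` placeholders are hypothetical stand-ins for the real install, test and deploy commands. This is a minimal sketch of the idea, not the configuration we ran on Travis.

```shell
# Hypothetical helper: run every argument as a command in parallel and
# return non-zero if any of them fails.
run_stage() {
    pids=""
    failed=0
    for cmd in "$@"; do
        sh -c "$cmd" &            # start the task in the background
        pids="$pids $!"
    done
    for pid in $pids; do
        wait "$pid" || failed=1   # barrier: collect every task's exit status
    done
    return "$failed"
}

# Placeholder commands; a later stage starts only if the previous one succeeded.
run_pipeline() {
    run_stage "echo client install" "echo server install" "echo integration install" &&
    run_stage "echo client testing" "echo server testing" "echo integration testing" &&
    run_stage "echo heroku deploy" "echo lambda 1 deploy" "echo lambda 2 deploy" &&
    run_stage "echo apply changes"
}

run_pipeline
```

Within a stage the output order is not deterministic, but no line from a later stage can appear before the earlier stage has finished.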

Unfortunately, each stage was running in its own VM, and for a task that takes 15 seconds to complete (client install, for example) we had to wait 2 minutes for the VM to spawn. So even after we configured our build process to run in parallel, our build time increased by almost 100%. We contacted their support as well as a few build engineers, but they didn’t have a solution for this problem yet.
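
A quick back-of-the-envelope check with the two figures quoted above (a 15-second task and a 2-minute VM spawn) shows how much of such a job is pure startup cost. The variable names are made up for this illustration.

```shell
task_seconds=15        # e.g. the client install
vm_spawn_seconds=120   # time spent waiting for the VM to spawn

total_seconds=$((task_seconds + vm_spawn_seconds))
# Integer division, so the percentage is rounded down.
overhead_percent=$((vm_spawn_seconds * 100 / total_seconds))

echo "total=${total_seconds}s overhead=${overhead_percent}%"
# prints: total=135s overhead=88%
```

When close to 90% of a job is spent waiting for the VM, splitting short tasks into separate jobs costs more than running them in parallel saves.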

At that point we started looking for different solutions. The Codeship VMs were running pretty decent hardware, so we moved back and came up with the idea of building our own parallel system. Initially GNU Parallel caught our eye, but since its interface wasn’t that easy to use, we picked the npm module concurrently. In the end, all we had was a single build file that did all the work. Thanks to concurrently’s --names option, our log stayed pretty readable. The installation step changed from:

(cd client && yarn --silent);
(cd server && yarn --silent);
(cd integration && yarn --silent);

to

concurrently --kill-others-on-fail --names "client,server,integration" \
"cd client && yarn --silent" \
"cd server && yarn --silent" \
"cd integration && yarn --silent"

The test step changed from:

if [[ $GIT_BRANCH != master ]]; then
    bash ./bin/prepare_heroku_staging.sh $APP_NAME
fi;

(cd client && yarn test);
(cd server && yarn test);
(cd integration && yarn build && yarn test);

to

# Default to a no-op so concurrently never receives an empty command on master.
HEROKU_STAGING="true"
if [[ $GIT_BRANCH != master ]]; then
    HEROKU_STAGING="bash ./bin/prepare_heroku_staging.sh $APP_NAME"
fi;

concurrently --kill-others-on-fail --names "client test,server test,integration test,heroku staging" \
"cd client && yarn test" \
"cd server && yarn test" \
"cd integration && yarn build && yarn test" \
"$HEROKU_STAGING"

and the log looked like this:

Parallel building script
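
The log stays readable because every output line is prefixed with the name of the task that produced it. The idea can be approximated in plain shell as follows. This is a sketch of the concept, not how concurrently is implemented: `run_named` is a hypothetical helper, the `echo` commands are placeholders for the real test commands, and unlike `--kill-others-on-fail` this version lets the remaining tasks finish before reporting a failure.

```shell
# Hypothetical helper: run a command and prefix each line of its output
# with "[name] ", while still returning the command's own exit status.
run_named() {
    name=$1
    shift
    status_file=$(mktemp)
    # The exit status goes through a temp file because the pipeline itself
    # would otherwise report the status of sed, not of the command.
    {
        status=0
        "$@" 2>&1 || status=$?
        echo "$status" > "$status_file"
    } | sed "s/^/[$name] /"
    status=$(cat "$status_file")
    rm -f "$status_file"
    return "$status"
}

# Placeholder tasks standing in for the real test commands.
run_named "client test" echo "client tests passed" &
client_pid=$!
run_named "server test" echo "server tests passed" &
server_pid=$!

failed=0
wait "$client_pid" || failed=1
wait "$server_pid" || failed=1
echo "failed=$failed"
```

The names must not contain characters that are special in a sed replacement, such as `/` or `&`; simple labels like the ones above are safe.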

For the integration tests (WebDriverIO), we found that they run faster if you don’t limit the number of PhantomJS instances to the CPU core count but set a higher value instead. We also try to silence all stdout output in order to keep the log clean. Now all tasks within a stage run in parallel. Our deploy also runs in parallel, and when everything is completed, we just change the endpoint to the freshly deployed microservices. We managed to reduce our build times from 15-20 minutes to just 3-4 minutes.
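
As an illustration of oversubscribing the CPU, the instance count can be derived from the core count in the build script and handed to the test runner. The multiplier of 2 and the `WDIO_MAX_INSTANCES` variable name are assumptions made for this sketch, not measured values; the WebdriverIO config would have to read that variable itself and pass it to its `maxInstances` option.

```shell
# Number of online CPU cores; falls back to 1 if getconf cannot report it.
cpu_cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Hypothetical helper: instances = cores * multiplier.
# The default multiplier of 2 is illustrative only and should be tuned by measuring.
max_instances() {
    cores=$1
    multiplier=${2:-2}
    echo $((cores * multiplier))
}

# Hypothetical variable name; nothing reads it unless the test config does.
WDIO_MAX_INSTANCES=$(max_instances "$cpu_cores")
export WDIO_MAX_INSTANCES
echo "cores=$cpu_cores instances=$WDIO_MAX_INSTANCES"
```

The right multiplier depends on how much time the browser instances spend waiting rather than computing, so it is worth timing a few values instead of trusting a fixed one.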

In conclusion, I would like to say that I really like Travis and I’m using it for all of my open source projects, but their “Stages” feature isn’t the best fit (yet) for our use case (I should note that it’s in beta). Concurrently is a great tool that can be used to speed up all kinds of tasks, as it has a great API. In the beginning, I didn’t believe the logs would be readable, but they’re much better than expected.

PS: We’re looking for a Software Engineer to join our product team. You can find more information about the position here. Most of our code is open-sourced on GitHub in case you’re interested in our technology.
