As part of our productivity hack blog series, this post is aimed at Gerrit admins and Jenkins CI owners. It explains how they can configure and tune the systems they maintain so that CI systems put less load on Gerrit servers, leaving those servers better able to serve human users.
Jenkins usage pattern which can create load on Gerrit server
We all understand the importance and benefits of a good Continuous Integration system. It all begins with a new change landing in the source code repository. We want our CI system, such as Jenkins, to grab it and verify its quality as soon as possible. So we configure our Jenkins job to poll the SCM repository, and whenever it discovers a new change, it triggers a new build.
Wonder what is happening behind the scenes? Well, this Jenkins job asks the Gerrit server every 2 minutes for the head of all branches (using git ls-remote) and compares the SHA1 of the configured branch(es) to that of the last build it ran. If the SHA1 doesn’t match, it triggers a new build; otherwise it doesn’t.
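The check described above can be sketched roughly as follows. This is only an illustration of the idea, not the actual code of the Jenkins Git plugin; the function names and the sample SHA1 values are hypothetical.

```python
def parse_ls_remote(output):
    """Map each ref name to its SHA1, given the text printed by `git ls-remote`."""
    heads = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        # Each line has the form "<sha1><TAB><ref name>".
        sha, ref = line.split(None, 1)
        heads[ref.strip()] = sha
    return heads


def needs_build(ls_remote_output, branch, last_built_sha):
    """Return True when the branch head differs from the last built SHA1."""
    current = parse_ls_remote(ls_remote_output).get("refs/heads/" + branch)
    return current is not None and current != last_built_sha


# Hypothetical sample output (SHA1 values are made up for illustration).
sample = (
    "1111111111111111111111111111111111111111\trefs/heads/master\n"
    "2222222222222222222222222222222222222222\trefs/heads/release\n"
)
```

Note that the server has to answer every one of these requests, even when nothing has changed since the previous poll.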
Still doesn’t sound like a big deal? Well, in practice there is often only one central Git (Gerrit) server per organization and multiple Jenkins CI servers (possibly one CI server per team). Within each Jenkins server, there can be more than one job requesting similar information (checking for new commits). What makes things worse is that all of this can happen at the same time. In a nutshell, tons of jobs running on a large number of CI servers talk to one poor Git server at very frequent intervals. This setup can create a huge load on the Git server, which ultimately throttles the Git operations of human users (the developers) and hurts their productivity.
We’ve seen this usage pattern quite often at many of our customers. In software development organizations, developer productivity matters most. Simple tune-ups and adjustments to the Jenkins/Gerrit configuration can prevent high load on the Gerrit server. We’ll go over them one by one.
What can you do?
Use less frequent polling with quiet period
Frequent SCM polling is not always the best strategy. You may argue: I want my CI build to happen as soon as new code arrives in the repo. Fair enough, but it is worth checking some facts and planning your CI builds so that CI servers don’t put a huge load on the Gerrit server. You can improve the situation significantly by knowing your developers’ Git usage patterns: How often is new source code pushed? At what time of day does most of the activity happen? And, of course, how long does the Jenkins job take to build? For example, suppose a team of developers works mostly between 7 AM and 7 PM on weekdays and pushes new code about every 15 minutes on average. For such a team, Jenkins job(s) can be configured to poll SCM on a schedule like the one below.
MINUTE HOUR DOM MONTH DOW
H/15   7-19 *   *     1-5
Using H here is advisable because it adds a hashing factor to the cron schedule: each job gets its own offset within the interval, so the jobs on a Jenkins server are spread out rather than all polling in the same minute.
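To get a feel for what the hashing factor does, here is a small sketch. It is not Jenkins’ actual algorithm; it assumes, for illustration only, that the offset is derived from an MD5 hash of the job name, and the job names are hypothetical.

```python
import hashlib


def hashed_minute(job_name, period=15):
    """Derive a stable offset in [0, period) from the job name (illustrative only)."""
    digest = hashlib.md5(job_name.encode("utf-8")).hexdigest()
    return int(digest, 16) % period


def poll_minutes(job_name, period=15):
    """Minutes of the hour at which a job with schedule H/<period> would poll."""
    offset = hashed_minute(job_name, period)
    return list(range(offset, 60, period))


# Hypothetical job names; each one gets its own, stable set of minutes.
for name in ["curl-master-build", "curl-release-build", "docs-build"]:
    print(name, poll_minutes(name))
```

The point is that the offset is stable for a given job but differs between jobs, so polls are distributed across the interval instead of piling up at minutes 0, 15, 30 and 45.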
By polling less frequently, and only when it is needed, you avoid unnecessary requests to the Gerrit server. Another very interesting Jenkins setting is the Quiet Period. It can be used along with the Poll SCM option so that a burst of commits results in a single build rather than one build per commit. The advantages are nicely described in this blog post from the Jenkins community.
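A quick back-of-the-envelope calculation shows how much this saves. The sketch below assumes one request to the Gerrit server per poll; the numbers of servers and jobs are hypothetical.

```python
def polls_per_week(interval_minutes, hours_per_day, days_per_week):
    """Number of polls one job issues per week for a given schedule."""
    return (60 // interval_minutes) * hours_per_day * days_per_week


# Every 2 minutes, around the clock, every day.
always_on = polls_per_week(2, 24, 7)   # 30 * 24 * 7 = 5040

# H/15 during hours 7-19 (13 hours, inclusive) on weekdays only.
tuned = polls_per_week(15, 13, 5)      # 4 * 13 * 5 = 260

# Hypothetical organization: 5 Jenkins servers with 20 polling jobs each.
jobs = 5 * 20
print(always_on * jobs, "requests per week before tuning")
print(tuned * jobs, "requests per week after tuning")
```

Under these assumptions, a single job goes from 5040 polls per week down to 260, roughly 19 times fewer.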
Replace polling with Gerrit trigger events
What if Jenkins were notified whenever new commits are pushed to the Git repository, instead of checking for them periodically itself? That’s certainly possible if your Git server happens to be Gerrit (TeamForge uses Gerrit as a Git server). Gerrit has a stream-events capability, which the Jenkins Gerrit Trigger Plugin puts to good use. If Gerrit is your Git server, you might want to start using this plugin on your Jenkins server right away! The plugin is capable of doing many things; the most important one is verifying code review related workflows. However, it can also be used as an alternative to the Poll SCM option in Jenkins. Set up the Gerrit trigger as shown in the screenshot below. (More detail on how to set it up)
The next step is to adjust the configuration of each job: in the Source Code Management section, select Gerrit Trigger for the “strategy for choosing what to build” option, as shown. Don’t forget to remove the Poll SCM option from your old configuration.
Once that is done, as shown in the screenshot below, simply configure Gerrit Event as the build trigger and fill in the job’s Gerrit Trigger section. The most important setting here is the Trigger On option. Set it to the Ref Updated event type, which subscribes the Jenkins job to the following event: when a new commit arrives on the ‘master’ branch of the ‘curl’ repository, generate a trigger. That trigger causes a build of the job. Basically, it is a sophisticated replacement for Poll SCM! In a nutshell, this configuration notifies Jenkins whenever the event happens on the Gerrit server, as opposed to Jenkins fetching constantly (and mostly unnecessarily) according to a schedule.
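To illustrate what such a subscription boils down to, here is a minimal sketch that decides whether one line from Gerrit’s event stream should trigger a build. It is not the plugin’s code. It assumes the event is a JSON object with a "type" field and a "refUpdate" object carrying "project" and "refName"; since the ref name may appear either in short form or with the refs/heads/ prefix depending on the Gerrit version, both are accepted. The sample events are made up.

```python
import json


def should_trigger(event_line, project, branch):
    """Return True if this stream event is a ref update on the given project/branch."""
    event = json.loads(event_line)
    if event.get("type") != "ref-updated":
        return False
    update = event.get("refUpdate", {})
    ref_name = update.get("refName", "")
    return (update.get("project") == project
            and ref_name in (branch, "refs/heads/" + branch))


# Hypothetical events, shaped like the assumption described above.
push_to_master = json.dumps({
    "type": "ref-updated",
    "refUpdate": {"project": "curl", "refName": "refs/heads/master",
                  "oldRev": "0" * 40, "newRev": "1" * 40},
})
comment = json.dumps({"type": "comment-added"})

print(should_trigger(push_to_master, "curl", "master"))
print(should_trigger(comment, "curl", "master"))
```

With this model, Jenkins does no work at all until Gerrit reports that something actually changed.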
Partition your user base between CI and human users
Even with the Gerrit trigger set up and the polling schedule optimized, chances are high that the load on the Gerrit server will grow in proportion to the number of CI servers interacting with it. Eventually, the CI systems may keep the Gerrit server so busy with parallel requests that little or no capacity is left for human users, effectively blocking them from the service.
Configure Git plugin & Gerrit trigger to make use of replica
The tips mentioned so far have focused solely on reducing load on a single Gerrit server. Interestingly, a Gerrit server can have its own replica(s). Gerrit has a replication plugin, and TeamForge ships with enterprise-grade Git replication for its Gerrit servers. These replicas can take over all requests coming from CI users and shed a good amount of load from the main Gerrit server. Jenkins jobs can be configured to use the replica server, while human users enjoy exclusive access to the main Gerrit server.
How do you achieve that? If you have a working replication setup for your Gerrit server, it is really not cumbersome. If you have configured your Jenkins jobs as described in the Replace polling with Gerrit trigger events section above, you will have to make some additional changes in the Gerrit trigger configuration of each of your Jenkins servers. Create a new Gerrit trigger server that listens to the Gerrit replica server instead.
You will also have to make additional configuration changes in each job that uses the Gerrit trigger. In the job’s Source Code Management section, change the repository URL to point to the Gerrit replica server instead.
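If you have many jobs to update, the URL change itself is mechanical. The sketch below shows the idea under the assumption that the replica serves the same repository path on the same port as the main server; the host names and the helper name are hypothetical, and URLs with embedded passwords are not handled.

```python
from urllib.parse import urlsplit, urlunsplit


def to_replica_url(repo_url, replica_host):
    """Swap the host of a repository URL, keeping scheme, user, port and path."""
    parts = urlsplit(repo_url)
    netloc = replica_host
    if parts.port:
        netloc = "{}:{}".format(netloc, parts.port)
    if parts.username:
        netloc = "{}@{}".format(parts.username, netloc)
    return urlunsplit(parts._replace(netloc=netloc))


# Hypothetical main server and replica host names.
print(to_replica_url("ssh://jenkins@gerrit.example.com:29418/curl",
                     "gerrit-replica.example.com"))
```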
In the Build Trigger section, select the Gerrit replica trigger that you’ve just configured.
That’s it. From now on, whenever a new change is pushed to the Gerrit server, it gets replicated to the Gerrit replica server, which then notifies the Jenkins server. This notification in turn triggers a new build of the job, which fetches from the replica server. Congratulations! You’ve just made the main Gerrit server and its human users happy.
Variation for Gerrit Review Verification Jobs on Jenkins
If you’re using the Gerrit trigger plugin on Jenkins to verify code review requests (i.e., commenting on reviews and giving Verified +1/-1 votes), you cannot rely solely on the Gerrit replica server, because code review and verification take place on the master Gerrit server only. What you can do, however, is configure the master Gerrit trigger so that it is notified once replication finishes and the triggered job(s) use the replica instead. In other words, get notified by the master Gerrit server, but perform Git operations against the Gerrit replica server. To do that, configure your master Gerrit server trigger as described below.
Go to the Gerrit trigger configuration page on your Jenkins server. Select your already configured master Gerrit trigger server and edit its configuration.
At the very bottom of the configuration page, press the “Advanced” button, which reveals the “Replication Events” section. Select Block Build until Replication is Complete, specify the settings for your Gerrit replica, and save the configuration.
The next step is to configure each of the code review verification jobs that use the Gerrit trigger. Basically, change the Git repository URL to the replica repository URL and save the job configuration.
From now on, Gerrit code review verification jobs will be triggered as soon as a new code review request is made, but the new build will be scheduled only once replication of the code review request has finished. The scheduled build will then use the Gerrit replica server.
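Conceptually, blocking the build until replication completes amounts to waiting for a "replication succeeded" event from every replica. The sketch below illustrates that logic only; it is not the plugin’s implementation. The event type "ref-replicated" and the field names "project", "ref", "targetNode" and "status" are assumptions about what the replication events look like, and the node name and ref are hypothetical.

```python
import json


def replication_complete(event_lines, project, ref, replica_nodes):
    """Return True once every replica node has reported success for the given ref."""
    pending = set(replica_nodes)
    for line in event_lines:
        event = json.loads(line)
        if event.get("type") != "ref-replicated":
            continue
        if event.get("project") != project or event.get("ref") != ref:
            continue
        if event.get("status") == "succeeded":
            pending.discard(event.get("targetNode"))
    return not pending


# Hypothetical events for one review ref and one replica node.
ref = "refs/changes/01/1/1"
events = [
    json.dumps({"type": "patchset-created"}),
    json.dumps({"type": "ref-replicated", "project": "curl", "ref": ref,
                "targetNode": "gerrit-replica.example.com",
                "status": "succeeded"}),
]

print(replication_complete(events[:1], "curl", ref, ["gerrit-replica.example.com"]))
print(replication_complete(events, "curl", ref, ["gerrit-replica.example.com"]))
```

Only once such a check passes is it safe for the job to fetch the review ref from the replica; before that, the ref may simply not exist there yet.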
We’ve covered how to use the Poll SCM feature in Jenkins efficiently and suggested the Jenkins Gerrit Trigger plugin as an alternative. We’ve also discussed how a Gerrit replica can take over much of the work of serving demanding Jenkins servers from the main Gerrit server. These tips are actionable by Jenkins admins or the team that owns Jenkins. We’ve also gone through a tuning option, actionable by the Gerrit admin, which enables Gerrit to distinguish between CI requests and human requests. All in all, these hacks will help improve the performance of Gerrit servers and the overall productivity of developers. Feel free to give your feedback on these tips, and if you have other tips from your own experience maintaining Gerrit/Jenkins tooling, share them in the comments below.