Published November 24, 2019
by Doug Klugh
Shorten the Cycle
Make frequent, small releases so as not to impede the progress of others.1 Avoid modifying large portions of code at once, which leads to large, infrequent releases. Big changes make merging difficult, increase the risk of deployment, and impede both your productivity and your teammates'. Use Feature Toggles to enable Trunk-Based Development, which will help shorten deployment cycles, reduce ceremony, and promote CI/CD. But keep in mind, to deploy in short cycles, you have to build in short cycles and test in short cycles. And if you are going to deploy continuously, you must build continuously and test continuously.
Never Break the Build
Never allow the build to break. To support Continuous Integration (CI), the build must always be kept in a healthy and deployable state. Compile failures, code analysis issues, and integration, build, and test failures must all be caught prior to pushing code to the trunk. Code must never make its way to the trunk if any of these failures occur.
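As a rough illustration of catching failures before code reaches the trunk, the sketch below runs a list of checks and allows a push only when every one of them passes. The names `failed_checks` and `may_push` and the check labels are hypothetical, chosen for this example; in practice each check would invoke a real compiler, analyzer, or test runner.

```python
from typing import Callable, List, Tuple

# A check is a (name, function) pair; the function returns True on success.
Check = Tuple[str, Callable[[], bool]]


def failed_checks(checks: List[Check]) -> List[str]:
    """Run every check and return the names of the ones that failed."""
    failures = []
    for name, run in checks:
        try:
            ok = run()
        except Exception:
            ok = False  # a check that crashes counts as a failure
        if not ok:
            failures.append(name)
    return failures


def may_push(checks: List[Check]) -> bool:
    """Allow the push to the trunk only when no check failed."""
    return not failed_checks(checks)
```

A gate like this could be wired into a local hook or a pre-merge pipeline; the point is only that a single failing check blocks the push.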
If you get into the habit of ignoring build failures, you will start to get used to them, and eventually you will ignore them altogether. Then at some point, to put an end to all those failure alerts, you will likely decide to disable all of those failing tests and promise to go back and Fix Them Later. Yeah, right. Fix them later. That's a good one!
And this is where your test suite begins to look like Swiss cheese: filled with holes left by the failing tests you never went back and fixed. You should have enough confidence in your tests that if they all pass, you can immediately deploy that code. If you don't have that level of confidence, then you should work on improving your tests to the point where you do. That is an absolute must for achieving a mature DevOps model for continuous software delivery.
CI Fundamentals
Whether your work is ready for primetime or not, it should be integrated into your production code base numerous times each day — at the very least, once daily. In many cases, the feature you’re working on will not be complete and will need to be disabled or “turned off” using Feature Toggles to be able to integrate it with production code without affecting other functionality.
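A minimal sketch of a Feature Toggle, assuming a simple in-memory set of enabled feature names: the unfinished code path ships with the rest of the code base but stays dormant until the toggle is switched on. The class `FeatureToggles`, the function `checkout_total`, and the toggle name `loyalty_discount` are invented for illustration.

```python
class FeatureToggles:
    """Holds the names of features that are currently switched on."""

    def __init__(self, enabled=None):
        self._enabled = set(enabled or [])

    def is_enabled(self, name: str) -> bool:
        return name in self._enabled


def checkout_total(prices_in_cents, toggles: FeatureToggles) -> int:
    """Sum the prices; apply the unfinished discount only when toggled on."""
    total = sum(prices_in_cents)
    if toggles.is_enabled("loyalty_discount"):
        # New, still-in-progress behavior: 10% off, in whole cents.
        total -= total // 10
    return total
```

With the toggle off, existing behavior is untouched, so the incomplete feature can be integrated daily. Real systems typically read toggle state from configuration rather than hard-coding it.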
Trunk-Based Development can be an easy practice if your code adheres to the Open/Closed Principle (OCP). This enables you to extend your code without modifying it — so you’re actually changing very little code in the production code base.
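To make the OCP point concrete, here is a small sketch under assumed names (`Exporter`, `CsvExporter`, `TsvExporter`, `run_export`): new behavior arrives as a new class, while the existing abstraction and its caller are left unmodified.

```python
from abc import ABC, abstractmethod


class Exporter(ABC):
    """Stable abstraction: closed for modification, open for extension."""

    @abstractmethod
    def export(self, rows) -> str:
        ...


class CsvExporter(Exporter):
    """Existing production behavior."""

    def export(self, rows) -> str:
        return "\n".join(",".join(str(cell) for cell in row) for row in rows)


def run_export(exporter: Exporter, rows) -> str:
    """Existing caller; it does not change when new exporters are added."""
    return exporter.export(rows)


class TsvExporter(Exporter):
    """The extension: added later without editing any code above."""

    def export(self, rows) -> str:
        return "\n".join("\t".join(str(cell) for cell in row) for row in rows)
```

Because the change set consists almost entirely of a new class, the commit to the trunk stays small and carries little merge risk.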
Whether you’re following the OCP or not, it is important that the work be decomposed into very small slices of functionality. This Agile principle promotes continuous delivery by deploying and releasing functionality often.
Every commit to the trunk should automatically kick off a compile, static code analysis, integration, and build. This ensures that no change made on the trunk breaks the build and that the trunk remains deployable at all times. If the build fails, it should trigger the Andon Cord and initiate an immediate response by every person required to solve the problem. And if the problem cannot be resolved in a brief, predetermined amount of time, the changeset should be rolled back to keep the build in a healthy and deployable state.
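The decision logic described here can be sketched roughly as follows. The stage names, the function names, and the idea of expressing the predetermined time limit as a budget in minutes are all assumptions made for this example, not a prescribed implementation.

```python
from typing import Callable, List, Optional, Tuple

# A stage is a (name, function) pair; the function returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def first_failed_stage(stages: List[Stage]) -> Optional[str]:
    """Run stages in order and stop at the first failure."""
    for name, run in stages:
        if not run():
            return name
    return None


def next_action(failed_stage: Optional[str],
                minutes_since_failure: int,
                fix_budget_minutes: int) -> str:
    """Decide what the team should do with the current changeset."""
    if failed_stage is None:
        return "deployable"
    if minutes_since_failure <= fix_budget_minutes:
        return "swarm"      # Andon Cord pulled: everyone needed responds
    return "rollback"       # time box exceeded: restore a healthy trunk
```

For example, with a 15-minute budget, a failure that is 5 minutes old yields `"swarm"`, while one that is 20 minutes old yields `"rollback"`.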
Once the build completes successfully, it should immediately kick off all automated regression test suites — to verify the quality of the build, including the new changes. These tests must serve as quality gates to ensure confidence in the changes we just committed. If any of these gates fail, they should immediately trigger the Andon Cord (as with the build). If the problem cannot be resolved quickly, the changeset should be rolled back to prevent the error from progressing downstream.
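When a gate fails and cannot be fixed quickly, the rollback needs a target. One plausible approach, sketched here with a hypothetical `rollback_target` function and made-up changeset IDs, is to track whether each changeset passed all gates and revert to the most recent one that did.

```python
from typing import List, Optional, Tuple

# Each entry is (changeset_id, passed_all_gates), ordered oldest first.
History = List[Tuple[str, bool]]


def rollback_target(history: History) -> Optional[str]:
    """Return the most recent changeset that passed every quality gate."""
    for changeset_id, passed in reversed(history):
        if passed:
            return changeset_id
    return None  # no known-good changeset on record
```

Given the history `[("a1", True), ("b2", True), ("c3", False)]`, the function returns `"b2"`, the last changeset known to be deployable.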
The build must always be kept healthy through clean compiles, healthy code analysis, and no integration or build failures. Both the build process and the build itself must be maintained in such a way that it builds quickly and executes fast, certainly within performance requirements.
Any and all failures related to integrating, verifying, and deploying work from the trunk need to be highly visible to every person in the value stream to promote swarming and enable accountability by everyone on the team.
When failures occur, they must trigger an immediate response by every person required to solve the problem. If those people are in the middle of working on something else, they must drop what they’re doing and swarm to resolve the problem.
It is critical that the problem be resolved as quickly as possible to keep the same failure from arising in future changes and to prevent new failures from being introduced. This also prevents the error from progressing downstream, where the cost and effort to resolve the issue would be much greater, not to mention the addition of Technical Debt.
We can avoid major problems by resolving smaller problems earlier in the development lifecycle. And swarming provides numerous learning opportunities and prevents the loss of critical information as time passes and people’s memories continue to fade.
This swarming response is a critical component of Continuous Integration and requires a culture that not only makes it safe for team members to pull the Andon Cord when problems arise, even small ones, but encourages them to do so.