Published September 19, 2021
by Doug Klugh
Supporting CI/CD
Promote rapid product delivery by maintaining a build process that runs within ten minutes. Fast builds support continuous integration, enable frequent feedback, and reduce risk by keeping the delta between releases small. Employ Dependency Management strategies that support building, testing, and deploying only those modules that have changed — minimizing build time. Keep the build healthy and fast through clean compiles, healthy code analysis, and no build or commit-test failures. Facilitate small, frequent releases by building, testing, and integrating continuously.
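One way to picture "building only those modules that have changed" is to walk a dependency graph outward from the changed modules. The sketch below is illustrative only: the module names, the `DEPENDENCIES` map, and the `affected_modules` function are hypothetical, and real build tools track this information in their own ways.

```python
from collections import deque

# Hypothetical module graph: each module maps to the modules it depends on.
DEPENDENCIES = {
    "web": ["billing", "auth"],
    "billing": ["core"],
    "auth": ["core"],
    "reports": ["billing"],
    "core": [],
}


def affected_modules(dependencies, changed):
    """Return the changed modules plus everything that depends on them."""
    # Invert the graph: for each module, which modules depend on it?
    dependents = {}
    for module, deps in dependencies.items():
        dependents.setdefault(module, set())
        for dep in deps:
            dependents.setdefault(dep, set()).add(module)

    # Breadth-first walk from the changed modules through their dependents.
    to_rebuild = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in to_rebuild:
                to_rebuild.add(dependent)
                queue.append(dependent)
    return to_rebuild
```

With this graph, a change to `auth` would require rebuilding only `auth` and `web`, while a change to `core` would touch every module. That difference is why keeping widely shared modules stable helps keep builds short.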
The goal of continuous integration is to always keep the software in a working state. This requires fast (frequent) feedback. And the key to fast feedback is automation.
Software systems are composed of executable code, configuration, host environment, and data. A change to any one of these components can lead to a change in the behavior of the system. We must therefore ensure these components are under control and that changes to any of them are verified. This verification (testing) process should be fully automated to facilitate frequent feedback. These automated tests should verify the proper implementation of the code through unit tests, verify Compliance with Non-Functional Requirements such as capacity, availability, and security, and verify successful code analysis to ensure compliance with expected test coverage, coding standards, and secure coding and configuration practices. All functional acceptance tests should validate that customer expectations are being fulfilled.
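As a rough illustration of automated verification, the checks described above could be expressed as thresholds that a build must satisfy. Everything in this sketch is an assumption made for the example: the metric names, the 80 percent coverage floor, and the `failed_gates` function are hypothetical and would be set by each team.

```python
# Hypothetical gates: metric name -> (kind of limit, limit value).
GATES = {
    "line_coverage_percent": ("min", 80),  # assumed coverage floor
    "failed_tests": ("max", 0),
    "critical_findings": ("max", 0),       # e.g. from static code analysis
}


def failed_gates(metrics, gates=GATES):
    """Return the names of the gates that the given metrics do not satisfy."""
    failures = []
    for name, (kind, limit) in gates.items():
        value = metrics[name]
        if kind == "min" and value < limit:
            failures.append(name)
        elif kind == "max" and value > limit:
            failures.append(name)
    return failures
```

An empty result means the build passed every gate. Any other result names exactly which expectation was missed, which keeps the feedback specific.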
CI Fundamentals
Whether your work is ready for primetime or not, it should be integrated into your production code base numerous times each day — at the very least, once daily. In many cases, the feature you’re working on will not be complete and will need to be disabled or “turned off” using Feature Toggles to be able to integrate it with production code without affecting other functionality.
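A minimal sketch of a feature toggle follows. The `FeatureToggles` class, the `loyalty_discount` toggle name, and the `checkout_total` function are hypothetical; production systems typically load toggle state from configuration or a toggle service rather than a constructor argument.

```python
class FeatureToggles:
    """Holds the set of features that are currently switched on."""

    def __init__(self, enabled=None):
        self._enabled = set(enabled or [])

    def is_enabled(self, name):
        return name in self._enabled


def checkout_total(prices_in_cents, toggles):
    """Sum the prices, applying the unfinished discount only when toggled on."""
    total = sum(prices_in_cents)
    if toggles.is_enabled("loyalty_discount"):
        # Work in progress: integrated into the trunk, but off by default.
        total = total * 90 // 100
    return total
```

With the toggle off, `checkout_total` behaves exactly as it did before the new code was merged, so the incomplete feature can be integrated daily without affecting other functionality.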
Trunk-Based Development can be an easy practice if your code adheres to the Open/Closed Principle (OCP). This enables you to extend your code without modifying it — so you’re actually changing very little code in the production code base.
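To illustrate extending code without modifying it, consider a small report exporter. The class and function names here are hypothetical and chosen only for the example.

```python
from abc import ABC, abstractmethod


class ExportFormat(ABC):
    """Extension point: each new output format is a new subclass."""

    @abstractmethod
    def render(self, rows):
        ...


class CsvExport(ExportFormat):
    def render(self, rows):
        return "\n".join(",".join(str(value) for value in row) for row in rows)


def export_report(rows, export_format):
    """Existing production code: closed for modification."""
    return export_format.render(rows)


# Added later: a new format arrives as a new class.
# Neither export_report nor CsvExport had to change.
class TsvExport(ExportFormat):
    def render(self, rows):
        return "\n".join("\t".join(str(value) for value in row) for row in rows)
```

Because `TsvExport` is purely additive, committing it to the trunk carries little risk of breaking code that already works.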
Whether you’re following the OCP or not, it is important that the work be decomposed into very small slices of functionality. This Agile principle promotes continuous delivery by deploying and releasing functionality often.
Every time code is committed to the trunk, it should kick off an automatic compile, static code analysis, integration, and build. This ensures that none of the changes made on the trunk break the build and that the trunk remains deployable at all times. If the build fails, it should trigger the Andon Cord and initiate an immediate response by every person required to solve the problem. And if the problem cannot be resolved in a brief, predetermined amount of time, the changeset should be rolled back to keep the build in a healthy and deployable state.
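The commit-triggered flow and the rollback rule can be sketched as two small functions. This is a simplified model, not a real CI configuration: the stage names, the function names, and the ten-minute fix window are assumptions for illustration, since the actual time limit is for each team to decide.

```python
def run_commit_pipeline(stages):
    """Run stages in order and stop at the first failure.

    `stages` is a list of (name, callable) pairs; each callable returns
    True on success. Returns the name of the failed stage, or None.
    """
    for name, stage in stages:
        if not stage():
            return name
    return None


def next_action(failed_stage, minutes_since_failure, fix_window_minutes=10):
    """Decide what the team does next (fix_window_minutes is an assumed value)."""
    if failed_stage is None:
        return "proceed"    # build is healthy, continue to the test suites
    if minutes_since_failure <= fix_window_minutes:
        return "swarm"      # Andon Cord pulled, everyone needed helps fix it
    return "rollback"       # not fixed in time, revert the changeset
```

Stopping at the first failed stage keeps the feedback fast, and the explicit time window removes any debate about when to roll back.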
Once the build completes successfully, it should immediately kick off all automated regression test suites — to verify the quality of the build, including the new changes. These tests must serve as quality gates to ensure confidence in the changes we just committed. If any of these gates fail, they should immediately trigger the Andon Cord (as with the build). If the problem cannot be resolved quickly, the changeset should be rolled back to prevent the error from progressing downstream.
The build must always be kept healthy through clean compiles, healthy code analysis, and no integration or build failures. Both the build process and the build itself must be maintained so that the process runs quickly and the resulting software executes fast, certainly within performance requirements.
Any and all failures related to integrating, verifying, and deploying work from the trunk need to be highly visible to every person in the value stream to promote swarming and enable accountability by everyone on the team.
When failures occur, they must trigger an immediate response by every person required to solve the problem. If those people are in the middle of working on something else, they must drop what they’re doing and swarm to resolve the problem.
It is critical that the problem be resolved as quickly as possible to keep the same failure from arising in future changes and to prevent new failures from being introduced. This also prevents the error from progressing downstream, where the cost and effort to resolve the issue would be much greater, not to mention the added Technical Debt.
We can avoid major problems by resolving smaller problems earlier in the development lifecycle. And swarming provides numerous learning opportunities and prevents the loss of critical information as time passes and people’s memories continue to fade.
This swarming response is a critical component of Continuous Integration and requires a culture that not only makes it safe to pull the Andon Cord but encourages team members to do so when problems arise, even small ones.