Published January 4, 2022
by Doug Klugh
Always check code in cleaner than you checked it out. This principle of Continuous Refactoring will extend the value of your software by continuously improving the design and implementation. Always refactor the cruft identified by code smells that are part of the feature under development by applying engineering principles and patterns. Unless they require very basic changes, code smells in other parts of the system should be managed as Technical Debt.
Improve the design of existing code without changing external behavior. As code is modified over time, its integrity organically degrades. Refactoring reverses this process to maintain integrity and code quality. If structure is lacking within the code, be sure to add structure prior to extending functionality. And if unit tests are missing, be sure to add them prior to refactoring. You will want to know the moment you break something.
The first step in cleaning up code is detecting smelly code — symptoms that indicate the possibility of an underlying problem. Some obvious ones (noted at the bottom of this post) may include Anemic Domain Models, Assertion Roulette, Boolean Arguments, Train Wrecks, large methods or classes, multiple responsibilities, etc. These smells should cause you to look deeper to determine if there really is a issue. A code smell by itself is not a problem — rather an indicator of a likely (underlying) problem. Once the problem has been confirmed, then it should be fixed as part of the current feature development or recorded as Technical Debt.
Many of us avoid attempts to improve design because we are afraid of breaking existing functionality or introducing defects. But we can mitigate this fear by making sure we have a comprehensive set of mature unit tests that serve as a Safety Net. Running these tests after every code change will ensure that we catch errors as soon as they are introduced into the code. Then fixing those errors, or simply backing them out, becomes a trivial task. Press CTRL-Z a few times and you are right back to where you started before the error was introduced. And if your tests really are Unit Tests — meaning they are verifying small, isolated units of functionality — they should run fast enough to allow you to run the entire suite after changing each line of code. This is how you alleviate the fear of improving your design and avoid introducing defects.
Promoting Clean Code
Following are some engineering principles and patterns that promote clean code, followed by a few examples of code smells and anti-patterns:
Every object should have a single responsibility and that responsibility should be entirely encapsulated by the class.
Write code that is open for extension but closed for modification.
Build classes in such a way that a derived class can always replace its base class without changing behavior or side effects of the base class.
Don't force classes, modules, or components to depend on (or even know about) methods they don't need.
High level policies should not depend on low-level details.
Design components to be large enough to justify the cost of managing the release cycle. These management activities include assigning release numbers, maintaining release notes with features and known issues, keeping track of dates, committing to grandfather old versions, etc. Others cannot reuse your components without them. Considering the time and effort these activities require, it is unlikely you will be able to manage a large number of components. Better to plan a small number of strategic components that conform to the other Component Cohesion Principles.
Design components by grouping classes together that change for the same reason, while separating those that change for different reasons. This will help minimize the number of components that change when requirements change. Since the classes within a component all have the same Single Responsibility, they all serve the same actor. Therefore, these classes are closed to all but that one responsibility and to the needs of every other actor. So when requirements do change, we will know which components are affected and which are not.
Create components where if one class in the component is used, they’re all used.
Design your software in such a way that the dependencies between your components do not form a cycle. The resulting component dependency structure will form a Directed Acyclic Graph — such as a tree structure. Draw your component graph as a tree and make sure all the arrows point downward. Then rest assured that your dependencies do not form a cycle.
Design your software in such a way that any given component depends on other components that are more stable than it is. In other words, the dependencies within a component architecture should be pointing in the direction of increasing stability. This will ensure that components that need to change frequently will be easy to change and that the stable components that others depend on will not change nearly as often.
Design your software in such a way that components are as abstract as they are stable. This supports the Stable Dependencies Principle and promotes flexibility by ensuring that stable components are easy to extend, even though they are hard to change. This is achieved by applying the Open/Closed Principle, which enables us to easily extend functionality without modifying existing code. This approach also works best when adhering to the Dependency Inversion Principle, which states that dependencies should point towards abstractions and not concretions. This provides very effective dependency management for software components.
Manage side effects within your code by controlling how and where they occur. Write functions such that if they change state, they do not return values. And if they return values, they do not change state. For example, instead of returning error codes, throw exceptions to indicate a failed state change (such as a database update or user login).
Build code that has limited knowledge of other parts of the system and interacts directly only with closely related classes. Methods should be protected from understanding the internal structure of other methods or classes. This is also known as the Principle of Least Knowledge, a proven strategy for promoting information hiding and building high quality, loosely coupled software. This can be achieved through the practice of Tell, Don't Ask.
Building resiliency into your system by identifying exceptions in the requirements, then defining exception handlers during analysis.
Write code in a way that makes it easy to read and easy to understand. Reading code is an inevitable part of maintaining and extending an existing code base. Since developers spend much more time reading code than they do writing it, writing readable code contributes greatly to productivity. Applying many of the prior DevTips will help to achieve effective abstraction, encapsulation, and Separation of Concerns that will greatly promote readability.
Build software that is flexible, reliable, and easy to understand. In other words, build software that is simple. Do not over-engineer the initial design (YAGNI). Start with the simplest solution, then work towards complexity (only as warranted). Ensure your code passes all of its tests, reveals its intent, contains no duplication, and is as small as possible. Writing simple code can often be difficult. It is not about writing code by the simplest means. Simple != easy.
Deliver good software by releasing code that is free from defects in both structure and behavior. A good structure that is free from defects will easily support and encourage changing requirements. It is more important to make software right than it is to make it work. Software that is Flexible can easily be made to work and support long-term enhancements. However, software that works but is Rigid will be difficult to change and will inhibit your ability to support your customers’ future needs.
To decouple functionality, write methods that tell objects what to do. Do not ask for their state, then perform functions on behalf of those objects. This will reduce query functions, encapsulate knowledge, and ensure compliance with The Law of Demeter. While not always feasible, this should be followed as much as possible to promote adaptability.
Design and implement solutions that are needed today; do not invest in capabilities or features that may never be needed. While this may require more effort tomorrow, time is more often wasted by investing in requirements that change or become irrelevant. Building for tomorrow will likely slow down delivery for today. Keep your codebase as small as possible. This will promote simplicity of design, implementation, and validation.
Build abstract classes to encapsulate the responsibility of creating and composing families of related objects. This will help control the classes of objects that a system creates and promotes independent deployability by enabling the system to create objects without depending on those objects. This also enforces consistency among classes of objects by forcing the system to use objects from only one family at a time. This approach enables you to swap factories in and out at runtime, providing real-time flexibility as to how objects are created.
Promote the reuse of existing classes by changing their interfaces into ones that are compatible with other classes. This can be accomplished by creating a new class (Adapter) that implements the interface (Target) and delegates to the original implementation (Adaptee) through object composition. This will dynamically connect the Client to the Adaptee without either of them having knowledge of each other — providing a high level of isolation and flexibility.
Extend functionality by adding properties and behaviors to individual objects instead of entire classes. While inheritance can be used to statically add responsibilities to every subclass instance, decorating individual objects dynamically provides much greater flexibility. This is achieved by enclosing a component within a decorator, which conforms to the interface of the component and provides transparency to the clients of that component. This transparency enables an unlimited number of added responsibilities.
Reduce software complexity by minimizing the communication and dependencies between subsystems. By interposing a Facade object between low-level objects and their clients, you can control what the clients can do and how the low-level objects can respond. This helps to maintain high-level abstraction when applying the Single Responsibility Principle. The clients will have no knowledge that functionality has been encapsulated within smaller objects, each having only one responsibility. The functions represented by the facade can be implemented across numerous objects without the clients ever knowing.
Define an interface for creating objects so that the instantiation of new objects is delegated to methods within the implementing class. This will decouple application-specific classes from your code and enable those factory methods to provide extended versions of objects. Unlike the Abstract Factory Pattern, it is not necessary to instantiate separate factory classes. The downside is you're stuck with just one type of factory and you cannot change it at runtime.
Make tightly coupled code testable by extracting the logic into a separate, easy-to-test component that is decoupled from its environment. This pattern is often applied at the boundaries of a system where logic is difficult to test due to framework dependencies or asynchronous code. Extracting this logic decouples it from the boundary making those objects that interact with the boundary so humble they only need to be tested as part of the build process.
Use the Strangler Pattern to build reusable software by enhancing existing behavior, creating better abstractions, and replacing code that is not quite right. This is an effective method for eliminating technical debt and adhering to the Open/Closed Principle when your code is used by clients outside of your system. Use an [Obsolete] attribute to generate compiler warnings notifying clients that a particular class will eventually be deprecated.
Build flexible software systems by encapsulating functionality within interchangeable classes. This requires related classes that differ only in their behavior. This pattern provides an alternative to conditional statements for selecting desired behavior and eliminates the need to recompile and redeploy a module (or an entire application) any time a condition is added or changed. Use this pattern to leverage polymorphism and promote compliance with the Open/Closed Principle and the Dependency Inversion Principle.
Keep your functions small by extracting code fragments into their own functions. If you have to spend even a small amount of time figuring out what a code fragment does, then it should be extracted into its own function and given a name that describes what it does. Then when you read it again, the purpose of the code should be obvious. This will clarify the intention of your code (separate from the implementation), while making it self-documenting, more maintainable, and more testable.
Use local variables to name complex expressions (or portions of them) to help decompose those expressions into more manageable pieces of logic and provide clarity around the purpose of that code. These simple abstractions help to simplify those expressions, making them easier to read and easier to understand — not to mention how easy it is to provide a hook for debugging.
Refactor code before optimizing to make your software more adaptable to performance tuning. Building software that is well-factored without attention to performance will produce finer granularity for performance analysis — providing effective identification of performance hot spots. Even when refactoring impacts performance, you can apply more effective performance-tuning enhancements with well-factored code. First, write clean code that facilitates performance tuning, then optimize for time and footprint.
Refactor your code by grouping related statements together. This will promote Readability while setting you up for the next refactor — usually Extract Function. It is important to examine the code you are sliding and the code you are sliding over to ensure they do not interfere with each other in a way that would change the observable behavior of the system. As always, ensure reliable test coverage prior to refactoring and rerun tests often.
When a loop is doing multiple things, split it into multiple loops with each doing just one thing. If you need to modify one of the loops, this will ensure that you need to understand only the behavior that you need to modify. This also sets you up for the next refactor — usually Slide Statements or Extract Function. While this forces iterations of multiple loops, it is important to first Refactor, Then Optimize. If the loops turn out to be a performance bottleneck (which is rarely the case), it is easy enough to merge them back together.
As software is developed by making behavioral and structural changes, do not attempt to do both at the same time. Wear one hat to add new capabilities (including tests) without changing existing code, then wear the other hat to restructure the code without changing behavior. Build software by swapping hats frequently to develop and test very small capabilities, then refactor to improve the design and the overall quality of the code.
Verify the expected behavior of unit and component tests by comparing them to the indirect outputs of the System Under Test (SUT) as they occur. This is necessary when the expected results are transient and cannot be verified through State Verification. This requires us to intercept the behavior at an observation point between the SUT and a dependent component. Procedural Behavior Verification can be used to capture indirect outputs of the SUT using Spy Objects. Otherwise, we can load the Expected Behavior Specification into Mock Objects to verify the method calls.
Define the state of your test environment by designing and building a test fixture that supports a single run of a single test. Build a Transient Fixture when its lifetime should persist no longer than the duration of the test method and is initialized prior to each method. Building a Persistent Fixture that survives the life of the test must be torn down within the test method to avoid side effects and keep the fixture fresh.
Avoid conditional test logic by replacing if statements with assertions that fail when the conditions are not satisfied, preventing execution of statements that would cause test errors. This will promote Defect Localization, document pre-conditions enforced by the guard assertions, and minimize the effort required to verify your tests. If you have to write tests for your tests, when would it ever stop?
Now that you’ve mastered the basics of Test-Driven Development, consider the two primary schools of TDD. The London School takes an outside-in, behavior-based approach, which fosters Command-Query Separation, and relies heavily on test doubles — ceding (somewhat) brittle tests. The Chicago School takes an inside-out, state-based approach, which promotes high cohesion, with a greater emphasis on design patterns — although YAGNI is a risk.
Write tests such that each produces the same result from a given initial state without any manual intervention between runs. Unit tests must verify Single Test Conditions by executing a single code path through the System Under Test (SUT) and executing the same, exact code path each time it runs. Verifying one condition for each test helps to minimize Test Overlap and ensures we have fewer tests to maintain if we later modify the SUT. Isolating the SUT ensures that we only have to focus on code paths through a single object.
Build tests in such a way that the number of tests impacted by any change is small. This is achieved by minimizing overlap and dependencies between tests and ensuring changes to the test environment have no affect on our tests. Isolate the System Under Test (SUT) and verify just one condition for each test. Changes that affect test fixtures should be encapsulated behind test utility methods.
Define the state of your test environment by designing and building a Persistent Test Fixture to share system resources across a suite of tests. This will help optimize test execution by managing resources that are expensive to allocate (such as database connections) that can be setup once and shared across multiple tests. Be sure not to persist resources with side effects as this will create coupling between tests and inhibit independent and concurrent testing.
When writing Self-Checking Tests, inspect the state of the System Under Test (SUT) after it has been exercised to ensure it matches the expected state. If the SUT is stateful, this can be achieved through Procedural State Verification or Expected State Specification. Otherwise, Behavior Verification can be used to verify indirect outputs of the SUT as it is being exercised when the state does not change or there is no state to verify. To determine the best approach, we must ascertain whether the expected outcome is a change of state or if we need to examine what occurs while the SUT is being exercised.
Ensure high test coverage by writing each unit test prior to writing the associated production code. Then the only production code that should be written is to pass that failing test. Promote Defect Localization by writing just enough of a test to demonstrate a failure. Ensure good code verification by writing only enough production code to pass the test. Employing these Test-Driven Development (TDD) practices will promote a comprehensive test suite which is essential for Continuous Integration (CI).
Apply this pattern only as a last resort by modifying the System Under Test (SUT) to introduce test-specific behavior. This approach may be required in order to utilize test doubles when dynamic binding is not available. This pattern can be used to bring legacy code under test and provide a Safety Net to facilitate refactoring for even greater testability. These test hooks should be gradually removed as the system evolves into a loosely-coupled, flexible system.
Anemic Domain Models are objects that contain data with little or no associated functionality. Domain models should always be constructed by encapsulating functionality (business logic, calculations, validations, etc.) along with attributes and properties. They should not contain just data — that's the purpose of data structures. This will help to keep your service layer thin. The service layer should only be used to orchestrate behavior contained within domain objects.
Avoid writing tests that verify multiple assertions within a single test method. Numerous assertions make it difficult to determine which one caused a test to fail — inhibiting Defect Localization. This anti-pattern is often caused by Eager Tests that try to minimize the number of tests we need to write. Apply the Extract Function refactoring to tease apart these tests into single-condition tests so that the reasons for test failures are obvious.
Avoid CI Theatre by applying the fundamental practices of Continuous Integration (CI), along with supporting engineering principles, to achieve a mature DevOps Model for continuous software delivery. Precursors of CI Theatre include large user stories, difficult code merges, lengthy code reviews, poor test coverage, long-lived branches, siloed code ownership, broken builds for more than several minutes, and practicing Continuous Isolation by running CI on feature branches. CI is more about culture and mindset than any collection of tools.
Avoid this code smell by ensuring that your test methods do not verify too many test conditions. This often occurs when we try to minimize the number of tests we need to write. A unit test should verify only one condition by executing a single code path through the SUT. We achieve this by minimizing the Cyclomatic Complexity of the test and ensuring the SUT is isolated through Dependency Inversion. Acceptance tests should also be kept as short as possible. Eliminating this code smell will promote Defect Localization and minimize Test Overlap.
Avoid this code smell by encapsulating functionality with the data it changes most often. Feature Envy occurs when a function frequently interacts with data or functions in other modules. Consider the Common Closure Principle for functions that query other components and the Single Responsibility Principle for functions that query other objects within the same component. Apply the principle of Tell, Don't Ask to reduce query functions and encapsulate knowledge. When needed, refactor whole functions by applying the Move Function pattern or split functions using the Extract Function pattern.
Avoid this design smell by building software that is highly modular, highly cohesive, and loosely coupled. A change to one part of your system should never break another part that is completely unrelated. High level policies (i.e. business rules) should never be impacted by changes to low level implementations (i.e. data persistence). Even related functionality should be decoupled enough to extend functionality without affecting related components. There are many engineering patterns and practices that facilitate flexibility, extensibility, adaptability, along with many other qualities that inhibit fragility.
Avoid this design smell by building software from which it is easy to extract and reuse internal components in new environments. This smell is often caused by dependencies that are tightly integrated with other parts of the system. Promote mobility by decoupling components from low-level implementations, such as data persistence, logging, user interfaces, etc. For example, business rules should be encapsulated within components to enable reuse across multiple systems.
Avoid one of the biggest technical debt patterns around — the monolith. Build modular systems by employing engineering practices to decouple your architecture and encapsulate functionality, while embracing modern cloud platforms to optimize reliability, resiliency, durability, scalability, and security for compute and storage systems. Software systems, especially cloud-native applications, should be built as a composite of small components and/or services that can be autonomously developed and independently deployed.
This code smell occurs when a test depends on mysterious external resources, of which the purpose is not well understood. This makes it difficult to understand the relationship between the test fixture and the expected behavior of the test.
Avoid building software that obscures its intent, contains duplication, or provides hooks in anticipation of future extensions. Instead, build an architecture that exposes its use cases, keep your code base as small as possible, and do not over-engineer your initial design (YAGNI). If your code is rigid, you are likely to add a lot of anticipatory design elements to minimize the amount of changes you will need to make later. Evolve your software from a Simple Design that is flexible, reliable, and easy to understand.
Avoid this test smell by writing tests from Simple Designs and refactor as needed to keep your test code clean. Good tests are understood at a glance and clarify the behavior they verify. Structure Matters as much for tests as it does production code. Apply design principles and patterns to enhance the structural quality of tests — as they are every bit as important as production code and should be given the same amount of attention. Behavioral quality can also be enhanced to ensure correctness of the tests by inhibiting errors from hiding within obscure code.
Avoid this design smell by building software that is flexible and easy to change. Rigidity is often observed when a small change forces a complete rebuild and redeploy. Small changes should be able to be built, tested, and deployed very quickly and independently of each other. Long build times are a symptom of high coupling. To promote flexibility, manage the dependencies between modules to ensure when one module changes, the others remain unaffected.
Build your software and your tests in ways that minimize test execution time. Common causes for long running tests include over-engineered test fixtures, asynchronous code, components with high latency, Test Overlap, and too many tests due to a tightly coupled architecture. This causes bottlenecks in Continuous Integration, inhibits rapid feedback facilitated by automated testing, and constrains frequent code merges that are required for Trunk-Based Development.
Think of switch statements as missed opportunities to be polymorphic. Instead of changing or extending a switch (or nested if-else statements), use Polymorphic Dispatch to invert the downstream dependencies. This is easily accomplished by replacing the argument of the switch with an abstract class containing a method that corresponds to the operation of the switch. Then simply create derived classes and implement that method for each of the cases. This will eliminate the need to recompile and redeploy that module (or possibly the entire application) any time a case is added or changed. This approach provides alignment with the Open/Closed Principle — greatly enhancing maintainability, adaptability, extensibility, and agility.
Minimize the number of tests that depend on a particular piece of functionality such that each test condition is covered by exactly one test. Tests that verify the same functionality will usually fail at the same time and require the same maintenance when the functionality is modified. Give attention to tests that verify the same code in different ways as these may indicate different test conditions. Avoid Eager Tests that verify too many test conditions. Picking the right tests is essential in employing a risk-based approach to testing.
Avoid stringing method calls together on a single line of code to invoke methods within different objects. This type of call chain violates the Law of Demeter, while increasing complexity - inhibiting our ability to understand, maintain, and reuse the code. Instead of building a series of "train cars", apply the principle of Tell, Don't Ask to resolve all of these issues.
Avoid this design smell by building loosely coupled software systems through effective dependency management. A Viscose system is one in which development activities (such as building, testing, and deploying) are difficult to perform and take a long time to execute. When components span multiple layers of a system, it is often viscose because even the simplest change is costly to make. Apply the Component Design Principles to help manage the structure of large object-oriented systems.