Code Refactoring

Code refactoring is a direct effort to improve code without creating new modules or features, therefore leading to more user-friendly, clean, and readable code.

Every time we change code without refactoring it, rot worsens and spreads. Code rot frustrates us, costs us time, and unduly shortens the lifespan of useful systems. In an agile context, it can mean the difference between meeting or not meeting an iteration deadline.

Refactoring code ruthlessly prevents rot, keeping the code easy to maintain and extend. This extensibility is the reason to refactor and is the measure of its success. But note that it is only “safe” to refactor the code this extensively if we have extensive unit test suites of the kind we get if we work test-first. Without being able to run those tests after each little step in a refactoring, we run the risk of introducing bugs. If you are doing true test-driven development (TDD), in which the design evolves continuously, then you have no choice about regular refactoring, since that’s how you evolve the design.

Code hygiene

A popular metaphor for refactoring is cleaning the kitchen as you cook. In any kitchen in which several complex meals are prepared per day for more than a handful of people, you will typically find that cleaning and reorganizing occur continuously. Someone is responsible for keeping the dishes, the pots, the kitchen itself, the food, the refrigerator all clean and organized from moment to moment. Without this, continuous cooking would soon collapse. In your own household, you can see non-trivial effects from postponing even small amounts of dish refactoring: did you ever try to scrape the muck formed by dried Cocoa Crispies out of a bowl? A missed opportunity for two seconds worth of rinsing can become ten minutes of aggressive scraping.

Specific “refactorings”

Refactorings are the opposite of fiddling endlessly with code; they are precise and finite. Martin Fowler’s definitive book on the subject describes 72 specific “refactorings” by name (e.g., “extract method,” which extracts a block of code from one method, and creates a new method for it). Each refactoring converts a section of code (a block, a method, a class) from one of 22 well-understood “smelly” states to a more optimal state. It takes awhile to learn to recognize refactoring opportunities, and to implement refactorings properly.

Refactoring to patterns

Refactoring does not only occur at low code levels. In his recent book, Refactoring to Patterns, Joshua Kerievsky skillfully makes the case that refactoring is the technique we should use to introduce “gang of four” design patterns into our code. He argues that patterns are often over-used, and often introduced too early into systems. He follows Fowler’s original format of showing and naming specific “refactorings,” recipes for getting your code from point A to point B. Kerievsky’s refactorings are generally higher level than Fowler’s, and often use Fowler’s refactorings as building blocks. Kerievsky also introduces the concept of refactoring “toward” a pattern, describing how many design patterns have several different implementations, or depths of implementation. Sometimes you need more of a pattern than you do at other times, and this book shows you exactly how to get part of the way there, or all of the way there.

The flow of refactoring

In a test-first context, refactoring has the same flow as any other code change. You have your automated tests. You begin the refactoring by making the smallest discrete change you can that will compile, run, and function. Wherever possible, you make such changes by adding to the existing code, in parallel with it. You run the tests. You then make the next small discrete change, and run the tests again. When the refactoring is in place and the tests all run clean, you go back and remove the old smelly parallel code. Once the tests run clean after that, you are done.

Refactoring automation in IDEs

Refactoring is much, much easier to do automatically than it is to do by hand. Fortunately, more and more integrated development environments (IDEs) are building in automated refactoring support. For example, one popular IDE for Java is eclipse, which includes more auto-refactorings all the time. Another favorite is IntelliJ IDEA, which has historically included even more refactorings. In the .NET world, there are at least two refactoring tool plugins for Visual Studio 2003, and we are told that future versions of Visual Studio will have built-in refactoring support.

To refactor code in eclipse or IDEA, you select the code you want to refactor, pull down the specific refactoring you need from a menu, and the IDE does the rest of the hard work. You are prompted appropriately by dialog boxes for new names for things that need naming, and for similar input. You can then immediately rerun your tests to make sure that the change didn’t break anything. If anything was broken, you can easily undo the refactoring and investigate.