Friday, December 28, 2012

Code metamorphosis

In this post, I would like to show through an example, how we can always progressively improve even the most badly written piece of code.
I will start from a somewhat unintelligible program and, step by step, rewrite it until it really pleases the eyes. Doing so, I will preserve its external behavior, which is roughly specified as follows:
The program inputs 3 possibly null strings. It shall interpret them as 2 integers and an operator among +, -, * and /, to then evaluate the expression and display the result.
Any invalid input value is signaled by an appropriate error message.
I am assuming the existence of two library methods:
  • Integer readInteger(String) transforms a string into the integer it represents. This method expects a non-null argument, and returns null on any string which does not correctly parse as an integer,
  • void println(String) prints its string argument on the standard display and terminates with a new line.
Here is our starting point, a program which fulfills the specification:
void calculate(String number1, String operator, String number2) {
  Integer value1;
  Integer value2;
  if (null != number1) {
    value1 = readInteger(number1);
    if (null != value1) {
      value2 = readInteger(number2);
      if (null != value2) {
        if ("+".equals(operator)) {
          println(value1 + value2);
        } else if ("-".equals(operator)) {
          println(value1 - value2);
        } else if ("*".equals(operator)) {
          println(value1 * value2);
        } else if ("/".equals(operator)) {
          if (!new Integer(0).equals(value2)) {
            println(value1 / value2);
          } else {
            println("Division by zero.");
          }
        } else {
          println("Unknown operator: " + operator);
        }
      } else {
        println("Invalid argument: " + number2);
      }
    } else {
      println("First argument is not a valid number.");
    }
  } else {
    println("Missing first argument.");
  }
}
Even though it does the job, this program does not conform my personal aesthetics. Here are the reasons why:
  1. Lack of structure. There is a complete absence of separation of concerns in this monolithic, heavily indented piece. This piece does not express its intent at all, which is really to perform 3 distinct steps: validate and parse its input, then perform the computation, and at last present the result (or an error message) to the user.
  2. Difficult code extensions. Nowadays, programs often need to change their behavior (sometimes drastically and very quickly). So they must be built in a way that allows for future evolutions. For instance, in our case, what if the result needs to be output in a message box rather than the standard output? Or what if, error messages must be dumped separately on the standard error? Multiple lines of code would need to be rewritten. What about handling more complex expressions (with an arbitrary number of operators)? Or replacing all integers with floating points? Similarly, adding error reporting for integer overflows in an uniform way is not a trivial matter. This code is not cut for change, mainly because of its high level of coupling. Adding more functionality will also add to the current mess and lead to code decay.
  3. Hard to read. For such a small task, this code is obviously too hard to understand. It has too many lines of code (length) and a too high level of imbrication (width) for one method. There is no quick way of breaking it down. If kept as it is, maintenance will be costly.
  4. Hidden bug. Ignoring the fact that integer overflows are not avoided, there is another bug in this small piece. Did you spot it? Don't worry if you did not: the process of cleaning it up will, at some point, make this bug surface.
  5. Low testability. I am having a hard time believing that this piece of code can be thoroughly covered with automatic unit tests. In particular, since it is immediately dumped on the console, I can not think of any straightforward way of sensing the output value of the program. The presentation layer is buried too deeply inside the method body to be readily replaced by a fake implementation suitable for testing purposes. This code needs a serious testability improvement.
Believe it or not, despite its chaotic appearance, this program has its own logic. Let me venture an explanation as to why you may encounter this type of code more frequently than desirable:
  1. Single return. Of the epic fight against gotos, remains one bad habit: namely to have only one return point per method. The modern motivation for this rule is to prepare a single break-point for every method when inspecting the program execution using a debugger. Being of the school of thought that running the debugger is already a sign you lost control over your code, I completely disregard this constraint.
  2. Error cases come second. Did you notice that else branches always contained the error cases? This is so that the correct execution can be read first.
  3. Lazy programming. The coder was driven only by the result to compute. He did not care about readability, maintainability, evolutivity or testability at all. In consequence, inputs are parsed when needed, results are output as soon as they become available in total disregard of any program structure.
  4. Low level defensive programming techniques. Were you surprised by the peculiar shape of conditionals? There are two distinct tricks here. First, all calls to equals are made on a constant object, rather than on a variable. (e.g."*".equals(operator)). The rationale being to avoid potential exception to be thrown if the variable happens to be null. On the surface, this may seem reasonable. But I do not like this coding style. Again I believe it shows a lack of control: at every point, the programmer should know whether any variable may be null, or not. If it may then a check should be performed first to handle this case once and for all. If not, then there is no need to worry. It should not be necessary to sacrifice conditional expressions' readability. In any case, later in this post, I will present a programming technique to drastically minimize any null value flowing through the program. Second, in order to ensure a value is not null, this programmer believes that writing (null != x) is preferable to the more straightforward (x != null). This may be true in C, since a small typographical error and the comparison transforms into the assignment x = null. But not so in Java which has a distinct type of boolean. The type-checker protects you.
  5. Grouped variable declarations. All variables used in this method are declared together at the beginning of the method. This is a C habit, which is unnecessary in more modern languages such as Java or C#. It is preferable to let variables have the tightest possible scope since it reduces the burden of understanding the meaning of many variables too early. It also decreases risks of misuses. Some would oppose that variables declared inside a loop are inefficient, since they will be allocated and freed multiple times. I honestly don't know if that is true: optimizing compilers most probably know better. In any case, I highly doubt that any performance bottleneck ever arises from such practice.
We are now almost ready to start refactoring this piece of code. Before we do so, I am going to assume there is a non-regression test suite for this method. It is going to be run at the end of each rewriting step in order to make sure the program external behavior is preserved.

 

Step I: Early return and program intent

First, let us insert some early returns. This will enable us to clearly separate the 3 phases of the program: input validation, computation and presentation. For various reasons, some of you may not like early returns. I do. Anyways, as you will see, in this case, they are only a temporary artifact to disappear in a later refactoring phase.
void calculate(String number1, String operator, String number2) {
  // Phase 1/3: validate input 
  if (number1 == null) {
    println("Missing first argument.");
    return;
  }
  Integer value1 = readInteger(number1);
  if (value1 == null) {
    println("First number is not a valid integer.");
    return; 
  }
  Integer value2 = readInteger(number2);
  if (value2 == null) {
    println("Invalid integer: " + number2);
    return;
  }
  if (operator == null) {
    println("Missing operator.");
    return;
  }
  // Phase 2/3: computation
  Integer result;
  switch (operator) {
    case "+":
      result = value1 + value2;
      break;
    case "-":
      result = value1 - value2;
      break;
    case "*":
      result = value1 * value2;
      break;
    case "/":
      if (value2.equals(0)) {
        println("Division by zero."); 
        return;
      }
      result = value1 / value2;
      break;
    default:
      println("Unknown operator: " + operator);
      return;
  }
  // Phase 3/3: presentation
  println(result);
}
Here is a list of things we did in this step :
  • Early returns were introduced in order to eliminate error cases as soon as possible. They allowed us to separate the input validation from the rest of the method.
  • An intermediary variable was introduced to store the computation result and separate the presentation layer from the rest of the method.
  • Variable declarations were delayed until needed. Variable initialization is now extremely close from, if not in the same statement as, its declaration.
  • Conditionals were inverted to feel more natural, and the sequence of if/else-if was replaced by a switch statement to make the alternatives more visual.
Admittedly, the program is now longer, but a lot easier to understand. However, in doing so, we also slightly changed the behavior of the program: in the case where the operator is null, the error message became "Missing operator." instead of the previous "Unknown operator: null".

 

Step II: Extract methods and expressions

Let us stop for a moment, and take some time to listen to our code. As I am gaining experience, I am finding this is a worthy practice. Often, the code knows better what it wants to look like. We just have to take notice, handle it softly and follow its lead. Here, the two first large blocks, each starting with a line of comment are eager to become full-fledged methods. In order to increase readability, we will thus extract two methods:
  • Expression parseInput(String number1, String operation, String number2)
  • Integer evaluate(Expression expression)
To do so, we first need to introduce the Expression: an object which encapsulates the output of the parsing phase and which is then evaluated. The expression consists in two integers combined with an operation:
enum EOperation { Plus, Minus, Mult, Div }

class Expression {
  private Integer value1;
  private EOperation operation;
  private Integer value2;
  Expression(Integer value1, EOperation operation, Integer value2) {
    this.value1 = value1;
    this.operation = operation;
    this.value2 = value2;
  }
  public Integer getValue1() { return this.value1; }
  public EOperation getOperation() { return this.operation; }
  public Integer getValue2() { return this.value2; }
}
The code can now easily be rewritten into :
void calculate(String number1, String operator, String number2) {
  Expression expression = parseInput(number1, operator, number2);
  if (expression == null) return;
  Integer result = evaluate(expression);
  if (result == null) return;
  println(result); 
}

Expression parseInput(String number1, String operator, 
                      String number2) {
  if (number1 == null) {
    println("Missing first argument.");
    return null;
  }
  Integer value1 = readInteger(number1);
  if (value1 == null) {
    println("First number is not a valid integer.");
    return null;
  }
  Integer value2 = readInteger(number2);
  if (value2 == null) {
    println("Invalid integer: " + number2);
    return null;
  }
  if (operator == null) {
    println("Missing operator.");
    return null;
  }
  EOperation operation;
  switch (operator) {
    case "+":
      operation = EOperation.Plus;
      break;
    case "-":
      operation = EOperation.Minus;
      break;
    case "*":
      operation = EOperation.Mult;
      break;
    case "/":
      operation = EOperation.Div;
      break;
    default:
      println("Unknown operator: " + operator);
      return null;
  }
  return new Expression(value1, operation, value2);
}

Integer evaluate(Expression expression) {
  Integer result;
  switch (expression.getOperation()) {
    case EOperation.Plus:
      return expression.getValue1() + expression.getValue2();
    case EOperation.Minus:
      return expression.getValue1() - expression.getValue2();
    case EOperation.Mult:
      return expression.getValue1() * expression.getValue2();
    case EOperation.Mult:
      if (expression.getValue2().equals(0)) {
        println("Division by zero.");
        return null;
      }
      return expression.getValue1() / expression.getValue2();
  }
  throw new IllegalArgumentException();
}
There are admittedly now more methods and the code looks even more verbose. But the gain in structure and modularity should be well worth it.
You may be wondering the need to throw an IllegalArgumentException at the end of method evaluate. That is because the compiler does not know this line of code is unreachable (because all enum cases are covered in the switch). It would therefore otherwise emit a compilation error.
Note also, that we did not include any default case in the preceding switch statement. This is done on purpose, so that, if we are to augment the enum with another operation, we will be warned at compile time of all the cases which are not handled.

 

Step III: Exceptions and clean error report

There are two strong code smells left in the previous piece of code:
  • methods parseInput and evaluate both propagate errors up by encoding them into the null return value,
  • error messages are scattered all around.
There is therefore still a high level of coupling between the code which performs the computation task, error handling and the presentation layer. We are going to remedy this situation with the introduction of two exceptions which will respectively signal parse errors and computation errors :
class ParseError extends Exception { }
class DivisionByZero extends RuntimeException { }

void calculate(String number1, String operator, String number2) {
  try {
    Expression expression = parseInput(number1, operator, number2);
    Integer result = evaluate(expression);
    println(result);
  } catch (ParseError e) {
    println("An error occurred while parsing: " + e);
  } catch (DivisionByZero e) {
    println("Division by zero");
  }
}

Expression parseInput(String number1, String operator, 
                      String number2) throws ParseError {
  if (number1 == null) {
    throw new ParseError("Missing first argument.");
  }
  Integer value1 = readInteger(number1);
  if (value1 == null) {
    throw new ParseError("First number is not a valid integer.");
  }
  Integer value2 = readInteger(number2);
  if (value2 == null) {
    throw new ParseError("Invalid integer: " + number2);
  }
  if (operator == null) {
    throw new ParseError("Missing operator.");
  }
  EOperation operation;
  switch (operator) {
    case "+": 
      operation = EOperation.Plus;
      break;
    case "-":
      operation = EOperation.Minus;
      break;
    case "*":
      operation = EOperation.Mult;
      break;
    case "/":
      operation = EOperation.Div;
      break;
    default:
      throw new ParseError("Unknown operator: " + operator);
  }
  return new Expression(value1, operation, value2);
}

Integer evaluate(Expression expression) {
  Integer result;
  switch (expression.getOperation()) {
    case EOperation.Plus:
      return expression.getValue1() + expression.getValue2();
    case EOperation.Minus:
      return expression.getValue1() - expression.getValue2();
    case EOperation.Mult:
      return expression.getValue1() * expression.getValue2();
    case EOperation.Mult: 
      if (expression.getValue2().equals(0)) {
        throw new DivisionByZero();
      }
      return expression.getValue1() / expression.getValue2();
  }
  throw new IllegalArgumentException();
}
I am now fairly satisfied with the current state of the code. We still can and will go a bit further. However, before doing so, we are first going to eliminate a bug and improve test coverage.


Step IV: Fixing a bug and DRY investment

By now, it should be obvious! The parsing phase has a flaw in that it does not check whether the second number may be null. Let us first write a non-regression test which fails:
public void 
test_parseInput_ShouldThrowParseErrorWhenSecondNumberIsNull() {
  try {
    Program program = new Program();
    program.parseInput("0", "+", null);
    Assert.fail();
  } catch (ParseError e) {
  }
}
Please take note how straightforward that test was to write. Writing a similar test harness with the first, monolithic, version of the code, would have been a lot less fluid. It would have required us to first redirect the standard output into a buffer, in order to then assert the correct error message is printed out. Learning how to write such complicated tests is a good exercise. Especially so if you are working with legacy code. But in the long run, it is not worth it. It is largely preferable to rewrite your code in order to make it easily testable.
Fixing the bug would be equally easy. But we are going to make a bit more work than necessary and take this opportunity to further clean up the code. Faithful to the DRY (Don't Repeat Yourself) motto, we will eliminate any repetition by introducing three more methods :
  • ensureIsNotNull(String input) throws a ParseError on a null input,
  • parseInteger(String input) converts the string representation of an integer into its value, and throws a ParseError to signal failure,
  • parseOperation(String input) converts the string representation of an operation into a EOperation.
Here are the methods whose code is modified:
Expression parseInput(String number1, String operator, 
                      String number2) throws ParseError {
  Integer value1 = parseInteger(number1);
  EOperation operation = parseOperation(operator);
  Integer value2 = parseInteger(number2);
  return new Expression(value1, operation, value2);
}

void ensureIsNotNull(String input) throws ParseError {
  if (input == null) throw new ParseError("Missing input.");
}

Integer parseInteger(String input) throws ParseError {
  ensureIsNotNull(input);
  Integer result = readInteger(input);
  if (result == null) throw new ParseError("Integer expected.");
  return result;
}

EOperation parseOperation(String input) throws ParseError {
  ensureIsNotNull(operator);
  switch (input) {
    case "+": return EOperation.Plus;
    case "-": return EOperation.Minus;
    case "*": return EOperation.Mult;
    case "/": return EOperation.Div;
    default:
      throw new ParseError("Unknown operator: " + operator);
  }
}
I would like to draw your attention to the fact that it is extremely easy (in comparison to a large if else block) to move an unitary instruction such as ensureIsNotNull(number1); around the code. We could for instance, push it up, in an effort to eradicate null values as early as possible. Removing all null values at system boundaries, allows us to safely assume no value can ever be null in the code body. Reducing of that much the risk of null pointer exceptions.

Even more importantly, at the same time we fixed the bug, we limited even more the probability of a similar bug to reappear later. Here, the safeguard consists in having method ensureIsNotNull being called inside method parseInteger. So that, as long as we use this method to parse any other integer, then the same mistake can simply not, by construction, be made.


Step V: Improving testability and seam

Unit tests that assert the correctness of error messages would also be nice to have. These messages may, for instance, be contractual.
One way to do so, would be to redirect the standard output into a buffer (using method System.setOut in Java). Another, simpler way, would be to plug a fake output stream inside the program constructor and write the test as follows:
public void test_calculate_SignalsDivisionByZero() {
  FakeOutpuStream output = new FakeOutputStream();
  Program program = new Program(output);
  program.calculate("10", "/", "0");
  Assert.assertEquals("Division by zero", 
                      output.getLastPrintedLine());
}
There are two necessary ingredients to make such a test possible. First, a FakeOutputStream, that is a custom OutputStream which really does nothing except store the last printed string. Method getLastPrintedLine can then be called to retrieve this string in order to build the assertion. Second and more importantly, it should be possible to choose the output stream used for all outputs of the program. The best way is to have the program constructor accept this dependency as a parameter. After the previous refactoring steps, methods println is now called in very few different places and it is thus very easy to modify the program to this end:
class Program {
  private OutputStream output;

  Program(OuputStream output) {
    this.output = output;
  }

  void calculate(String number1, String operator, String number2) {
    try {
      Expression expression = parseInput(number1, operator, number2);
      Integer result = evaluate(expression);
      this.output.println(result);
    } catch (ParseError e) {
      this.output.println("An error occured while parsing: " + e);
    } catch (DivisionByZero e) {
      this.output.println("Division by zero");
    }
  }

  // ...
}
As often, by making the program more testable, we also improved its design. The fact that this piece of code is printing to a stream can now be deduced looking at the constructor signature only. It also becomes easy to plug different kinds of presentation layer, as long as they extend the OutputStream interface.
To inject the output stream program dependency may have seem fairly easy to do here. Sometimes, it may require more work or may not appear so evidently. In any case, it is a very powerful and sane programming technique. This technique supports the stance that we are not building monolithic programs but rather program components. A major coding attitude shift.
As a final note for this section, let me admit that error messages should ideally not be present as constant strings in the program but rather all read from an external resource. But that is for another story.


Step VI: Evaluation the OO way

Let us now wander into the realm of (hard-core) object orientation, and for some of you maybe superfluous bonuses. This next rewrite step originates from a simple observation:
Method evaluate does not need the instance of Program at all, but it rather extensively accesses fields of its unique argument, the Expression.
Consequently the program would surely gain in encapsulation and modularity, if we were to move this method into class Expression. Going a step further, it would make even more sense to dispatch the treatment for each operator into the operations themselves which would relieve the need for the switch statement. It is actually common knowledge that switch statements rigidify object oriented programs. The conventional remedy is a small amount of object inheritance. Here is how it can be done in practice:
interface IOperation {
  Integer evaluate(Integer value1, Integer value2); 
}

class PlusOperation implements IOperation {
  public Integer evaluate(Integer value1, Integer value2) {
    return value1 + value2;
  }
}

class MinusOperation implements IOperation {
  public Integer evaluate(Integer value1, Integer value2) {
    return value1 - value2;
  }
}

class MultOperation implements IOperation {
  public Integer evaluate(Integer value1, Integer value2) {
    return value1 * value2;
  }
}

class DivOperation implements IOperation {
  public Integer evaluate(Integer value1, Integer value2) {
    if (value2.equals(0)) throw new DivisionByZero();
    return value1 / value2;
  }
}

class EOperation {
  public static final PLUS = new PlusOperation();
  public static final MINUS = new MinusOperation();
  public static final MULT = new MultOperation();
  public static final DIV = new DivOperation();
}

class Expression {
  private Integer value1;
  private IOperation operation;
  private Integer value2;

  Expression(Integer value1, IOperation operation, Integer value2) {
    this.value1 = value1;
    this.operation = operation;
    this.value2 = value2;
  }

  Integer evaluate() {
    return this.operation.evaluate(value1, value2); 
  }
}

class Program {
  // ...
  void calculate(String number1, String operator, String number2) {
    try {
      Expression expression = parseInput(number1, operator, number2);
      Integer result = expression.evaluate();
      this.output.println(result);
    } catch (ParseError e) {
      this.output.println("An error occurred while parsing: " + e);
    } catch (DivisionByZero e) {
      this.output.println("Division by zero");
    }
  }
  // ...
} 
Each operation has now its own class. These classes all inherit from a common interface which declares an evaluate(String, String) method. To ensure some sort of backward compatibility, we kept an EOperation class, which can be used as an enum by providing several static, final, non-modifiable singleton operations. To be honest, I would never use this trick in an industrial setting. (And don't worry, I will get rid of it in the final refactoring step). For all the reasons exposed in my previous post (see A case against static), this is borderline and unnecessary. But I wanted to make the point that the parsing part of the code need not be changed at all in this refactoring phase (plus, for once, I wanted to please all the memory performance junkies out there: doing so avoids a re-allocation of an IOperation at every call to method calculate).

Without any doubt, this rewrite steps improved the code structure. Note how none of the fields in class Expression escapes the private scope anymore. A sure sign of better encapsulation. Furthermore, modification of the code to handle another operation would be largely more localized than in the initial piece.


Step VII: Ultimate polishing step

Before feeling satisfied, we are going to eliminate the last code smells: the presence of a switch statement and a (fake) enum. To do so I will use yet another trick: that of changing control flow code into data. We will simply put all operations into a hash-map indexed by the string representation of the operation. Then parsing any operator simply amounts to getting the corresponding operation into the hash-map. If no operation is found, then it is a parsing error. As a bonus, since the hash-map will be initialized once in the program constructor, there is really no need for the fake EOperation enum anymore. Here are the two methods which change during this rewrite step:
class Program {
  private OutputStream output;
  private HashMap<IOperation> operations;

  Program(OuputStream output) {
    this.output = output;
    this.operations = new HashMap<IOperation>();
    this.operations.put("+", new PlusOperation());
    this.operations.put("-", new MinusOperation());
    this.operations.put("*", new MultOperation());
    this.operations.put("/", new DivOperation());
  }
  // ...
  IOperation parseOperation(String input) throws ParseError {
    ensureIsNotNull(operator);
    IOperation result = this.operations.get(operator);
    if (result == null) 
      throw new ParseError("Unknown operator: " + operator);
    return result;
  }
  // ...
}

 

A word on the final code version

Let us have a look at the final version of the code:
interface IOperation { 
  Integer evaluate(Integer value1, Integer value2); 
}

class PlusOperation implements IOperation {
  public Integer evaluate(Integer value1, Integer value2) {
    return value1 + value2;
  }
}

class MinusOperation implements IOperation {
  public Integer evaluate(Integer value1, Integer value2) {
    return value1 - value2;
  }
}

class MultOperation implements IOperation {
  public Integer evaluate(Integer value1, Integer value2) {
    return value1 * value2;
  }
}

class DivOperation implements IOperation {
  public Integer evaluate(Integer value1, Integer value2) {
    if (value2.equals(0)) throw new DivisionByZero();
    return value1 / value2;
  }
}

class Expression {
  private Integer value1;
  private IOperation operation;
  private Integer value2;

  Expression(Integer value1, IOperation operation, Integer value2) {
    this.value1 = value1;
    this.operation = operation;
    this.value2 = value2;
  }

  Integer evaluate() {
    return this.operation.evaluate(value1, value2);
  }
}

class Program {
  private OutputStream output;
  private HashMap<IOperation> operations;

  Program(OuputStream output) {
    this.output = output;
    this.operations = new HashMap<IOperation>();
    this.operations.put("+", new PlusOperation());
    this.operations.put("-", new MinusOperation());
    this.operations.put("*", new MultOperation());
    this.operations.put("/", new DivOperation());
  }

  void calculate(String number1, String operator, String number2) {
    try {
      Expression expression = parseInput(number1, operator, number2);
      Integer result = expression.evaluate();
      this.output.println(result);
    } catch (ParseError e) {
      this.output.println("An error occured while parsing: " + e);
    } catch (DivisionByZero e) {
      this.output.println("Division by zero");
    }
  }

  Expression parseInput(String number1, String operator, 
                        String number2) throws ParseError {
    Integer value1 = parseInteger(number1);
    IOperation operation = parseOperation(operator);
    Integer value2 = parseInteger(number2);
    return new Expression(value1, operation, value2);
  }

  void ensureIsNotNull(String input) throws ParseError {
    if (input == null) throw new ParseError("Missing input.");
  }

  Integer parseInteger(String input) throws ParseError {
    ensureIsNotNull(input);
    Integer result = readInteger(input);
    if (result == null) throw new ParseError("Integer expected.");
    return result;
  }

  IOperation parseOperation(String input) throws ParseError {
    ensureIsNotNull(operator);
    IOperation result = this.operations.get(operator);
    if (result == null) {
      throw new ParseError("Unknown operator: " + operator);
    }
    return result;
  }
}

First, if you examine conditionals, you will see they are all of the most simple form: if some condition, then throw some exception. Two of them perform input validation, the other two filter results returned by calls to library functions (parseInteger and get from the HashMap can both return null). To sum up, conditionals are all placed at the system boundaries to eliminate unwanted values. It is no coincidence, but rather a choice to perform defense at the perimeter and to fail brutally the earliest possible.
Collecting a few metrics, we can produce the following tables:

Structure Classes Methods Interfaces Hierarchy depth
First program 1 1 0 0
Last program 6 12 1 1

Complexity Instructions Max. instructions/function Max. imbrication level
First program 26 26 5
Last program 44 7 1

Control flow if else switch throw catch
First program 8 8 0 0 0
Last program 4 0 0 3 2

Features occurrences of null enum generic
First program 3 0 0
Last program 3 0 1

Even though this program has more code than the initial version, methods are all very short. The longest one (method evaluate) has only 7 instructions. And the maximum imbrication depth is only one. Taken in isolation, each method does not amount to much, but it is the combination which proves valuable. Testing multiple small methods is easy. So much so, that some developers will not understand the value of such simple tests: seeing it only as an unnecessary repetition of what is already obvious in the code itself. Adding new behavior will be faster and more localized than in the initial monolithic version.
In brief, that is how I like software: minimalist.


Conclusion

In conclusion, I would like programmers to not satisfy themselves with code written in the style of the first sample of this post. Aesthetic matters.
At another level, I would like managers to understand that code quality is desirable, that change is possible. Not only that, but semantics preserving micro-changes are possible too. We performed a drastic code rewrite a step at a time. Each step was fairly small, allowing small commits and minimizing risks. Success is a journey, not a starting point.

Refactoring code is a multi-facet activity. Let me recapitulate the various techniques that were introduced in this post:
  • separating concerns (step I),
  • inverting conditionals and using early returns to make the code more malleable (step I),
  • slightly altering the program behavior (sometimes even the specification) to open up more factorization opportunities (step I),
  • detecting code smells (step II),
  • replacing comments by well-named methods (step II), 
  • extracting methods to improve structure (step II),
  • using exceptions to reduce cognitive load and fail early (step III),
  • cleaning up code to let bugs surface (step IV),
  • adding test before fixing bug (step IV),
  • DRY (step IV),
  • taking advantage of fixing bugs to clean up code (step IV),
  • making code testable (step IV and V),
  • injecting dependencies as constructor parameters to enable tests (step V),
  • replacing switch statements with inheritance (step VI),
  • increasing object encapsulation by moving methods around (step VI),
  • converting code into data: switch into hash-maps (step VII),
  • eliminating null from the system, filtering it at the boundaries, having defense at the perimeter (final word).