The refactoring rules in Writing Efficient Programs tend to be at a lower level than the refactorings described in M. Fowler's Refactoring book: they're focused more on performance issues, sometimes even at the expense of code readability. (If you use these rules for performance tuning, measure your program to ensure that performance is actually improving: some rules work better "forward" on some machines and "backwards" on others, and some can interfere with what your compiler does.) Not all the rules are strictly performance-based: some will enhance your understanding of what the program does.
We'll discuss these rules in four categories:
The names don't always agree with Bentley's originals; I've tried to use Fowler's approach of using an active verb phrase ("do this").
The approach will be to show a motivating example "before and after", like this:
The part that's going to change (on the left) is shown in italics; the part that has changed (on the right) is shown in bold. Any discussion follows the example.
Some functions can only grow in one direction: "and" conditions can never go from false to true, positive integers added together get bigger. This table suggests a few possibilities:
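Since the original table isn't reproduced here, a minimal sketch of exploiting monotonicity (the method and names are hypothetical, not from the article): a running sum of positive costs can only grow, so once it crosses a budget we can stop scanning immediately.

```java
public class MonotoneExit {
    // A sum of positive costs is monotone: once it exceeds the budget,
    // no later addition can bring it back down, so we can exit early.
    public static boolean exceedsBudget(int[] costs, int budget) {
        int total = 0;
        for (int cost : costs) {
            total += cost;            // costs are assumed positive
            if (total > budget) {
                return true;          // monotone: safe to stop here
            }
        }
        return false;
    }
}
```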
In the example, we've moved initialization from when the routine is called to class loading time (and possibly compile-time). This table shows initialization time from latest to earliest:
Notice the use of final static in the example: it says to create the value only once, when the class is loaded, and never permit it to be changed after that.
This is an example of "Initialize Earlier", implemented in a specific way.
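A minimal sketch of the idea (the class and field names are illustrative, not the article's example): the table of values is built once at class-load time rather than on every call.

```java
public class Suits {
    // Initialized once, when the class is loaded, and never reassigned.
    private static final String[] SUIT_NAMES =
        { "clubs", "diamonds", "hearts", "spades" };

    public static String suitName(int suit) {
        return SUIT_NAMES[suit];    // no per-call initialization cost
    }
}
```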
This code assumes that nothing else uses the computed boolean value (so we can eliminate the variable). It may seem odd to end up with two copies of the intervening body ("f()" in the example), but once this code is in each branch it can be customized for the situation at hand.
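A sketch of the transformation under discussion, with hypothetical names (the article's original before/after code isn't shown here): the boolean variable disappears, and a copy of the intervening body lands in each branch.

```java
public class BranchCopy {
    // Before: remember the test in a flag, do f(), then branch on the flag.
    public static int before(int x) {
        boolean wasNegative = x < 0;
        int y = f(x);                 // the intervening body
        if (wasNegative) {
            return -y;
        } else {
            return y;
        }
    }

    // After: test once and put a copy of f() in each branch; the flag is
    // gone, and each copy can now be customized for its situation.
    public static int after(int x) {
        if (x < 0) {
            return -f(x);
        } else {
            return f(x);
        }
    }

    private static int f(int x) { return x * x + 1; }
}
```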
There are times when you're computing related functions on the same set of inputs. If you only knew that both results were needed, you could compute both at the same time more cheaply than one after the other. This may be because you're able to apply other techniques (such as Fuse Loops) to the paired code, or it may be because of some mathematical relationship between them (sin() and cos() share some intermediate computations).
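A simple sketch of paired computation (min and max stand in for the article's examples): one pass over the data produces both results, where two separate scans would each pay the full traversal cost.

```java
public class MinMax {
    // One pass computes both results; computed separately, each
    // would cost a full scan of the array.
    public static int[] minAndMax(int[] a) {
        int min = a[0], max = a[0];
        for (int i = 1; i < a.length; i++) {
            if (a[i] < min) min = a[i];
            else if (a[i] > max) max = a[i];
        }
        return new int[] { min, max };
    }
}
```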
The function can be fairly complicated. In a previous paper (Refactoring: An Example) we described a situation where a template would repeatedly scan a string and replace patterns, but if the template knew all strings for any replacement, it could do it in one pass by working on whatever pattern it found next.
The expression being moved must not have side-effects (because it goes from being evaluated twice to being evaluated only once), and it must not be able to change its value between the two uses (even in the presence of threads).
This rule lets you replace computations with equivalent but cheaper ones.
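One common instance of this rule (an illustration, not the article's own example): since both sides of a distance comparison are non-negative, squaring both sides preserves the comparison and eliminates a square-root call.

```java
public class Identity {
    // Before: Math.sqrt(dx*dx + dy*dy) <= radius
    // After: both sides are non-negative, so compare the squares instead
    // and skip the sqrt call entirely.
    public static boolean withinRadius(double dx, double dy, double radius) {
        return dx * dx + dy * dy <= radius * radius;
    }
}
```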
By adding a little extra storage and putting a sentinel value there, we're able to combine the tests "a < SIZE" and "a[pos] != searchValue".
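A sketch of the sentinel technique described above (method and parameter names are illustrative): the caller supplies an array with one spare slot at the end, where the search value is planted so the loop needs only one test.

```java
public class SentinelSearch {
    // "a" must have at least one spare slot past "size" for the sentinel.
    public static int indexOf(int[] a, int size, int searchValue) {
        a[size] = searchValue;           // plant the sentinel in the spare slot
        int pos = 0;
        while (a[pos] != searchValue) {  // no separate "pos < size" test needed
            pos++;
        }
        return pos < size ? pos : -1;    // stopping on the sentinel means "not found"
    }
}
```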
This is also a special case of Initialize Earlier. ("Hoist" brings in the idea of lifting the code up.) Because the computation is inside a loop, its cost may be paid multiple times. Be careful that other threads can't change the value (lest it not be constant in the loop). The expression must have no side effects (as it's now evaluated only once).
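A before/after sketch of hoisting (the computation hoisted here is illustrative): the invariant expression is paid for once rather than on every iteration.

```java
public class Hoist {
    // Before: the invariant factor is recomputed on every iteration.
    public static double[] scaleBefore(double[] values, double base) {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            double factor = Math.pow(base, 3);  // loop-invariant, yet inside the loop
            out[i] = values[i] * factor;
        }
        return out;
    }

    // After: the invariant computation is hoisted and paid only once.
    public static double[] scaleAfter(double[] values, double base) {
        double factor = Math.pow(base, 3);
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = values[i] * factor;
        }
        return out;
    }
}
```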
This is one of those "touchy" transformations. You may find yourself running it "backwards" more often than forwards (i.e., realizing that a repetitive sequence of statements can be put in a loop). The benefit is that it reduces or eliminates the overhead of the loop; this can usually only pay off when the loop overhead is high relative to the computation being done. You don't have to completely unroll the loop; you could run it half as many times by having two copies of the body (and making any adjustments for odd numbers). Make sure the subsequent copies of the body take into account the updated value of the loop index.
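A sketch of partial unrolling (summing an array is an illustrative stand-in): the body appears twice, the loop runs half as many times, and the odd-length case is handled before the loop.

```java
public class Unroll {
    // Sum with the loop body duplicated: half the iterations, hence half
    // the loop overhead; an odd leftover element is handled up front.
    public static int sum(int[] a) {
        int total = 0;
        int i = 0;
        if (a.length % 2 == 1) {      // adjustment for odd lengths
            total += a[0];
            i = 1;
        }
        for (; i < a.length; i += 2) {
            total += a[i];            // first copy of the body
            total += a[i + 1];        // second copy accounts for the updated index
        }
        return total;
    }
}
```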
The two loops must run the same number of times (or be made to). The two bodies should be independent, so merging them won't let them interfere with each other.
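A sketch of loop fusion under those conditions (the two computations are illustrative): both loops run over the same range and their bodies are independent, so they merge safely into one loop that pays the overhead once.

```java
public class Fuse {
    // Before: two independent passes over the same range.
    public static int[] separate(int[] a) {
        int sum = 0, squares = 0;
        for (int i = 0; i < a.length; i++) sum += a[i];
        for (int i = 0; i < a.length; i++) squares += a[i] * a[i];
        return new int[] { sum, squares };
    }

    // After: same trip count, independent bodies, one loop.
    public static int[] fused(int[] a) {
        int sum = 0, squares = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
            squares += a[i] * a[i];
        }
        return new int[] { sum, squares };
    }
}
```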
This is the compiler optimization "Reduction in Strength". The "reduction" refers to the fact that we've converted the multiplication to an addition. This transformation is related to "Apply Algebraic Identity".
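A sketch of the reduction (generating multiples stands in for the article's example): the per-iteration multiplication becomes a running addition.

```java
public class Strength {
    // Before: a multiplication in every iteration.
    public static int[] multiplesBefore(int n, int step) {
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = i * step;        // multiply each time around
        }
        return out;
    }

    // After: the multiplication is reduced to an addition.
    public static int[] multiplesAfter(int n, int step) {
        int[] out = new int[n];
        int value = 0;
        for (int i = 0; i < n; i++) {
            out[i] = value;
            value += step;            // addition replaces i * step
        }
        return out;
    }
}
```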
The "Procedure" rules J. Bentley describes are more complicated than I can properly explain, so I'll summarize them and refer you to Writing Efficient Programs for the details.
Bentley's first procedure rule is equivalent to M. Fowler's refactoring "Inline Method" (and "Extract Method"). See Refactoring for a full discussion.
Use Coroutines for Multi-Pass Algorithms
A coroutine is a variant of the idea of a subroutine where the two routines call each other in a peer relationship rather than master-slave. (See Knuth, vol. 1.) Multiple calls of the coroutine lead to it resuming where it left off, rather than recursively starting at the top again. Java uses subroutines, not coroutines, so you're not going to easily create code that uses that approach. (A Java program could probably use state machines of some sort to simulate them but it would not be an "accidental" implementation.)
Bentley points out a related issue though: some designs will read and write complete files in a multi-pass way, but an implementation could be designed to have the various passes communicate via in-memory buffers, a little at a time, rather than through the file system. The classes java.io.PipedInputStream and java.io.PipedOutputStream can help you implement this in a multi-threaded way. (This is related to Unix' "|" pipe and filter mechanism.)
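A minimal sketch of that arrangement, assuming a trivial pair of "passes" (produce text, then transform it): the producer runs in its own thread and the consumer reads a little at a time through an in-memory pipe rather than an intermediate file.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

public class PipedPasses {
    // The first "pass" writes bytes while the second reads them as they
    // arrive, communicating through memory instead of the file system.
    public static String shout(String text) {
        try {
            PipedOutputStream out = new PipedOutputStream();
            PipedInputStream in = new PipedInputStream(out);

            Thread producer = new Thread(() -> {
                try {
                    out.write(text.getBytes("UTF-8"));
                    out.close();          // closing signals end-of-stream to the reader
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
            producer.start();

            StringBuilder result = new StringBuilder();
            int b;
            while ((b = in.read()) != -1) {   // consumes a little at a time
                result.append(Character.toUpperCase((char) b));
            }
            producer.join();
            return result.toString();
        } catch (IOException | InterruptedException e) {
            throw new RuntimeException(e);
        }
    }
}
```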
Introduce Helper Method for Small Cases
Some methods are better for large cases than small ones. (You often see this for recursive solutions.) The method can divide-and-conquer until it hits the small case, then call a special method tuned for that. (Bentley uses a good example of this in discussing sorting: a quicksort may be the chosen generic sorting method, but it has a lot of overhead in sorting only a few elements.)
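A sketch in the spirit of Bentley's sorting example (the cutoff value and names are tuning assumptions, not his figures): quicksort divides until the range is small, then hands off to an insertion sort tuned for few elements.

```java
public class HybridSort {
    private static final int SMALL = 8;   // cutoff is an assumed tuning value

    public static void sort(int[] a) {
        quicksort(a, 0, a.length - 1);
    }

    private static void quicksort(int[] a, int lo, int hi) {
        if (hi - lo < SMALL) {
            insertionSort(a, lo, hi);     // helper tuned for the small case
            return;
        }
        int pivot = a[(lo + hi) / 2];
        int i = lo, j = hi;
        while (i <= j) {
            while (a[i] < pivot) i++;
            while (a[j] > pivot) j--;
            if (i <= j) {
                int t = a[i]; a[i] = a[j]; a[j] = t;
                i++; j--;
            }
        }
        quicksort(a, lo, j);
        quicksort(a, i, hi);
    }

    // Insertion sort: minimal overhead when only a few elements remain.
    private static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= lo && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }
}
```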
If the last thing a routine does is recursively call itself, you can instead adjust the arguments and branch back to the top, saving the overhead of stacking contexts and making another procedure call.
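A sketch of the transformation, using Euclid's gcd as an illustrative tail-recursive routine: the self-call becomes an adjustment of the arguments followed by another trip around a loop.

```java
public class TailCall {
    // Before: the recursive call is the last thing the routine does.
    public static int gcdRecursive(int a, int b) {
        if (b == 0) return a;
        return gcdRecursive(b, a % b);    // tail call
    }

    // After: adjust the arguments and branch back to the top; no new
    // stack frame is created per step.
    public static int gcdLoop(int a, int b) {
        while (b != 0) {
            int r = a % b;
            a = b;
            b = r;
        }
        return a;
    }
}
```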
Introduce an Explicit Stack
Recursion works by having a hidden call stack. A routine can simulate this by maintaining its own stack. When a recursive call would be made, push the current state onto the stack, adjust the arguments, and branch to the top. This is a pretty dramatic change in a procedure, and can definitely reduce the clarity of the routine. (You may be able to use it backwards, converting a stack-based routine into a recursive routine - the recursive routine will usually be much shorter and much easier to understand.)
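A sketch of the idea with a binary-tree sum as the illustrative routine: the iterative version maintains its own stack of pending subtrees instead of letting the call stack do it. Note how much shorter the recursive form is.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ExplicitStack {
    public static class Node {
        int value;
        Node left, right;
        public Node(int value, Node left, Node right) {
            this.value = value; this.left = left; this.right = right;
        }
    }

    // Recursive version: the hidden call stack remembers where to resume.
    public static int sumRecursive(Node n) {
        if (n == null) return 0;
        return n.value + sumRecursive(n.left) + sumRecursive(n.right);
    }

    // Explicit-stack version: we push the pending subtrees ourselves.
    public static int sumIterative(Node root) {
        int total = 0;
        Deque<Node> stack = new ArrayDeque<>();
        if (root != null) stack.push(root);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            total += n.value;
            if (n.left != null) stack.push(n.left);
            if (n.right != null) stack.push(n.right);
        }
        return total;
    }
}
```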
Suppose the example code is in a loop where i runs from 0 to a large value. In the original, odd numbers are tested last even though they make up half the numbers. In the revised version, the tests are ordered so that the most common values are tested first.
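Since the original example isn't reproduced here, a hypothetical classifier illustrates the reordering: in the "before" form, odd numbers fall through to the last test even though they're half the inputs; in the "after" form the most common case is tested first.

```java
public class TestOrder {
    // Before: odd numbers (half of all values of i) reach the final else.
    public static String classifyBefore(int i) {
        if (i % 4 == 0) return "multiple of 4";
        else if (i % 2 == 0) return "even";
        else return "odd";
    }

    // After: the most common case is tested first.
    public static String classifyAfter(int i) {
        if (i % 2 != 0) return "odd";
        else if (i % 4 != 0) return "even";
        else return "multiple of 4";
    }
}
```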
Be very careful when twisting around the if-clauses and revising the conditions.
In this example, we've changed a series of flags to use the bits in a word, so we can use word-level operations (including equality, "and", "or", etc.). The principle goes further, though. You might be able to use multiple threads across CPUs; you may be able to arrange things so reads and writes of values can be overlapped; you may be able to overlap processing and I/O; and so on.
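A sketch of the flags-in-a-word idea (the flag names are illustrative): each condition occupies one bit, and word-level "and"/"or"/equality act on all the flags at once.

```java
public class Flags {
    // Each condition gets one bit in the word.
    public static final int BOLD      = 1 << 0;
    public static final int ITALIC    = 1 << 1;
    public static final int UNDERLINE = 1 << 2;

    public static int set(int flags, int flag)       { return flags | flag; }
    public static int clear(int flags, int flag)     { return flags & ~flag; }
    public static boolean isSet(int flags, int flag) { return (flags & flag) != 0; }

    // One word-level "and" checks every flag at once.
    public static boolean anyShared(int a, int b)    { return (a & b) != 0; }
}
```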
Trade Off Time versus Space
Bentley has a series of rules that discuss how to spend space to save time, and vice versa. His rules cover the gamut - from pre-computing all results to introducing interpreters.
We've discussed a number of refactorings, based on Bentley's book Writing Efficient Programs. He discussed them in the context of performance tuning, but they are applicable outside that domain. We've classified them like this:
I encourage you to use these refactorings when they improve your code. I also urge you to pick up and read a copy of Jon Bentley's Writing Efficient Programs, both for a better explanation of these rules, and a glimpse of how performance tuning can resemble refactoring for design.
[Written 2-11-2000; added Auer/Beck and Smith references, 2-19-2000; corrected "more false" problem caught by Kieran Barry 12-22-2001.]
Copyright 1994-2010, William C. Wake - William.Wake@acm.org