In the simplest case, an object has no knowledge of another object. No
references? No problem. The syntax for this situation is both simple and
well-known.
1
There comes a day when an object learns about another object. In Java, this
is done via a reference:
public class ClassA {
ClassB member;
}
Even a simple reference involves a number of decisions:
- Protection. The general rule is to use the least public
protection required (from private, package, protected, and
public).
- Static or member. If there's a single reference shared by all
members of that class, it should be declared static. If each
instance of the class needs its own reference, it's not static.
- Initialization. You can initialize a value in the declaration, in
the constructor, or via a member function. You want to maintain the validity
of the reference. By setting it in the declaration or constructor, you can
ensure that it moves from valid reference to valid reference (e.g., no
chance of NullPointerException). You may pay a price in time or
memory, though. If you don't set the value until some member function is
called, you risk that a sequence of calls (in an unexpected order) can
retrieve a null value.
- Update. Some references are intended to be set once and treated
as read-only after that. (The object referred to may change, but the
reference itself will not.) These are usually best dealt with using
declaration or constructor initialization. Other references are meant to
change, and you'll provide some methods that have that effect. Note that it
is harder to think about values that change than those that don't.
2 or a Few
If you have a reference to a different class, you normally create a
separate member variable. If the second reference is of the same type as the
first, you have to decide whether or not to make it separate. A second
reference may presage a third, fourth, etc. - see the next section ("Many").
If you can only have 2 or 3 of something, just declare the new one(s) as a
new field. For example, if you're dealing with a transfer between two
accounts, there's one account being credited and one being debited. We see no
reason this will evolve to n-way transfers, so two fields will suffice. You
might make the same decision for a 2-d coordinate (x and y).
Many
Once you grow to two or more of something, you should consider allowing for
an arbitrary number of them. Java has three alternatives:
- Array. You can create an array of the type being referenced:
ObjectX[] array = new ObjectX [count];
(Recall that in Java the array and its contents are initialized
separately, so this statement created an array of null references.) The nice
part about using an array is that it contains objects of a known type. The
downside is the requirement that you know the size of the array to allocate
it, and it's difficult to change its size.
- Collection. "Collection" is meant to cover all Java's generic
collection objects: Vector, List, etc. In Java, containers are based on
storing and fetching an Object. The user of the collection must know to
which type to cast the result. (If Java had some sort of template or generic
type facility, this cast would be unnecessary.) The benefit of using
collections is that they provide flexible, variable-size containers, with a
variety of performance characteristics. The downside is their requirement
for casts.
- Specialized collection object. Some collections need to maintain
specialized information. In these cases, you might introduce a special type
for the collection. For example, suppose you're creating a checkbook program
that needs to track a number of transactions. You could maintain an array or
Vector of these, but a set of transactions really represent a whole account.
Since Accounts have other significant information attached to them (account
number, total, etc.), we might have a new class Account with a list of
Transactions.
Account <>------* Transaction
Now, other objects can maintain a reference to Account instead of to a
bunch of Transactions. Account can use whatever collection method is most
appropriate. (If collections are used, Account can hide the necessary casts
- an example of Fowler's refactoring "Encapsulate Downcast".)
The good side of a specialized object is that it creates a meaningful new
object where we can attach behavior. It's more work than creating a generic
collection, though. (See also M. Fowler's refactoring "Encapsulate
Collection".)
Composite
A list of items may suffice for a long time, but you may eventually need a
more complex structure.
Suppose you have a set of products, and someone chooses at most one of
each, maintained in a list of current orders. One day, the marketing
department gets the idea of having "bundles", where a bundle is a list of
existing products. You can try to maintain the existing list structure, adding
the products in the bundle into the "current order" list. The problem comes
when a user wants to remove a bundle: you find you've lost track of what was
in the bundle.
The crucial step to solving this is to say, "What if a bundle were another
product?" Then we have products containing products. This sort of recursion
screams out "Composite".
You might think you don't need the full complexity of composite - it allows
bundles that contain bundles, and who would need that? That may or may not be
true. It's certainly possible to conceive of a system that allows only
products or bundles containing products (but not bundles containing bundles).
This situation is much like the 0-1-infinity rule that discourages using 2
or 3 explicit members: if you have two levels of products, odds are
good that you'll have more. (You'll probably find it less complex to allow the
recursion).
How to Move from Custom Collection to Composite
- Move your design to a custom collection type:
List <>-----* Item
Make sure the List has methods for add(Item), remove(Item),
and getChild(int). (Test.)
- Create a new parent class "Parent", with a protected constructor, that
has the add(Item), remove(Item), and getChild(int)
methods. (Test.) [The names of your classes should be
appropriate for your domain; in the example, we might use "Product",
"Bundle", and "Item".]
- Make Parent the superclass of Item. (Test.)
- Modify List's and Parent's methods to take and store instances of Parent
(instead of Item). You may need to declare and stub out any methods from
Item to Parent. The compiler will tell you which ones they are. (Test.)
- Make Parent the superclass of List. (Test.)
- Move the signature of any (remaining) operation common to both List and
Item up into the parent. Provide a default implementation if possible.
(Test.)
- For each method, ensure that it's implementation is reasonable. You may
be able to make some of them abstract in the Parent. (Test.)
- Locate each reference to Item. See if it can make sense to operate on
Parent instead, and generalize the type if possible. (Test.)
- Locate each reference to List. See if it makes sense to operate on
Parent instead, and generalize the type if possible. (Test.)
And Back...
When do you move back from Composite to a collection?
- When your objects are no longer in a meaningful hierarchy. (You have
items and lists of items, but no lists of lists.)
- When there are no shared operations (left) between lists and items.
If you have the data hierarchy of a Composite, but no shared operations,
you're in a situation similar to a "data bag" or "struct class": you may want
to observe what clients do with the class - perhaps there's functionality that
belongs in the Composite rather than its client.
How to Move from Composite to Collection
- Modify the add(), remove(), and getChild()
methods of Parent and List to operate on Item rather than Parent. The
compiler will let you know if anybody is still trying to use "bundles of
bundles". (Test.)
- Make sure List has all the methods of Parent. (Move an implementation
down if necessary.) (Test.)
- Make List's superclass be the superclass of Parent. (Test.)
- Make sure Item has any methods it needs from Parent. (It shouldn't need
the list methods.) (Test.)
- Make Item's superclass be the superclass of Parent. (Test.)
- Nobody should be referencing the Parent class any more. Remove it.
(Test.)
- Review the operations shared by Item and List to see if any should be
removed. (Test.)
...To 0
The remaining changes are straightforward:
- Change a constant-size collection to use an array.
- Move from a small array to explicit members.
- Delete members until there's only one left.
- Delete the last reference, and you're back to "0".
Summary
We've shown how a series of fairly small steps can move you through the
path from 0 references to a Composite. This suggests that we are never
"painted into a corner" if we start simpler, even if our data will grow to the
complexity of the trees used in Composite.
We've also shown how to refactor the other way: from Composite down to
nothing. This assures us that we can simplify our application if the
requirements become simpler.
Resources