Composition vs Inheritance

The internet is full of discussions on this point. I first came across it in Effective Java, and while I think suggested good practices should always be taken with a grain of salt (they aren’t rules, after all), this one has stuck with me because it comes up again and again (at least in Java).

Here’s the short description of what the book suggests: Favor composition over inheritance.

Composition

Composition is simple: it’s just a member variable. For example, if your class needs a hash table, then one of its member variables is a HashMap (or something similar).
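To make that concrete, here is a minimal sketch of a class that needs a hash table and gets one through composition. The WordCounter class and its methods are names I made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class: counts how often each word has been added.
class WordCounter {
    // Composition: the hash table is a private member, not a super-class.
    private final Map<String, Integer> counts = new HashMap<>();

    void add(String word) {
        // Insert 1 for a new word, otherwise add 1 to the existing count.
        counts.merge(word, 1, Integer::sum);
    }

    int countOf(String word) {
        return counts.getOrDefault(word, 0);
    }
}
```

Callers only see add and countOf. The HashMap stays an implementation detail that could be swapped for something else later without touching them.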

Inheritance

Inheritance is a bit more complicated, but it’s a fundamental part of Object Oriented Programming. As the name suggests, it represents a relation between classes where one fully takes on the traits of another (sub-class and super-class respectively). This is commonly used when you want to define an interface and multiple different implementations of that interface. For example, there may be different implementations of a List, like a list backed by an array (ArrayList) or a list backed by a linked list (LinkedList), etc. Each of the implementations fully takes on all traits of the List, so they all support methods like add, remove and size, but they all implement them differently.
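A small sketch of the List example: the method below is written against the List interface only, so it works the same way whichever implementation it is handed. ListDemo and fillAndTrim are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ListDemo {
    // Uses only methods that every List supports; the caller picks the implementation.
    static int fillAndTrim(List<String> list) {
        list.add("a");
        list.add("b");
        list.remove("a");   // removes the first occurrence of "a"
        return list.size();
    }

    static void demo() {
        // Same calls, different backing data structures.
        System.out.println(fillAndTrim(new ArrayList<>()));
        System.out.println(fillAndTrim(new LinkedList<>()));
    }
}
```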

Composition vs Inheritance

Now what’s all the fuss about? As I read it, the author of Effective Java thinks inheritance is too easily abused. The concept of inheritance is supposed to be used when one class fully takes on the traits of its super-class. Like the example above, an ArrayList is supposed to fully take on the traits of a List. The ArrayList class only makes sense given the concept of a List. Or in other words, the ArrayList is a specific version of a List.

I think the problem comes when inheritance is used only to cut down on duplicated code. While good practice includes cutting down on duplicated code (or in general, decreasing the verbosity of code), inheritance should not be used as merely a tool for achieving that. Inheritance has a specific purpose and deviating from that purpose results in poorly structured code.

A good example I came across today: in our codebase there is a class that executes some code as a Spark job. Part of running that code is of course handling the IO into and out of the process. Unfortunately, inheritance was chosen to structure that IO code. There was a helper called IOReaderWriter that exposes some methods to read/write IO. My guess is that the executor class was made to extend IOReaderWriter in order to cut down on potential duplicated code, so that methods inside the executor class could call methods like read() directly on the super-class. This is an example of abusing the inheritance model. Executing a Spark job should not need to fully take on the traits of an IO helper class. The executor class might need to use those IO methods, but it is incorrect to say that the Spark executor class is a specific implementation or version of an `IOReaderWriter`.
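I can’t share the real code, but the shape of the problem looks roughly like this sketch. Everything here is a stand-in: IOReaderWriter is modelled as a plain class with made-up method bodies, InheritingExecutor is a hypothetical name, and there is no actual Spark code involved.

```java
// Hypothetical stand-in for the IO helper; the real one is not shown in this post.
class IOReaderWriter {
    String read() {
        return "data from storage";
    }

    void write(String output) {
        System.out.println("writing: " + output);
    }
}

// Inheritance used purely for convenience: the executor now "is" an IO helper.
class InheritingExecutor extends IOReaderWriter {
    String run() {
        String input = read();                 // inherited call, no field needed
        String result = "processed:" + input;  // placeholder for the real job logic
        write(result);                         // inherited call
        return result;
    }
}
```

Note that anything accepting an IOReaderWriter will now also accept an InheritingExecutor, which is exactly the relationship that doesn’t make sense here.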

Why does this matter? A simple effect shows up when you want to change how the executor reads/writes its IO. Maybe the storage layer has changed, or maybe you want to be able to intercept the IO for testing, etc. In any case you now have the problem that the executor class fully takes on the traits of the IOReaderWriter. In order to change the IO layer, you have to override the inherited methods in the executor class, or write some complicated if-else logic. In either case, it would have been a lot easier if composition had been used instead of inheritance. Instead of inheriting all traits of the IO class, the executor class only needs the IO helper as a member variable and uses its methods as needed. There might be slightly more code, since instead of being able to call read() directly, for example, the executor class would include something like ioReaderWriter.read().
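Here is a rough sketch of what the composition version could look like. As before, this is not our real code: I’m assuming IOReaderWriter can be expressed as an interface with read() and write(), and ComposedExecutor and InMemoryIO are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Assumed shape of the IO helper, modelled here as an interface.
interface IOReaderWriter {
    String read();
    void write(String output);
}

// Composition: the executor holds an IO helper instead of being one.
class ComposedExecutor {
    private final IOReaderWriter ioReaderWriter;

    ComposedExecutor(IOReaderWriter ioReaderWriter) {
        this.ioReaderWriter = ioReaderWriter;
    }

    void run() {
        String input = ioReaderWriter.read();
        ioReaderWriter.write("processed:" + input);  // placeholder for the real job logic
    }
}

// Test double that intercepts the IO without touching the executor.
class InMemoryIO implements IOReaderWriter {
    final List<String> written = new ArrayList<>();

    @Override
    public String read() {
        return "test-input";
    }

    @Override
    public void write(String output) {
        written.add(output);
    }
}
```

Changing the storage layer or intercepting IO in a test now means passing a different IOReaderWriter to the constructor, for example new ComposedExecutor(new InMemoryIO()). The executor class itself does not change.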

As a final thought, after searching online for this idea, it seems the buzz-phrase on this topic is: only use inheritance if there exists an “is-a” relationship. For example, an ArrayList is a List. But a Spark executor is not an IO reader/writer.