A few weeks ago there was a discussion about null references on an IRC channel I hang out in. That was at the time of QCon London, where Sir Tony Hoare gave a keynote presentation calling null references his billion-dollar mistake. We talked about null for about an hour before we all gave up, realizing that the discussion was never going to get anywhere.
I like IRC, but for long discussions, especially those involving many parties, it’s not the best medium out there. This blog post is an attempt to concisely explain the problem with null.
First, what’s null? In many languages, null is the way to convey the idea of “nothingness”. The concept of nothingness is an important one in computer science, and hardly anyone is suggesting that we do without it. This is not what people are arguing about.
The debate is over whether nothingness should be explicit or implicit.
In a language like Java, nothingness is implicit. A method that expects a String parameter can also be passed null. A method that expects an array of doubles can also be passed null. We can think of null as a member of every reference type in Java. However, its semantics are different from those of every other value of that type: calling a method or accessing a field on null results in a NullPointerException. This means that special care has to be taken when dealing with null. And because the compiler can’t know whether we are calling .length() on null or on a valid string, it cannot prevent us from doing bad things. The responsibility for writing safe code rests entirely on the programmer’s shoulders.
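To make this concrete, here is a minimal sketch. The class and method names (NullDemo, unsafeLength, safeLength) are made up for illustration. Both methods declare a String parameter, and the compiler happily accepts null for either one.

```java
// Hypothetical demo class; the names are illustrative only.
class NullDemo {
    // The signature promises a String, but the compiler also accepts null.
    static int unsafeLength(String s) {
        return s.length(); // throws NullPointerException when s is null
    }

    // The programmer has to remember the check; nothing enforces it.
    static int safeLength(String s) {
        return s == null ? 0 : s.length();
    }

    public static void main(String[] args) {
        System.out.println(unsafeLength("hello")); // prints 5
        System.out.println(safeLength(null));      // prints 0
        try {
            unsafeLength(null); // compiles fine, fails at run time
        } catch (NullPointerException e) {
            System.out.println("NullPointerException");
        }
    }
}
```

Nothing in the two signatures distinguishes the method that handles null from the one that doesn't; you have to read the body to find out.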
On the other hand, in a language like Haskell, nothingness is explicit. Haskell has a type called Maybe a (this could be written Maybe<T> in Java) whose values come in exactly two forms: Nothing, which is the value you use to describe the absence of a value, and Just x, which we can think of as a box containing the actual value. (The value needs to be “pulled out” of the box to be used normally.) The type Maybe String (a concrete instance of Maybe a) cannot be used where a String is expected, and vice versa. You cannot pass Nothing to a function that expects a String; the type checker will not allow it. This eliminates the NullPointerException problem. Furthermore, because Haskell knows that a value of type Maybe a is either Nothing or Just x, the compiler can warn you when it detects that you forgot to handle one of the two cases in a function. Haskell can actually help us avoid making errors.
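For readers more at home in Java, here is a rough sketch of what such a Maybe<T> could look like. It is a hand-rolled, hypothetical class (Maybe, fold and MaybeDemo are names I made up), not a standard library type, and it only approximates what Haskell gives you: a Maybe reference could itself still be null in Java, and instead of a compiler warning about a forgotten case, the fold method simply refuses to hand over the value unless you supply code for both cases.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical Java rendering of Haskell's Maybe; names are illustrative.
abstract class Maybe<T> {
    // Private constructor: only the nested Nothing and Just can extend Maybe.
    private Maybe() {}

    // The only way to get at the value: the caller must handle both cases.
    abstract <R> R fold(Supplier<R> ifNothing, Function<T, R> ifJust);

    static <T> Maybe<T> nothing() { return new Nothing<>(); }
    static <T> Maybe<T> just(T value) { return new Just<>(value); }

    // Represents the absence of a value.
    private static final class Nothing<T> extends Maybe<T> {
        <R> R fold(Supplier<R> ifNothing, Function<T, R> ifJust) {
            return ifNothing.get();
        }
    }

    // The "box" containing the actual value.
    private static final class Just<T> extends Maybe<T> {
        private final T value;
        Just(T value) { this.value = value; }
        <R> R fold(Supplier<R> ifNothing, Function<T, R> ifJust) {
            return ifJust.apply(value);
        }
    }
}

class MaybeDemo {
    static int lengthOrZero(Maybe<String> m) {
        // m.length() would not compile: Maybe<String> is not a String.
        return m.fold(() -> 0, String::length);
    }

    public static void main(String[] args) {
        System.out.println(lengthOrZero(Maybe.just("hello")));      // prints 5
        System.out.println(lengthOrZero(Maybe.<String>nothing()));  // prints 0
    }
}
```

The point of the sketch is the signature of lengthOrZero: it says out loud that the string may be missing, and the body cannot reach the string without saying what to do when it is.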