What if every object was an array? No more NullPointerExceptions!

Posted on August 12, 2013 by lukaseder

To NULL or not to NULL? Programming language designers inevitably have to decide whether they support NULLs or not. And they’ve proven to have a hard time getting this right. NULL is not intuitive in any language, because NULL is an axiom of that language, not a rule that can be derived from lower-level axioms. Take Java for instance, where


// This yields true:
null == null

// These throw an exception (or cannot be compiled)
null.toString();
int value = (Integer) null;

It’s not like there weren’t any alternatives. SQL, for instance, implements a more expressive but probably less intuitive three-valued logic, which most developers get wrong in subtle ways once in a while. At the same time, SQL doesn’t know “NULL” results, only “NULL” column values. From a set theory perspective, there are only empty sets, not NULL sets. Other languages allow for dereferencing null through special operators, letting the compiler generate tedious null checks for you behind the scenes. An example of this is Groovy with its null-safe dereferencing operator. This solution is far from being generally accepted, as can be seen in this discussion about a Scala equivalent. Scala uses Option, which Java 8 will imitate using Optional (or @Nullable).
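For comparison, here is a rough sketch of what the Optional approach looks like in Java 8. The findLastName() lookup and its data are made up for illustration; only java.util.Optional itself is real API.

```java
import java.util.Optional;

public class OptionalSketch {

    // Hypothetical lookup that may find nothing (name and data are illustrative only)
    static Optional<String> findLastName(int customerId) {
        return customerId == 1 ? Optional.of("Smith") : Optional.empty();
    }

    static int lastNameLength(int customerId) {
        // map() is skipped on an empty Optional, so no explicit null check is needed
        return findLastName(customerId).map(String::length).orElse(0);
    }

    public static void main(String[] args) {
        System.out.println(lastNameLength(1)); // 5
        System.out.println(lastNameLength(2)); // 0
    }
}
```

Note how the wrapper shows up in the signature: callers have to deal with Optional<String> rather than String.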

Let’s think about a much broader solution

To me, nullability isn’t a first-class citizen. I personally dislike the fact that Scala’s Option[T] type pollutes my type system by introducing a generic wrapper type (even if it seems to implement similar array-like features through the traversable trait). I don’t want to distinguish the types of Option[T] and T. This is specifically true when reasoning about types from a reflection API perspective, where Scala’s (and Java’s) legacy will forever keep me from accessing the type of T at runtime. But much worse, most of the time, in my application I don’t really want to distinguish between “option” references and “some” references. Heck, I don’t even want to distinguish between having 1 reference and having dozens. jQuery got this quite right. One of the main reasons why jQuery is so popular is that everything you do, you do on a set of wrapped DOM elements. The API never distinguishes between matching 1 or 100 divs. Check out the following code:


// This clearly operates on a single object or none
$('div#unique-id').html('new content')
                  .click(function() { ... });

// This possibly operates on several objects or none
$('div.any-class').html('new content')
                  .click(function() { ... });

This is possible because jQuery wraps every match in an array-like object and puts its methods on that object’s prototype (jQuery.fn), rather than handing you raw DOM elements. How much more awesome can it get? .html() and .click() are actions performed on the array as a whole, no matter if you have zero, one, or 100 elements in your match. What would a more typesafe language look like, where everything behaves like an array (or an ArrayList)? Think about the following model:


class Customer {
  String firstNames;  // Read as String[] firstNames
  String lastName;    // Read as String[] lastName
  Order orders;       // Read as Order[] orders
}

class Order {
  int value;          // Read as int[] value
  boolean shipped() { // Read as boolean[] shipped
  }
}

Don’t rant (just yet). Let’s assume this wouldn’t lead to memory or computation overhead. Let’s continue thinking about the advantages of this. So, I want to see if a Customer’s orders have been shipped. Easy:


Customer customer = // ...
boolean shipped = customer.orders.shipped();

This doesn’t look spectacular (yet). But be aware of the fact that a customer can have several orders, and the above check is really to see if all orders have been shipped. I really don’t want to write the loop; I find it quite obvious that I want to perform the shipped() check on every order. Consider:


// The length pseudo-field would still be
// present on orders
customer.orders.length;

// In fact, the length pseudo-field is also
// present on customer, in case there are several
customer.length;

// Let's add an order to the customer:
customer.orders.add(new Order());

// Let's reset the orders
customer.orders.clear();

// Let's calculate the sum of all values
// OO-style:
customer.orders.value.sum();
// Functional style:
sum(customer.orders.value);

Of course there would be a couple of caveats, and the above choice of method names might not be the best one. But being able to deal with single references (nullable or non-nullable) or array references (empty, single-valued, multi-valued) in the same syntactic way is just pure syntax awesomeness. Null-checks would be replaced by length checks, but mostly you don’t even have to do those, because each method would always be called on every element in the array. The current single-reference vs. multi-reference semantics would be documented by naming conventions. Clearly, naming something “orders” indicates that multi-references are possible, whereas naming something “customer” indicates that multi-references are improbable. As users have commented, this technique is commonly referred to as array programming, which languages such as Matlab and R implement.
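To make the idea a bit more tangible, here is a minimal sketch of how it could be emulated in today’s Java with a library type. Everything here is hypothetical: the Many<T> wrapper and its methods (of, length, flatMap, all, sum) are names I’m making up for illustration, and the real proposal would need language support to hide the wrapper entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.ToIntFunction;

public class ArrayModelSketch {

    // Hypothetical wrapper: holds zero, one, or many values behind one API
    static final class Many<T> {
        private final List<T> items;

        private Many(List<T> items) { this.items = items; }

        @SafeVarargs
        static <T> Many<T> of(T... values) {
            return new Many<>(new ArrayList<>(Arrays.asList(values)));
        }

        // Replaces the null check: 0 means "nothing there"
        int length() { return items.size(); }

        // Follows a multi-valued field on every element and flattens the result
        <R> Many<R> flatMap(Function<? super T, Many<R>> f) {
            List<R> out = new ArrayList<>();
            for (T item : items) out.addAll(f.apply(item).items);
            return new Many<>(out);
        }

        // True if the predicate holds for every element (also true when empty)
        boolean all(Predicate<? super T> p) {
            for (T item : items) if (!p.test(item)) return false;
            return true;
        }

        // Adds up an int property over all elements; 0 when empty
        int sum(ToIntFunction<? super T> f) {
            int total = 0;
            for (T item : items) total += f.applyAsInt(item);
            return total;
        }
    }

    static final class Order {
        final int value;
        final boolean shipped;
        Order(int value, boolean shipped) { this.value = value; this.shipped = shipped; }
    }

    static final class Customer {
        final Many<Order> orders;
        Customer(Many<Order> orders) { this.orders = orders; }
    }

    public static void main(String[] args) {
        Many<Customer> customers = Many.of(
            new Customer(Many.of(new Order(10, true), new Order(20, false))),
            new Customer(Many.<Order>of()) // a customer without orders, no null involved
        );

        Many<Order> orders = customers.flatMap(c -> c.orders);
        System.out.println(orders.length());            // 2
        System.out.println(orders.all(o -> o.shipped)); // false
        System.out.println(orders.sum(o -> o.value));   // 30
    }
}
```

In this sketch I chose “all” semantics for the shipped check, which is only one possible reading; an “any” variant would be just as easy to add. The call site never loops and never checks for null, which is the point, but the wrapper type is still visible in every signature.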

Convinced?

I’m curious to hear your thoughts!

Published by lukaseder

I made jOOQ

23 thoughts on “What if every object was an array? No more NullPointerExceptions!”

Alexey Romanov (@alexey_r) says:

August 12, 2013 at 15:38

There are languages which work like this: http://en.wikipedia.org/wiki/Array_programming Probably the most widely used ones are Matlab and R. Though I don’t know any statically typed ones…

1. lukaseder says:
  
  August 12, 2013 at 15:53
  
  I wasn’t aware of that, good to know!
  
Denis says:

August 12, 2013 at 15:39

As far as I know, Scala’s approach is really close to “array objects”.
But instead of arrays, Scala uses the Traversable trait.
From this point of view, there is no difference between Option and List (or Array). In fact, Option is a collection in Scala.

1. lukaseder says:
  
  August 12, 2013 at 15:53
  
  Today, I learned! Nice!
  
  1. Nicholas says:
    
    August 12, 2013 at 16:39
    
    In fact, both Option and List (Array) are monads; that’s why they are so close.
    Both have map and flatMap methods, and you can quickly go from List[Option[T]] to List[T] (of course, you can do the same with List[List[T]] -> List[T]).
    But for me, treating everything as an Array is the same oversimplification as treating nullable and non-null values equally. In both cases you lose some information from your relational model (and yeah, I’d prefer not to deal with NULLs at all, and I know, Chris Date too :) ). And that’s why Scala’s Option is the best thing that happened to NULLs. You are forced to take care of possible nullability, and you are flexible in processing results.
    About low-level access, reflection and so on: if we speak about Scala, there is a reflection API now (since Scala 2.10) where you can access all you want. But you don’t need it too much if you’re writing in Scala :)
    
arnaud says:

August 12, 2013 at 18:07

I like Kotlin’s approach with “?”.
val maybeStr: String? = "….."
Null pointer handling is so important that it should be very “compact” and part of the language.

1. lukaseder says:
  
  August 13, 2013 at 08:43
  
  Just like Ceylon…
  
  While I can see that this is something new and innovative, I still think it deals with NULLs as first-class citizens in a language. But among all options, this one seems not so bad, as the type information is mainly used by the compiler, which hopefully won’t pollute my type hierarchy.
  
Marcin says:

August 14, 2013 at 12:50

// OO-style:
customer.orders.value.sum();

This is not OO-style, it is procedural style.

1. lukaseder says:
  
  August 14, 2013 at 13:44
  
  What would be OO-style, then?
  
  1. Marcin says:
    
    August 15, 2013 at 13:10
    
    customer.ordersValueSum();
    
    This does not break encapsulation.
    
arnaud says:

August 15, 2013 at 14:31

customer.ordersValueSum() doesn’t follow the High Cohesion OO pattern.
Sum/Max/Min/Mean, … should not be encapsulated in the Customer object (except maybe for performance reasons, if optimisation is possible).
Instead, if the problem is the “external” iterator, the visitor pattern could be used: customer.visitOrders(sumVisitor).

1. lukaseder says:
  
  August 15, 2013 at 14:47
  
  Just when I was going to say that customer.ordersValueSum() may break “single responsibility”, you unleash the visitor pattern on such a trivial problem. ;-)
  
  1. arnaud says:
    
    August 15, 2013 at 14:59
    
    :)
    Visitors will be cool again with Java 8 lambda!
    
    1. lukaseder says:
      
      August 15, 2013 at 15:03
      
      At the use-site, maybe. At the design-site, impossible. The visitor pattern was brought to this planet by the architecture astronaut:
      https://blog.jooq.org/2012/04/10/the-visitor-pattern-re-visited.
      
      But that’s just my opinion ;-)
      
      1. arnaud says:
        
        August 15, 2013 at 15:21
        
        arr.forEach(function(elt) {…});
        
        architecture astronaut really?
        
        
        lukaseder says:
        
        August 15, 2013 at 15:23
        
        ~~Ah, that’s where the misunderstanding originates. That’s just a “lambda”, not really a “visitor” as in the OO-term “visitor pattern”. Lambdas will be awesome, no doubt.~~
        
        Confusing… Yes, I understand that lambdas are awesome at the call-site. And they may even be used with the visitor pattern. But the visitor pattern itself makes a type-hierarchy awfully moronic, in my opinion. Besides, a typical visitor will not be a SAM interface. A visitor will have dozens of “accept” methods, so there’s no lambda equivalent to express a visitor (usually).
        
arnaud says:

August 16, 2013 at 22:56

Ok, let’s call that “Internal iterator” instead of (SAM) Visitor.
The important idea is that the call site does not have to manage the iteration; it doesn’t see a collection.

1. lukaseder says:
  
  August 17, 2013 at 01:29
  
  Yeah, I agree that such a functional solution would make issues more elegant. Thus,
  
  customers.orders.reduce(0, (a, b) -> a + b)
  
  As suggested by the latest state-of-the-lambda
  
Jasd says:

August 19, 2013 at 21:06

I think it is useful to have NullPointerExceptions because of the “fail-fast” philosophy. When you misspell the jQuery selector, there is no hint of that mistake, and it will be harder to find.
Furthermore, there are many interpretations of what
boolean shipped = customer.orders.shipped();
may mean. Is it true if any order of any customer is shipped? Or if all orders are shipped for a single customer? Or if all orders of all customers are shipped? etc.

1. arnaud says:
  
  August 21, 2013 at 10:55
  
  I fully agree with the jQuery fail-fast problem. :)
  
  Fluent assertions can be used to mitigate that:
  https://stackoverflow.com/questions/498469/jquery-assertion-support-defensive-programming
  
ll says:

August 26, 2013 at 17:00

XQuery works like this, and it can be statically typed.
