Java 8’s Method References Put Further Restrictions on Overloading

Method overloading has always been a topic with mixed feelings. We’ve blogged about it and the caveats that it introduces a couple of times.

There are two main reasons why overloading is useful:

  1. To allow for defaulted arguments
  2. To allow for disjunct argument type alternatives

Both reasons are motivated simply by providing convenience for API consumers. Good examples are easy to find in the JDK:

Defaulted arguments

public class Integer {
    public static int parseInt(String s) {
        return parseInt(s,10);
    }

    public static int parseInt(String s, int radix) { ... }
}

In the above example, the first parseInt() method is simply a convenience method for calling the second one with the most commonly used radix.

Disjunct argument type alternatives

Sometimes, similar behaviour can be achieved using different types of parameters, which mean similar things but which are not compatible in Java’s type system. For example when constructing a String:

public class String {
    public static String valueOf(char c) {
        char data[] = {c};
        return new String(data, true);
    }

    public static String valueOf(boolean b) {
        return b ? "true" : "false";
    }

    // and many more...
}

As you can see, the behaviour of the same method is optimised depending on the argument type. This does not affect the “feel” of the method when reading or writing source code as the semantics of the two valueOf() methods is the same.

Another use-case for this technique is when commonly used, similar but incompatible types need convenient conversion between each other. As an API designer, you don’t want to make your API consumer goof around with such tedious conversions. Instead, you offer:

public class IOUtils {
    public static void copy(InputStream input, OutputStream output);
    public static void copy(InputStream input, Writer output);
    public static void copy(InputStream input, Writer output, String encoding);
    public static void copy(InputStream input, Writer output, Charset encoding);
}

This is a nice example showing both defaulted parameters (optional encoding) as well as argument type alternatives (OutputStream vs. Writer, or String vs. Charset encoding representation).

Side-note

I suspect that the union type and defaulted argument ships have sailed for Java a long time ago – while union types might be implemented as syntax sugar, defaulted arguments would be a beast to introduce into the JVM as it would depend on the JVM’s missing support for named arguments.

As displayed by the Ceylon language, these two features cover about 99% of all method overloading use-cases, which is why Ceylon can do completely without overloading – on top of the JVM!

Overloading is dangerous and unnecessary

The above examples show that overloading is essentially just a means to help humans interact with an API. For the runtime, there is no such thing as overloading. There are only different, unique method signatures to which calls are linked “statically” in byte code (give or take more recent opcodes like invokedynamic). But the point is, there’s no difference for the computer if the above methods are all called copy(), or if they had been called unambiguously m1(), m2(), m3(), and m4().
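To make the point about unique signatures concrete, here is a minimal sketch that prints the JVM-level descriptors of the two Integer.parseInt() overloads shown earlier. The class OverloadDescriptors and its helper method are made up for this illustration; only the reflection and MethodType calls come from the JDK.

```java
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

// Hypothetical demo class, not part of any library
class OverloadDescriptors {

    // Returns the method name followed by its JVM descriptor
    static String descriptor(Class<?> owner, String name, Class<?>... params) {
        try {
            Method m = owner.getMethod(name, params);
            return m.getName() + MethodType
                .methodType(m.getReturnType(), m.getParameterTypes())
                .toMethodDescriptorString();
        }
        catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // prints parseInt(Ljava/lang/String;)I
        System.out.println(descriptor(Integer.class, "parseInt", String.class));

        // prints parseInt(Ljava/lang/String;I)I
        System.out.println(descriptor(Integer.class, "parseInt", String.class, int.class));
    }
}
```

The shared name is all the two methods have in common. As far as linking is concerned, they are as distinct as m1() and m2() would be.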

On the other hand, overloading is real in Java source code, and the compiler has to do a lot of work to find the most specific method, and otherwise apply the JLS’s complex overload resolution algorithm. Things get worse with each new Java language version. In Java 8, for instance, method references will add additional pain to API consumers, and require additional care from API designers. Consider the following example by Josh Bloch:

You can copy-paste the above code into Eclipse to verify the compilation error (note that not-up-to-date compilers may report type inference side-effects instead of the actual error). The compilation error reported by Eclipse for the following simplification:

static void pfc(List<Integer> x) {
    Stream<?> s = x.stream().map(Integer::toString);
}

… is

Ambiguous method reference: both toString() and 
toString(int) from the type Integer are eligible

Oops!

The above expression is ambiguous. It can mean either of the following two expressions:

// Instance method:
x.stream().map(i -> i.toString());

// Static method:
x.stream().map(i -> Integer.toString(i));

As can be seen, the ambiguity is immediately resolved by using lambda expressions rather than method references. Another way to resolve this ambiguity (towards the instance method) would be to use the super-type declaration of toString() instead, which is no longer ambiguous:

// Instance method:
x.stream().map(Object::toString);
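For reference, the three unambiguous alternatives can be put side by side in a small runnable sketch. The class and method names (MethodRefAlternatives and friends) are invented for this example, and collecting into a List is only there to make the results easy to compare.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class
class MethodRefAlternatives {

    // Lambda calling the instance method Integer.toString()
    static List<String> viaInstanceLambda(List<Integer> x) {
        return x.stream().map(i -> i.toString()).collect(Collectors.toList());
    }

    // Lambda calling the static method Integer.toString(int)
    static List<String> viaStaticLambda(List<Integer> x) {
        return x.stream().map(i -> Integer.toString(i)).collect(Collectors.toList());
    }

    // Method reference to the super-type declaration, which has no static overload
    static List<String> viaObjectReference(List<Integer> x) {
        return x.stream().map(Object::toString).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> x = Arrays.asList(1, 2, 3);
        System.out.println(viaInstanceLambda(x));
        System.out.println(viaStaticLambda(x));
        System.out.println(viaObjectReference(x));
    }
}
```

All three variants produce the same strings. The only thing that changes is how much the compiler has to guess.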

Conclusion

The conclusion here for API designers is very clear:

Method overloading has become an even more dangerous tool for API designers since Java 8

While the above isn’t really “severe”, API consumers will waste a lot of time overcoming this cognitive friction when their compilers reject seemingly correct code. One big takeaway from this example is to:

Never mix similar instance and static method overloads

And in fact, this amplifies when your static method overload overloads a name from java.lang.Object, as we’ve explained in a previous blog post.

There’s a simple reason for the above rule. Because there are only two valid reasons for overloading (defaulted parameters and incompatible parameter alternatives), there is no point in providing a static overload for a method in the same class. A much better design (as exhibited by the JDK) is to have “companion classes”, similar to Scala’s companion objects. For instance:

// Instance logic
public interface Collection<E> {}
public class Object {}

// Utilities
public class Collections {}
public final class Objects {}

By changing the namespace for methods, overloading has been circumvented somewhat elegantly, and the previous problems would not have appeared.
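A minimal sketch of the same idea for your own API could look like the following. Label, Labels, and CompanionDemo are hypothetical names chosen for this example. Because the instance method and the static utility live in different classes, both method references resolve without ambiguity.

```java
import java.util.function.Function;

// Instance logic (hypothetical type)
final class Label {
    private final String text;

    Label(String text) {
        this.text = text;
    }

    String render() {
        return "[" + text + "]";
    }
}

// Utilities live in a separate, hypothetical companion class
final class Labels {
    private Labels() {}

    // Null-tolerant convenience around Label.render()
    static String render(Label label) {
        return label == null ? "[]" : label.render();
    }
}

class CompanionDemo {

    static String viaInstance(Label label) {
        // Only the instance method render() exists in Label
        Function<Label, String> f = Label::render;
        return f.apply(label);
    }

    static String viaCompanion(Label label) {
        // Only the static method render(Label) exists in Labels
        Function<Label, String> f = Labels::render;
        return f.apply(label);
    }

    public static void main(String[] args) {
        System.out.println(viaInstance(new Label("a")));
        System.out.println(viaCompanion(null));
    }
}
```

Had render(Label) been declared as a static method inside Label itself, Label::render would run into the same kind of ambiguity as Integer::toString.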

TL;DR: Avoid overloading unless the added convenience really adds value!

You Will Regret Applying Overloading with Lambdas!

Writing good APIs is hard. Extremely hard. You have to think of an incredible amount of things if you want your users to love your API. You have to find the right balance between:

  1. Usefulness
  2. Usability
  3. Backward compatibility
  4. Forward compatibility

We’ve blogged about this topic before, in our article: How to Design a Good, Regular API. Today, we’re going to look into how…

Java 8 changes the rules

Yes!

Overloading is a nice tool to provide convenience in two dimensions:

  • By providing argument type alternatives
  • By providing argument default values

Examples for the above from the JDK include:

public class Arrays {

    // Argument type alternatives
    public static void sort(int[] a) { ... }
    public static void sort(long[] a) { ... }

    // Argument default values
    public static IntStream stream(int[] array) { ... }
    public static IntStream stream(int[] array, 
        int startInclusive, 
        int endExclusive) { ... }
}

The jOOQ API is obviously full of such convenience. As jOOQ is a DSL for SQL, we might even abuse a little bit:

public interface DSLContext {
    <T1> SelectSelectStep<Record1<T1>> 
        select(SelectField<T1> field1);

    <T1, T2> SelectSelectStep<Record2<T1, T2>> 
        select(SelectField<T1> field1, 
               SelectField<T2> field2);

    <T1, T2, T3> SelectSelectStep<Record3<T1, T2, T3>> 
        select(SelectField<T1> field1, 
               SelectField<T2> field2, 
               SelectField<T3> field3);

    <T1, T2, T3, T4> SelectSelectStep<Record4<T1, T2, T3, T4>> 
        select(SelectField<T1> field1, 
               SelectField<T2> field2, 
               SelectField<T3> field3, 
               SelectField<T4> field4);

    // and so on...
}

Languages like Ceylon take this idea of convenience one step further by claiming that the above is the only reasonable reason why overloading is used in Java. And thus, the creators of Ceylon have completely removed overloading from their language, replacing the above by union types and actual default values for arguments. E.g.

// Union types
void sort(int[]|long[] a) { ... }

// Default argument values
IntStream stream(int[] array,
    int startInclusive = 0,
    int endExclusive = array.length) { ... }

Read “Top 10 Ceylon Language Features I Wish We Had In Java” for more information about Ceylon.

In Java, unfortunately, we cannot use union types or argument default values. So we have to use overloading to provide our API consumers with convenience methods.

If your method argument is a functional interface, however, things changed drastically between Java 7 and Java 8, with respect to method overloading. An example is given here from JavaFX.

JavaFX’s “unfriendly” ObservableList

JavaFX enhances the JDK collection types by making them “observable”. Not to be confused with Observable, a dinosaur type from the JDK 1.0 and from pre-Swing days.

JavaFX’s own Observable essentially looks like this:

public interface Observable {
  void addListener(InvalidationListener listener);
  void removeListener(InvalidationListener listener);
}

And luckily, this InvalidationListener is a functional interface:

@FunctionalInterface
public interface InvalidationListener {
  void invalidated(Observable observable);
}

This is great, because we can do things like:

Observable awesome = 
    FXCollections.observableArrayList();
awesome.addListener(fantastic -> splendid.cheer());

(notice how I’ve replaced foo/bar/baz with more cheerful terms. We should all do that. Foo and bar are so 1970)

Unfortunately, things get more hairy when we do what we would probably do, instead. I.e. instead of declaring an Observable, we’d like that to be a much more useful ObservableList:

ObservableList<String> awesome = 
    FXCollections.observableArrayList();
awesome.addListener(fantastic -> splendid.cheer());

But now, we get a compilation error on the second line:

awesome.addListener(fantastic -> splendid.cheer());
//      ^^^^^^^^^^^ 
// The method addListener(ListChangeListener<? super String>) 
// is ambiguous for the type ObservableList<String>

Because, essentially…

public interface ObservableList<E> 
extends List<E>, Observable {
    void addListener(ListChangeListener<? super E> listener);
}

and…

@FunctionalInterface
public interface ListChangeListener<E> {
    void onChanged(Change<? extends E> c);
}

Now again, before Java 8, the two listener types were completely unambiguously distinguishable, and they still are. You can easily call them by passing a named type. Our original code would still work if we wrote:

ObservableList<String> awesome = 
    FXCollections.observableArrayList();
InvalidationListener hearYe = 
    fantastic -> splendid.cheer();
awesome.addListener(hearYe);

Or…

ObservableList<String> awesome = 
    FXCollections.observableArrayList();
awesome.addListener((InvalidationListener) 
    fantastic -> splendid.cheer());

Or even…

ObservableList<String> awesome = 
    FXCollections.observableArrayList();
awesome.addListener((Observable fantastic) -> 
    splendid.cheer());

All of these measures will remove ambiguity. But frankly, lambdas are only half as cool if you have to explicitly type the lambda, or the argument types. We have modern IDEs that can perform autocompletion and help infer types just as much as the compiler itself.
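If you don’t have JavaFX at hand, the same situation can be reproduced with a few hand-written types. The following is only a sketch: Source, Change, InvalidationHandler, ChangeHandler, and Registry are hypothetical stand-ins that mimic the shape of the JavaFX types, not the real API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins mimicking the shape of the JavaFX listener API
class OverloadedListeners {

    interface Source {}
    interface Change<E> {}

    @FunctionalInterface
    interface InvalidationHandler {
        void invalidated(Source source);
    }

    @FunctionalInterface
    interface ChangeHandler<E> {
        void onChanged(Change<? extends E> change);
    }

    static class Registry<E> {
        final List<String> log = new ArrayList<>();

        void addListener(InvalidationHandler handler) {
            log.add("invalidation");
        }

        void addListener(ChangeHandler<? super E> handler) {
            log.add("change");
        }
    }

    static List<String> register() {
        Registry<String> registry = new Registry<>();

        // registry.addListener(x -> {});
        // Does not compile: both overloads accept a one-argument lambda

        // Disambiguate by casting the lambda...
        registry.addListener((InvalidationHandler) x -> {});
        registry.addListener((ChangeHandler<String>) x -> {});

        // ... or by declaring the lambda parameter type
        registry.addListener((Source x) -> {});
        registry.addListener((Change<? extends String> x) -> {});

        return registry.log;
    }

    public static void main(String[] args) {
        System.out.println(register());
    }
}
```

The log shows which overload each call was linked to. None of the four calls gets by without naming a type explicitly.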

Imagine if we really wanted to call the other addListener() method, the one that takes a ListChangeListener. We’d have to write any of

ObservableList<String> awesome = 
    FXCollections.observableArrayList();

// Agh. Remember that we have to repeat "String" here
ListChangeListener<String> hearYe = 
    fantastic -> splendid.cheer();
awesome.addListener(hearYe);

Or…

ObservableList<String> awesome = 
    FXCollections.observableArrayList();

// Agh. Remember that we have to repeat "String" here
awesome.addListener((ListChangeListener<String>) 
    fantastic -> splendid.cheer());

Or even…

ObservableList<String> awesome = 
    FXCollections.observableArrayList();

// WTF... "extends" String?? But that's what this thing needs...
awesome.addListener((Change<? extends String> fantastic) -> 
    splendid.cheer());

Overload you shan’t. Be wary you must.

API design is hard. It was hard before, it has gotten harder now. With Java 8, if any of your API methods’ arguments are a functional interface, think twice about overloading that API method. And once you’ve concluded to proceed with overloading, think again, a third time whether this is really a good idea.

Not convinced? Have a close look at the JDK. For instance the java.util.stream.Stream type. How many overloaded methods do you see that have the same number of functional interface arguments, which again take the same number of method arguments (as in our previous addListener() example)?

Zero.

There are overloads where overload argument numbers differ. For instance:

<R> R collect(Supplier<R> supplier,
              BiConsumer<R, ? super T> accumulator,
              BiConsumer<R, R> combiner);

<R, A> R collect(Collector<? super T, A, R> collector);

You will never have any ambiguity when calling collect().

But when the argument numbers do not differ, and neither do the arguments’ own method argument numbers, the method names are different. For instance:

<R> Stream<R> map(Function<? super T, ? extends R> mapper);
IntStream mapToInt(ToIntFunction<? super T> mapper);
LongStream mapToLong(ToLongFunction<? super T> mapper);
DoubleStream mapToDouble(ToDoubleFunction<? super T> mapper);
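Here is a small sketch of what those distinct names buy you. The class DistinctNames is made up for the example, while the Stream calls are the real JDK API. The same implicitly typed lambda is accepted by both methods without any cast.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class
class DistinctNames {

    // map() takes a Function and yields a Stream<Integer>
    static List<Integer> boxedLengths() {
        return Stream.of("a", "bb", "ccc")
                     .map(s -> s.length())
                     .collect(Collectors.toList());
    }

    // mapToInt() takes a ToIntFunction and yields an IntStream
    static int totalLength() {
        return Stream.of("a", "bb", "ccc")
                     .mapToInt(s -> s.length())
                     .sum();
    }

    public static void main(String[] args) {
        System.out.println(boxedLengths());
        System.out.println(totalLength());
    }
}
```

If both methods were called map(), the lambda s -> s.length() would fit either overload, and the call site would need a cast or an explicit parameter type.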

Now, this is super annoying at the call site, because you have to think in advance what method you have to use based on a variety of involved types.

But it’s really the only solution to this dilemma. So, remember:

You Will Regret Applying Overloading with Lambdas!


Keeping things DRY: Method overloading

A good clean application design requires discipline in keeping things DRY:

Everything has to be done once.
Having to do it twice is a coincidence.
Having to do it three times is a pattern.

— An unknown wise man

Now, if you’re following the Extreme Programming rules, you know what needs to be done, when you encounter a pattern:

refactor mercilessly

Because we all know what happens when you don’t:

Image Copyright (C) by hahastop.com

Not DRY: Method overloading

One of the least DRY things you can do that is still acceptable is method overloading – in those languages that allow it (unlike Ceylon, JavaScript). Being an internal domain-specific language, the jOOQ API makes heavy use of overloading. Consider the type Field (modelling a database column):

public interface Field<T> {

    // [...]

    Condition eq(T value);
    Condition eq(Field<T> field);
    Condition eq(Select<? extends Record1<T>> query);
    Condition eq(QuantifiedSelect<? extends Record1<T>> query);

    Condition in(Collection<?> values);
    Condition in(T... values);
    Condition in(Field<?>... values);
    Condition in(Select<? extends Record1<T>> query);

    // [...]

}

So, in certain cases, non-DRY-ness is inevitable, also to a given extent in the implementation of the above API. The key rule of thumb here, however, is to always have as few implementations as possible also for overloaded methods. Try calling one method from another. For instance these two methods are very similar:

Condition eq(T value);
Condition eq(Field<T> field);

The first method is a special case of the second one, where jOOQ users do not want to explicitly declare a bind variable. It is literally implemented as such:

@Override
public final Condition eq(T value) {
    return equal(value);
}

@Override
public final Condition equal(T value) {
    return equal(Utils.field(value, this));
}

@Override
public final Condition equal(Field<T> field) {
    return compare(EQUALS, nullSafe(field));
}

@Override
public final Condition compare(Comparator comparator, Field<T> field) {
    switch (comparator) {
        case IS_DISTINCT_FROM:
        case IS_NOT_DISTINCT_FROM:
            return new IsDistinctFrom<T>(this, nullSafe(field), comparator);

        default:
            return new CompareCondition(this, nullSafe(field), comparator);
    }
}

As you can see:

  • eq() is just a synonym for the legacy equal() method
  • equal(T) is a more specialised, convenience form of equal(Field<T>)
  • equal(Field<T>) is a more specialised, convenience form of compare(Comparator, Field<T>)
  • compare() finally provides access to the implementation of this API

All of these methods are also part of the public API and can be called by the API consumer, directly, which is why the nullSafe() check is repeated in each method.
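The delegation chain can be reduced to a self-contained sketch. Column, its Op enum, and the literal() helper are hypothetical simplifications that render plain strings. They are not how jOOQ implements conditions or bind variables.

```java
// Hypothetical, heavily simplified stand-in for a column type
final class Column<T> {

    enum Op {
        EQUALS("="), NOT_EQUALS("<>");

        final String sql;

        Op(String sql) {
            this.sql = sql;
        }
    }

    private final String sql;

    Column(String sql) {
        this.sql = sql;
    }

    // Wraps a constant value in a column-like expression
    static <V> Column<V> literal(V value) {
        return new Column<>("'" + value + "'");
    }

    // Convenience overloads: each one only delegates
    String eq(T value) {
        Column<T> bound = literal(value);
        return eq(bound);
    }

    String eq(Column<T> other) {
        return compare(Op.EQUALS, other);
    }

    String ne(T value) {
        Column<T> bound = literal(value);
        return ne(bound);
    }

    String ne(Column<T> other) {
        return compare(Op.NOT_EQUALS, other);
    }

    // The single place where rendering logic lives
    String compare(Op op, Column<T> other) {
        return sql + " " + op.sql + " " + other.sql;
    }

    public static void main(String[] args) {
        System.out.println(new Column<String>("NAME").eq("x"));
        System.out.println(new Column<Integer>("A").ne(new Column<Integer>("B")));
    }
}
```

Adding gt, ge, lt, or le to this sketch means adding an enum constant and two one-line methods each. The rendering logic in compare() stays untouched.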

Why all the trouble?

The answer is simple.

  • There is only very little possibility of a copy-paste error throughout all the API.
  • … because the same API has to be offered for ne, gt, ge, lt, le
  • No matter what part of the API happens to be integration-tested, the implementation itself is certainly covered by some test.
  • This way, it is extremely easy to provide users with a very rich API with lots of convenience methods, as users do not want to remember how these more general-purpose methods (like compare()) really work.

The last point is particularly important, and because of risks related to backwards-compatibility, not always followed by the JDK, for instance. In order to create a Java 8 Stream from an Iterable, you have to go through all this hassle, for instance:

// Aagh, my fingers hurt...
   StreamSupport.stream(iterable.spliterator(), false);
// ^^^^^^^^^^^^^                 ^^^^^^^^^^^    ^^^^^
//       |                            |           |
// Not Stream!                        |           |
//                                    |           |
// Hmm, Spliterator. Sounds like      |           |
// Iterator. But what is it? ---------+           |
//                                                |
// What's this true and false?                    |
// And do I need to care? ------------------------+

When, intuitively, you’d like to have:

// Not Enterprise enough
iterable.stream();

In other words, subtle Java 8 Streams implementation details will soon leak into a lot of client code, and many new utility functions will wrap these things again and again.

See Brian Goetz’s explanation on Stack Overflow for details.

On the flip side of delegating overload implementations, it is of course harder (i.e. more work) to implement such an API. This is particularly cumbersome if an API vendor also allows users to implement the API themselves (e.g. JDBC). Another issue is the length of stack traces generated by such implementations. But we’ve shown before on this blog that deep stack traces can be a sign of good quality.

Now you know why.

Takeaway

The takeaway is simple. Whenever you encounter a pattern, refactor. Find the most common denominator, factor it out into a single implementation, and make sure that this implementation is rarely called directly, by delegating single-responsibility steps from method to method.

By following these rules, you will:

  • Have fewer bugs
  • Have a more convenient API

Happy refactoring!

Overload API methods with care – the sequel

I recently blogged about funny issues that arise when overloading API methods with generics involved:
https://lukaseder.wordpress.com/2011/11/11/overload-api-methods-with-care/

I promised a sequel as I have encountered more trouble than that, so here it is.

The trouble with generics and varargs

Varargs are another great feature introduced in Java 5. While being merely syntactic sugar, you can save quite some lines of code when passing arrays to methods:

// Method declarations with or without varargs
public static String concat1(int[] values);
public static String concat2(int... values);

// The above methods are actually the same.
String s1 = concat1(new int[] { 1, 2, 3 });
String s2 = concat2(new int[] { 1, 2, 3 });

// Only, concat2 can also be called like this, conveniently
String s3 = concat2(1, 2, 3);

That’s well-known. It works the same way with primitive-type arrays as with Object[]. It also works with T[] where T is a generic type!

// You can now have a generic type in your varargs parameter:
public static <T> T[] array(T... values);

// The above can be called "type-safely" (with auto-boxing):
Integer[] ints   = array(1, 2, 3);
String[] strings = array("1", "2", "3");

// Since Object could also be inferred for T, you can even do this:
Object[] applesAndOranges = array(1, "2", 3.0);

The last example is actually already hinting at the problem. If T does not have any upper bound, the type-safety is gone, completely. It is an illusion, because in the end, the varargs parameter can always be inferred to “Object…”. And here’s how this causes trouble when you overload such an API.
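The effect of an upper bound can be sketched as follows. BoundedVarargs and its two methods are hypothetical helpers written for this illustration.

```java
// Hypothetical demo class
class BoundedVarargs {

    // No upper bound: any mix of argument types is accepted
    @SafeVarargs
    static <T> int countAny(T... values) {
        return values.length;
    }

    // Upper bound Number: the compiler rejects non-numeric arguments
    @SafeVarargs
    static <T extends Number> double sum(T... values) {
        double total = 0;

        for (T value : values)
            total += value.doubleValue();

        return total;
    }

    public static void main(String[] args) {
        // Apples and oranges compile just fine
        System.out.println(countAny(1, "2", 3.0));

        // Integer and Double still share the bound Number
        System.out.println(sum(1, 2.5));

        // sum(1, "2");
        // Does not compile: String is not a Number
    }
}
```

Even with the bound, mixing Integer and Double is still allowed, so a bound narrows the set of accepted arguments without pinning T to a single type.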

// Overloaded for "convenience". Let's ignore the compiler warning
// caused when calling the second method
public static <T> Field<T> myFunction(T... params);
public static <T> Field<T> myFunction(Field<T>... params);

At first, this may look like a good idea. The argument list can either be constant values (T…) or dynamic fields (Field…). So in principle, you can do things like this:

// The outer function can infer Integer for <T> from the inner
// functions, which can infer Integer for <T> from T...
Field<Integer> f1 = myFunction(myFunction(1), myFunction(2, 3));

// But beware, this will compile too!
Field<?> f2 = myFunction(myFunction(1), myFunction(2.0, 3.0));

The inner functions will infer Integer and Double for <T>. With incompatible return types Field<Integer> and Field<Double>, the “intended” method with the “Field<T>…” argument does not apply anymore. Hence method one with “T…” is linked by the compiler as the only applicable method. But you’re not going to guess the (possibly) inferred bound for <T>. These are possible inferred types:

// This one, you can always do:
Field<?> f2 = myFunction(myFunction(1), myFunction(2.0, 3.0));

// But these ones show what you're actually about to do
Field<? extends Field<?>>                       f3 = // ...
Field<? extends Field<? extends Number>>        f4 = // ...
Field<? extends Field<? extends Comparable<?>>> f5 = // ...
Field<? extends Field<? extends Serializable>>  f6 = // ...

The compiler can infer something like Field<? extends Number & Comparable<?> & Serializable> as a valid upper bound for <T>. There is no valid exact bound for <T>, however. Hence the necessary <? extends [upper bound]>.

Conclusion

Be careful when combining varargs parameters with generics, especially in overloaded methods. If the user correctly binds the generic type parameter to what you intended, everything works fine. But if there is a single typo (e.g. confusing an Integer with a Double), then your API’s user is doomed. And they will not easily find their mistake, as no one sane can read compiler error messages like this:

Test.java:58: incompatible types
found   : Test.Field<Test.Field<
          ? extends java.lang.Number&java.lang.Comparable<
          ? extends java.lang.Number&java.lang.Comparable<?>>>>
required: Test.Field<java.lang.Integer>
        Field<Integer> f2 = myFunction(myFunction(1), 
                                       myFunction(2.0, 3.0));

Overload API methods with care

Overloading methods is a strong concept in API design, especially when your API is a fluent API or DSL (Domain Specific Language). This is the case for jOOQ, where you often want to use the exact same method name for various means of interaction with the library.

Example: jOOQ Conditions

package org.jooq;

public interface Condition {

    // Various overloaded forms of the "AND" operation:
    Condition and(Condition other);
    Condition and(String sql);
    Condition and(String sql, Object... bindings);

    // [...]
}

All of these methods connect two conditions with each other using an “AND” operator. Ideally, the implementations depend on each other, creating a single point of failure. This keeps things DRY:

package org.jooq.impl;

abstract class AbstractCondition implements Condition {

    // The single point of failure
    @Override
    public final Condition and(Condition other) {
        return new CombinedCondition(
            Operator.AND, Arrays.asList(this, other));
    }

    // "Convenience methods" delegating to the other one
    @Override
    public final Condition and(String sql) {
        return and(condition(sql));
    }

    @Override
    public final Condition and(String sql, Object... bindings) {
        return and(condition(sql, bindings));
    }
}

The trouble with generics and overloading

When developing with Eclipse, the Java 5 world seems more shiny than it really is. Varargs and generics were introduced as syntactic sugar in Java 5. They don’t really exist in that way in the JVM. That means, the compiler has to link method invocations correctly, inferring types if needed, and creating synthetic methods in some cases. According to the JLS (Java Language Specification), there is a lot of ambiguity when varargs/generics are employed in overloaded methods.

Let’s elaborate on generics:

A nice thing to do in jOOQ is to treat constant values the same as fields. In many places, field arguments are overloaded like this:

// This is a convenience method:
public static <T> Field<T> myFunction(Field<T> field, T value) {
    return myFunction(field, val(value));
}

// It's equivalent to this one.
public static <T> Field<T> myFunction(Field<T> field, Field<T> value) {
    return new MyFunction<T>(field, value);
}

The above works very well in most of the cases. You can use the above API like this:

Field<Integer> field1  = //...
Field<String>  field2  = //...

Field<Integer> result1 = myFunction(field1, 1);
Field<String>  result2 = myFunction(field2, "abc");

But the trouble arises when <T> is bound to Object!

// While this works...
Field<Object>  field3  = //...
Field<Object>  result3 = myFunction(field3, new Object());

// ... this doesn't!
Field<Object>  field4  = //...
Field<Object>  result4 = myFunction(field4, field4);
Field<Object>  result4 = myFunction(field4, (Field) field4);
Field<Object>  result4 = myFunction(field4, (Field<Object>) field4);

When <T> is bound to Object, all of a sudden both methods apply, and according to the JLS, none of them is more specific! While the Eclipse compiler is usually a bit more lenient (and in this case intuitively links the second method), the javac compiler doesn’t know what to do with this call. And there is no way around it. You cannot cast field4 to Field or to Field<Object> to force the linker to link to the second method. That’s pretty bad news for an API designer.
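One possible way to sidestep the problem, at the cost of some convenience, is to give the two variants different names so that overload resolution never enters the picture. The following is only a sketch under that assumption: DistinctNamesForGenerics, its nested Field type, val(), concatField(), and concatValue() are hypothetical names and are not part of jOOQ.

```java
// Hypothetical demo class, not part of jOOQ
class DistinctNamesForGenerics {

    // Simplified stand-in for a field expression
    static final class Field<T> {
        final String sql;

        Field(String sql) {
            this.sql = sql;
        }
    }

    // Wraps a constant value in a field
    static <T> Field<T> val(T value) {
        return new Field<>(String.valueOf(value));
    }

    // The general-purpose variant
    static <T> Field<T> concatField(Field<T> field, Field<T> other) {
        return new Field<>(field.sql + " || " + other.sql);
    }

    // The convenience variant, under a different name
    static <T> Field<T> concatValue(Field<T> field, T value) {
        Field<T> wrapped = val(value);
        return concatField(field, wrapped);
    }

    public static void main(String[] args) {
        Field<Object> field4 = new Field<>("col");

        // Compiles even though <T> is bound to Object
        System.out.println(concatField(field4, field4).sql);
        System.out.println(concatValue(new Field<Integer>("n"), 1).sql);
    }
}
```

The price is the one already seen with map() and mapToInt(): the API consumer has to pick the right name up front.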

For more details about this special case, consider the following Stack Overflow question, which I reported as a bug to both Oracle and Eclipse. Let’s see which compiler implementation is correct:

http://stackoverflow.com/questions/5361513/reference-is-ambiguous-with-generics

The trouble with static imports, varargs

There is more trouble for API designers, that I will document some other time.