10 Subtle Best Practices when Coding Java

This is a list of 10 best practices that are more subtle than your average Josh Bloch Effective Java rule. While Josh Bloch’s list is very easy to learn and concerns everyday situations, this list here contains less common situations involving API / SPI design that may have a big effect nontheless.

I have encountered these things while writing and maintaining jOOQ, an internal DSL modelling SQL in Java. Being an internal DSL, jOOQ challenges Java compilers and generics to the max, combining generics, varargs and overloading in a way that Josh Bloch probably wouldn’t recommend for the “average API”.

jOOQ is the best way to write SQL in Java

Let me share with you 10 Subtle Best Practices When Coding Java:

1. Remember C++ destructors

Remember C++ destructors? No? Then you might be lucky as you never had to debug through any code leaving memory leaks due to allocated memory not having been freed after an object was removed. Thanks Sun/Oracle for implementing garbage collection!

But nonetheless, destructors have an interesting trait to them. It often makes sense to free memory in the inverse order of allocation. Keep this in mind in Java as well, when you’re operating with destructor-like semantics:

  • When using @Before and @After JUnit annotations
  • When allocating, freeing JDBC resources
  • When calling super methods

There are various other use cases. Here’s a concrete example showing how you might implement some event listener SPI:

@Override
public void beforeEvent(EventContext e) {
    super.beforeEvent(e);
    // Super code before my code
}

@Override
public void afterEvent(EventContext e) {
    // Super code after my code
    super.afterEvent(e);
}

Another good example showing why this can be important is the infamous Dining Philosophers problem. More info about the dining philosophers can be seen in this awesome post:
http://adit.io/posts/2013-05-11-The-Dining-Philosophers-Problem-With-Ron-Swanson.html

The Rule: Whenever you implement logic using before/after, allocate/free, take/return semantics, think about whether the after/free/return operation should perform stuff in the inverse order. tweet this

2. Don’t trust your early SPI evolution judgement

Providing an SPI to your consumers is an easy way to allow them to inject custom behaviour into your library / code. Beware, though, that your SPI evolution judgement may trick you into thinking that you’re (not) going to need that additional parameter. True, no functionality should be added early. But once you’ve published your SPI and once you’ve decided following semantic versioning, you’ll really regret having added a silly, one-argument method to your SPI when you realise that you might need another argument in some cases:

interface EventListener {
    // Bad
    void message(String message);
}

What if you also need a message ID and a message source? API evolution will prevent you from adding that parameter easily, to the above type. Granted, with Java 8, you could add a defender method, to “defend” you bad early design decision:

interface EventListener {
    // Bad
    default void message(String message) {
        message(message, null, null);
    }
    // Better?
    void message(
        String message,
        Integer id,
        MessageSource source
    );
}

Note, unfortunately, the defender method cannot be made final.

But much better than polluting your SPI with dozens of methods, use a context object (or argument object) just for this purpose.

interface MessageContext {
    String message();
    Integer id();
    MessageSource source();
}

interface EventListener {
    // Awesome!
    void message(MessageContext context);
}

You can evolve the MessageContext API much more easily than the EventListener SPI as fewer users will have implemented it.

The Rule: Whenever you specify an SPI, consider using context / parameter objects instead of writing methods with a fixed amount of parameters. tweet this

Remark: It is often a good idea to also communicate results through a dedicated MessageResult type, which can be constructed through a builder API. This will add even more SPI evolution flexibility to your SPI.

3. Avoid returning anonymous, local, or inner classes

Swing programmers probably have a couple of keyboard shortcuts to generate the code for their hundreds of anonymous classes. In many cases, creating them is nice as you can locally adhere to an interface, without going through the “hassle” of thinking about a full SPI subtype lifecycle.

But you should not use anonymous, local, or inner classes too often for a simple reason: They keep a reference to the outer instance. And they will drag that outer instance to wherevery they go, e.g. to some scope outside of your local class if you’re not careful. This can be a major source for memory leaks, as your whole object graph will suddenly entangle in subtle ways.

The Rule: Whenever you write an anonymous, local or inner class, check if you can make it static or even a regular top-level class. Avoid returning anonymous, local or inner class instances from methods to the outside scope. tweet this

Remark: There has been some clever practice around double-curly braces for simple object instantiation:

new HashMap<String, String>() {{
    put("1", "a");
    put("2", "b");
}}

This leverages Java’s instance initializer as specified by the JLS §8.6. Looks nice (maybe a bit weird), but is really a bad idea. What would otherwise be a completely independent HashMap instance now keeps a reference to the outer instance, whatever that just happens to be. Besides, you’ll create an additional class for the class loader to manage.

4. Start writing SAMs now!

Java 8 is knocking on the door. And with Java 8 come lambdas, whether you like them or not. Your API consumers may like them, though, and you better make sure that they can make use of them as often as possible. Hence, unless your API accepts simple “scalar” types such as int, long, String, Date, let your API accept SAMs as often as possible.

What’s a SAM? A SAM is a Single Abstract Method [Type]. Also known as a functional interface, soon to be annotated with the @FunctionalInterface annotation. This goes well with rule number 2, where EventListener is in fact a SAM. The best SAMs are those with single arguments, as they will further simplify writing of a lambda. Imagine writing

listeners.add(c -> System.out.println(c.message()));

Instead of

listeners.add(new EventListener() {
    @Override
    public void message(MessageContext c) {
        System.out.println(c.message()));
    }
});

Imagine XML processing through jOOX, which features a couple of SAMs:

$(document)
    // Find elements with an ID
    .find(c -> $(c).id() != null)
    // Find their  child elements
    .children(c -> $(c).tag().equals("order"))
    // Print all matches
    .each(c -> System.out.println($(c)))

The Rule: Be nice with your API consumers and write SAMs / Functional interfaces already now. tweet this

Remarks: A couple of interesting blog posts about Java 8 Lambdas and improved Collections API can be seen here:

5. Avoid returning null from API methods

I’ve blogged about Java’s NULLs once or twice. I’ve also blogged about Java 8’s introduction of Optional. These are interesting topics both from an academic and from a practical point of view.

While NULLs and NullPointerExceptions will probably stay a major pain in Java for a while, you can still design your API in a way that users will not run into any issues. Try to avoid returning null from API methods whenever possible. Your API consumers should be able to chain methods whenever applicable:

initialise(someArgument).calculate(data).dispatch();

In the above snippet, none of the methods should ever return null. In fact, using null’s semantics (the absence of a value) should be rather exceptional in general. In libraries like jQuery (or jOOX, a Java port thereof), nulls are completely avoided as you’re always operating on iterable objects. Whether you match something or not is irrelevant to the next method call.

Nulls often arise also because of lazy initialisation. In many cases, lazy initialisation can be avoided too, without any significant performance impact. In fact, lazy initialisation should be used carefully, only. If large data structures are involved.

The Rule: Avoid returning nulls from methods whenever possible. Use null only for the “uninitialised” or “absent” semantics. tweet this

6. Never return null arrays or lists from API methods

While there are some cases when returning nulls from methods is OK, there is absolutely no use case of returning null arrays or null collections! Let’s consider the hideous java.io.File.list() method. It returns:

An array of strings naming the files and directories in the directory denoted by this abstract pathname. The array will be empty if the directory is empty. Returns null if this abstract pathname does not denote a directory, or if an I/O error occurs.

Hence, the correct way to deal with this method is

File directory = // ...

if (directory.isDirectory()) {
    String[] list = directory.list();

    if (list != null) {
        for (String file : list) {
            // ...
        }
    }
}

Was that null check really necessary? Most I/O operations produce IOExceptions, but this one returns null. Null cannot hold any error message indicating why the I/O error occurred. So this is wrong in three ways:

  • Null does not help in finding the error
  • Null does not allow to distinguish I/O errors from the File instance not being a directory
  • Everyone will keep forgetting about null, here

In collection contexts, the notion of “absence” is best implemented by empty arrays or collections. Having an “absent” array or collection is hardly ever useful, except again, for lazy initialisation.

The Rule: Arrays or Collections should never be null. tweet this

7. Avoid state, be functional

What’s nice about HTTP is the fact that it is stateless. All relevant state is transferred in each request and in each response. This is essential to the naming of REST: Representational State Transfer. This is awesome when done in Java as well. Think of it in terms of rule number 2 when methods receive stateful parameter objects. Things can be so much simpler if state is transferred in such objects, rather than manipulated from the outside. Take JDBC, for instance. The following example fetches a cursor from a stored procedure:

CallableStatement s =
  connection.prepareCall("{ ? = ... }");

// Verbose manipulation of statement state:
s.registerOutParameter(1, cursor);
s.setString(2, "abc");
s.execute();
ResultSet rs = s.getObject(1);

// Verbose manipulation of result set state:
rs.next();
rs.next();

These are the things that make JDBC such an awkward API to deal with. Each object is incredibly stateful and hard to manipulate. Concretely, there are two major issues:

  • It is very hard to correctly deal with stateful APIs in multi-threaded environments
  • It is very hard to make stateful resources globally available, as the state is not documented
State is like a box of chocolates

Theatrical poster for Forrest Gump, Copyright © 1994 by Paramount Pictures. All Rights Reserved. It is believed that the above usage fulfils what is known as Fair Use

The Rule: Implement more of a functional style. Pass state through method arguments. Manipulate less object state. tweet this

8. Short-circuit equals()

This is a low-hanging fruit. In large object graphs, you can gain significantly in terms of performance, if all your objects’ equals() methods dirt-cheaply compare for identity first:

@Override
public boolean equals(Object other) {
    if (this == other) return true;

    // Rest of equality logic...
}

Note, other short-circuit checks may involve null checks, which should be there as well:

@Override
public boolean equals(Object other) {
    if (this == other) return true;
    if (other == null) return false;

    // Rest of equality logic...
}

The Rule: Short-circuit all your equals() methods to gain performance. tweet this

9. Try to make methods final by default

Some will disagree on this, as making things final by default is quite the opposite of what Java developers are used to. But if you’re in full control of all source code, there’s absolutely nothing wrong with making methods final by default, because:

  • If you do need to override a method (do you really?), you can still remove the final keyword
  • You will never accidentally override any method anymore

This specifically applies for static methods, where “overriding” (actually, shadowing) hardly ever makes sense. I’ve come across a very bad example of shadowing static methods in Apache Tika, recently. Consider:

TikaInputStream extends TaggedInputStream and shadows its static get() method with quite a different implementation.

Unlike regular methods, static methods don’t override each other, as the call-site binds a static method invocation at compile-time. If you’re unlucky, you might just get the wrong method accidentally.

The Rule: If you’re in full control of your API, try making as many methods as possible final by default. tweet this

10. Avoid the method(T…) signature

There’s nothing wrong with the occasional “accept-all” varargs method that accepts an Object... argument:

void acceptAll(Object... all);

Writing such a method brings a little JavaScript feeling to the Java ecosystem. Of course, you probably want to restrict the actual type to something more confined in a real-world situation, e.g. String.... And because you don’t want to confine too much, you might think it is a good idea to replace Object by a generic T:

void acceptAll(T... all);

But it’s not. T can always be inferred to Object. In fact, you might as well just not use generics with the above methods. More importantly, you may think that you can overload the above method, but you cannot:

void acceptAll(T... all);
void acceptAll(String message, T... all);

This looks as though you could optionally pass a String message to the method. But what happens to this call here?

acceptAll("Message", 123, "abc");

The compiler will infer <? extends Serializable & Comparable<?>> for T, which makes the call ambiguous!

So, whenever you have an “accept-all” signature (even if it is generic), you will never again be able to typesafely overload it. API consumers may just be lucky enough to “accidentally” have the compiler chose the “right” most specific method. But they may as well be tricked into using the “accept-all” method or they may not be able to call any method at all.

The Rule: Avoid “accept-all” signatures if you can. And if you cannot, never overload such a method. tweet this

Conclusion

Java is a beast. Unlike other, fancier languages, it has evolved slowly to what it is today. And that’s probably a good thing, because already at the speed of development of Java, there are hundreds of caveats, which can only be mastered through years of experience.

jOOQ is the best way to write SQL in Java

Stay tuned for more top 10 lists on the subject!

How to Design a Good, Regular API

People have strong opinions on how to design a good API. Consequently, there are lots of pages and books in the web, explaining how to do it. This article will focus on a particular aspect of good APIs: Regularity. Regularity is what happens when you follow the “Principle of Least Astonishment“. This principle holds true no matter what kinds of personal taste and style you would like to put into your API, otherwise. It is thus one of the most important features of a good API.

The following are a couple of things to keep in mind when designing a “regular” API:

Rule #1: Establish strong terms

If your API grows, there will be repetitive use of the same terms, over and over again. For instance, some actions will be come in several flavours resulting in various classes / types / methods, that differ only subtly in behaviour. The fact that they’re similar should be reflected by their names. Names should use strong terms. Take JDBC for instance. No matter how you execute a Statement, you will always use the term execute to do it. For instance, you will call any of these methods:

In a similar fashion, you will always use the term close to release resources, no matter which resource you’re releasing. For instance, you will call:

As a matter of fact, close is such a strong and established term in the JDK, that it has lead to the interfaces java.io.Closeable (since Java 1.5), and java.lang.AutoCloseable (since Java 1.7), which generally establish a contract of releasing resources.

Rule violation: Observable

This rule is violated a couple of times in the JDK. For instance, in the java.util.Observable class. While other “Collection-like” types established the terms

  • size()
  • remove()
  • removeAll()

… this class declares

There is no good reason for using other terms in this context. The same applies to Observer.update(), which should really be called notify(), an otherwise established term in JDK APIs

Rule violation: Spring. Most of it

Spring has really gotten popular in the days when J2EE was weird, slow, and cumbersome. Think about EJB 2.0… There may be similar opinions on Spring out there, which are off-topic for this post. Here’s how Spring violates this concrete rule. A couple of random examples where Spring fails to establish strong terms, and uses long concatenations of meaningless, inconcise words instead:

Apart from “feeling” like a horrible API (to me), here’s some more objective analysis:

  • What’s the difference between a Creator and a Factory
  • What’s the difference between a Source and a Provider?
  • What’s the non-subtle difference between an Advisor and a Provider?
  • What’s the non-subtle difference between a Discoverer and a Provider?
  • Is an Advisor related to an AspectJAdvice?
  • Is it a ScanningCandidate or a CandidateComponent?
  • What’s a TargetSource? And how would it be different from a SourceTarget if not a SourceSource or my favourite: A SourceSourceTargetProviderSource?

Gary Fleming commented on my previous blog post about Spring’s funny class names:

I’d be willing to bet that a Markov-chain generated class name (based on Spring Security) would be indistinguishable from the real thing.

Back to more seriousness…

Rule #2: Apply symmetry to term combinations

Once you’ve established strong terms, you will start combining them. When you look at the JDK’s Collection APIs, you will notice the fact that they are symmetric in a way that they’ve established the terms add(), remove(), contains(), and all, before combining them symmetrically:

Now, the Collection type is a good example where an exception to this rule may be acceptable, when a method doesn’t “pull its own weight”. This is probably the case for retainAll(Collection<?>), which doesn’t have an equivalent retain(E) method. It might just as well be a regular violation of this rule, though.

Rule violation: Map

This rule is violated all the time, mostly because of some methods not pulling their own weight (which is ultimately a matter of taste). With Java 8’s defender methods, there will no longer be any excuse of not adding default implementations for useful utility methods that should’ve been on some types. For instance: Map. It violates this rule a couple of times:

Observe also, that there is no point of using the term Set in the method names. The method signature already indicates that the result has a Set type. It would’ve been more consistent and symmetric if those methods would’ve been named keys(), values(), entries(). (On a side-note, Sets and Lists are another topic that I will soon blog about, as I think those types do not pull their own weight either)

At the same time, the Map interface violates this rule by providing

Besides, establishing the term clear() instead of reusing removeAll() with no arguments is unnecessary. This applies to all Collection API members. In fact, the clear() method also violates rule #1. It is not immediately obvious, if clear does anything subtly different from remove when removing collection elements.

Rule #3: Add convenience through overloading

There is mostly only one compelling reason, why you would want to overload a method: Convenience. Often you want to do precisely the same thing in different contexts, but constructing that very specific method argument type is cumbersome. So, for convenience, you offer your API users another variant of the same method, with a “friendlier” argument type set. This can be observed again in the Collection type. We have:

Another example is the Arrays utility class. We have:

Overloading is mostly used for two reasons:

  1. Providing “default” argument behaviour, as in Collection.toArray()
  2. Supporting several incompatible, yet “similar” argument sets, as in Arrays.copyOf()

Other languages have incorporated these concepts into their language syntax. Many languages (e.g. PL/SQL) formally support named default arguments. Some languages (e.g. JavaScript) don’t even care how many arguments there really are. And another, new JVM language called Ceylon got rid of overloading by combining the support for named, default arguments with union types. As Ceylon is a statically typed language, this is probable the most powerful approach of adding convenience to your API.

Rule violation: TreeSet

It is hard to find a good example of a case where this rule is violated in the JDK. But there is one: the TreeSet and TreeMap. Their constructors are overloaded several times. Let’s have a look at these two constructors:

The latter “cleverly” adds some convenience to the first in that it extracts a well-known Comparator from the argument SortedSet to preserve ordering. This behaviour is quite different from the compatible (!) first constructor, which doesn’t do an instanceof check of the argument collection. I.e. these two constructor calls result in different behaviour:

SortedSet<Object> original = // [...]

// Preserves ordering:
new TreeSet<Object>(original);

// Resets ordering:
new TreeSet<Object>((Collection<Object>) original);

These constructors violate the rule in that they produce completely different behaviour. They’re not just mere convenience.

Rule #4: Consistent argument ordering

Be sure that you consistently order arguments of your methods. This is an obvious thing to do for overloaded methods, as you can immediately see how it is better to always put the array first and the int after in the previous example from the Arrays utility class:

But you will quickly notice that all methods in that class will put the array being operated on first. Some examples:

Rule violation: Arrays

The same class also “subtly” violates this rule in that it puts optional arguments in between other arguments, when overloading methods. For instance, it declares

When the latter should’ve been fill(Object[], Object, int, int). This is a “subtle” rule violation, as you may also argue that those methods in Arrays that restrict an argument array to a range will always put the array and the range argument together. In that way, the fill() method would again follow the rule as it provides the same argument order as copyOfRange(), for instance:

You will never be able to escape this problem if you heavily overload your API. Unfortunately, Java doesn’t support named parameters, which helps formally distinguishing arguments in a large argument list, as sometimes, large argument lists cannot be avoided.

Rule violation: String

Another case of a rule violation is the String class:

The problems here are:

  • It is hard to immediately understand the difference between the two methods, as the optional boolean argument is inserted at the beginning of the argument list
  • It is hard to immediately understand the purpose of every int argument, as there are many arguments in a single method

Rule #5: Establish return value types

This may be a bit controversial as people may have different views on this topic. No matter what your opinion is, however, you should create a consistent, regular API when it comes to defining return value types. An example rule set (on which you may disagree):

  • Methods returning a single object should return null when no object was found
  • Methods returning several objects should return an empty List, Set, Map, array, etc. when no object was found (never null)
  • Methods should only throw exceptions in case of an … well, an exception

With such a rule set, it is not a good practice to have 1-2 methods lying around, which:

  • … throw ObjectNotFoundExceptions when no object was found
  • … return null instead of empty Lists

Rule violation: File

File is an example of a JDK class that violates many rules. Among them, the rule of regular return types. Its File.list() Javadoc reads:

An array of strings naming the files and directories in the directory denoted by this abstract pathname. The array will be empty if the directory is empty. Returns null if this abstract pathname does not denote a directory, or if an I/O error occurs.

So, the correct way to iterate over file names (if you’re doing defensive programming) is:

String[] files = file.list();

// You should never forget this null check!
if (files != null) {
    for (String file : files) {
        // Do things with your file
    }
}

Of course, we could argue that the Java 5 expert group could’ve been nice with us and worked that null check into their implementation of the foreach loop. Similar to the missing null check when switching over an enum (which should lead to the default: case). They’ve probably preferred the “fail early” approach in this case.

The point here is that File already has sufficient means of checking if file is really a directory (File.isDirectory()). And it should throw an IOException if something went wrong, instead of returning null. This is a very strong violation of this rule, causing lots of pain at the call-site… Hence:

NEVER return null when returning arrays or collections!

Rule violation: JPA

An example of how JPA violates this rule is the way how entities are retrieved from the EntityManager or from a Query:

As NoResultException is a RuntimeException this flaw heavily violates the Principle of Least Astonishment, as you might stay unaware of this difference until runtime!

IF you insist on throwing NoResultExceptions, make them checked exceptions as client code MUST handle them

Conclusion and further reading

… or rather, further watching. Have a look at Josh Bloch’s presentation on API design. He agrees with most of my claims, around 0:30:30

Another useful example of such a web page is the “Java API Design Checklist” by The Amiable API:
Java API Design Checklist