Rule #1: Establish strong terms
If your API grows, there will be repetitive use of the same terms, over and over again. For instance, some actions will come in several flavours, resulting in various classes / types / methods that differ only subtly in behaviour. The fact that they're similar should be reflected by their names, and names should use strong terms. Take JDBC, for instance. No matter how you execute a Statement, you will always use the term execute to do it. For instance, you will call any of execute(), executeQuery(), executeUpdate(), or executeBatch().
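As a quick sanity check, these consistently named methods can be looked up reflectively on the JDBC interfaces, no database required (a sketch, not an exhaustive list):

```java
import java.sql.PreparedStatement;
import java.sql.Statement;

public class ExecuteTerm {
    public static void main(String[] args) throws Exception {
        // All of these exist on the JDBC types; getMethod() throws if they don't:
        System.out.println(Statement.class.getMethod("execute", String.class));
        System.out.println(Statement.class.getMethod("executeQuery", String.class));
        System.out.println(Statement.class.getMethod("executeUpdate", String.class));
        System.out.println(Statement.class.getMethod("executeBatch"));
        System.out.println(PreparedStatement.class.getMethod("execute"));
    }
}
```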
In a similar fashion, you will always use the term close to release resources, no matter which resource you're releasing. For instance, you will call close() on a Connection, a Statement, or a ResultSet. As a matter of fact, close is such a strong and established term in the JDK that it has led to the interfaces java.io.Closeable (since Java 1.5) and java.lang.AutoCloseable (since Java 1.7), which generally establish a contract of releasing resources.
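A minimal sketch of that contract; the Connectionish resource type here is invented for illustration:

```java
public class CloseTerm {

    // Hypothetical resource type implementing the established close() contract
    static class Connectionish implements AutoCloseable {
        @Override
        public void close() {
            System.out.println("resource released");
        }
    }

    public static void main(String[] args) {
        try (Connectionish c = new Connectionish()) {
            System.out.println("resource in use");
        } // close() is invoked automatically at the end of this block
    }
}
```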
Rule violation: Observable
This rule is violated a couple of times in the JDK, for instance in the java.util.Observable class. While other "Collection-like" types established the terms size(), remove(), and removeAll(), Observable declares countObservers(), deleteObserver(), and deleteObservers() instead. Another offender is Observer.update(), which should really be called notify(), an otherwise established term in JDK APIs.
Rule violation: Spring. Most of it
Spring really got popular in the days when J2EE was weird, slow, and cumbersome. Think about EJB 2.0… There may be similar opinions on Spring out there, which are off-topic for this post. Here's how Spring violates this concrete rule. A couple of random examples where Spring fails to establish strong terms, and uses long concatenations of meaningless, unspecific words instead:
AbstractBeanFactoryBasedTargetSourceCreator
AbstractInterceptorDrivenBeanDefinitionDecorator
AbstractRefreshablePortletApplicationContext
AspectJAdviceParameterNameDiscoverer
BeanFactoryTransactionAttributeSourceAdvisor
ClassPathScanningCandidateComponentProvider
- … this could go on indefinitely, my favourite being …
- J2eeBasedPreAuthenticatedWebAuthenticationDetailsSource. Note, I’ve blogged about conciseness before…
- What’s the difference between a Creator and a Factory?
- What’s the difference between a Source and a Provider?
- What’s the non-subtle difference between an Advisor and a Provider?
- What’s the non-subtle difference between a Discoverer and a Provider?
- Is an Advisor related to an AspectJAdvice?
- Is it a ScanningCandidate or a CandidateComponent?
- What’s a TargetSource? And how would it be different from a SourceTarget, if not a SourceSource, or my favourite: a SourceSourceTargetProviderSource?
I’d be willing to bet that a Markov-chain-generated class name (based on Spring Security) would be indistinguishable from the real thing. Back to more seriousness…
Rule #2: Apply symmetry to term combinations
Once you’ve established strong terms, you will start combining them. When you look at the JDK’s Collection APIs, you will notice that they are symmetric in the way that they’ve established the terms add(), remove(), contains(), and all, before combining them symmetrically:
add(E)
addAll(Collection<? extends E>)
remove(Object)
removeAll(Collection<?>)
contains(Object)
containsAll(Collection<?>)
The Collection type is also a good example of where an exception to this rule may be acceptable: when a method doesn’t “pull its own weight”. This is probably the case for retainAll(Collection<?>), which doesn’t have an equivalent retain(E) method. It might just as well be a regular violation of this rule, though.
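Here is a hypothetical sketch of how such symmetric term combinations might look in one's own API; all names (Basket, ListBasket) are invented, and Java 8 default methods derive the "all" variants from the base terms:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class SymmetryDemo {

    // Establish the strong terms add, remove, contains first...
    interface Basket<E> {
        boolean add(E e);
        boolean remove(E e);
        boolean contains(E e);

        // ...then combine each of them symmetrically with "all":
        default boolean addAll(Collection<? extends E> c) {
            boolean changed = false;
            for (E e : c)
                changed |= add(e);
            return changed;
        }

        default boolean removeAll(Collection<? extends E> c) {
            boolean changed = false;
            for (E e : c)
                changed |= remove(e);
            return changed;
        }

        default boolean containsAll(Collection<? extends E> c) {
            for (E e : c)
                if (!contains(e))
                    return false;
            return true;
        }
    }

    static class ListBasket<E> implements Basket<E> {
        private final List<E> items = new ArrayList<>();
        public boolean add(E e) { return items.add(e); }
        public boolean remove(E e) { return items.remove(e); }
        public boolean contains(E e) { return items.contains(e); }
    }

    public static void main(String[] args) {
        Basket<String> basket = new ListBasket<>();
        basket.addAll(Arrays.asList("a", "b", "c"));
        System.out.println(basket.containsAll(Arrays.asList("a", "c"))); // true
        basket.removeAll(Arrays.asList("a", "b"));
        System.out.println(basket.contains("a")); // false
    }
}
```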
Rule violation: Map
This rule is violated all the time, mostly because some methods don’t pull their own weight (which is ultimately a matter of taste). With Java 8’s defender methods, there will no longer be any excuse for not adding default implementations for useful utility methods that should’ve been on some types. For instance: Map. It violates this rule a couple of times:
- It has keySet() and also containsKey(Object)
- It has values() and also containsValue(Object)
- It has entrySet() but no containsEntry(K, V)
Also, there is no point in mentioning Set in the method names. The method signature already indicates that the result has a Set type. It would’ve been more consistent and symmetric if those methods had been named keys(), values(), and entries(). (On a side-note, Sets and Lists are another topic that I will soon blog about, as I think those types do not pull their own weight either.)
At the same time, the Map interface violates this rule by providing
- put(K, V) and also putAll(Map)
- remove(Object), but no removeAll(Collection<?>)
Also, establishing clear() instead of reusing removeAll() with no arguments is unnecessary. This applies to all Collection API members. In fact, the clear() method also violates rule #1: it is not immediately obvious whether clear() does anything subtly different from remove() when removing collection elements.
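As a sketch of what such a defender method could look like, here is a hypothetical SymmetricMap interface (not part of any real API) that retrofits the missing containsEntry():

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;

public class ContainsEntryDemo {

    // Hypothetical retrofit: a default method supplying the containsEntry()
    // that would make Map symmetric with containsKey() / containsValue()
    interface SymmetricMap<K, V> extends Map<K, V> {
        default boolean containsEntry(K key, V value) {
            // Map.Entry equality is defined by key and value, so this works
            // against any Map's entrySet()
            return entrySet().contains(new AbstractMap.SimpleEntry<>(key, value));
        }
    }

    static class SymmetricHashMap<K, V> extends HashMap<K, V>
            implements SymmetricMap<K, V> {}

    public static void main(String[] args) {
        SymmetricMap<String, Integer> m = new SymmetricHashMap<>();
        m.put("answer", 42);
        System.out.println(m.containsEntry("answer", 42)); // true
        System.out.println(m.containsEntry("answer", 43)); // false
    }
}
```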
Rule #3: Add convenience through overloading
There is mostly only one compelling reason why you would want to overload a method: convenience. Often you want to do precisely the same thing in different contexts, but constructing that very specific method argument type is cumbersome. So, for convenience, you offer your API users another variant of the same method with a “friendlier” argument type set. This can be observed again in the Collection type. We have:
- toArray(), which is a convenient overload of…
- toArray(T[])
Another example is the Arrays utility class. We have:
- copyOf(T[], int), which is an incompatible overload of…
- copyOf(boolean[], int), and of…
- copyOf(int[], int)
- … and all the others
These two examples show the two main use cases for overloading:
- Providing “default” argument behaviour, as in Collection.toArray()
- Supporting several incompatible, yet “similar” argument sets, as in Arrays.copyOf()
Beware of subtle pitfalls, though, for instance in TreeSet and TreeMap. Their constructors are overloaded several times. Let’s have a look at these two constructors: TreeSet(Collection<? extends E>) and TreeSet(SortedSet<E>).
The latter “cleverly” adds some convenience to the former, in that it extracts the well-known Comparator from the argument SortedSet to preserve its ordering. This behaviour is quite different from that of the compatible (!) first constructor, which doesn’t do an instanceof check on the argument collection. I.e. these two constructor calls result in different behaviour:
SortedSet<Object> original = // [...]
// Preserves ordering:
new TreeSet<Object>(original);
// Resets ordering:
new TreeSet<Object>((Collection<Object>) original);
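A self-contained version of this experiment, assuming an explicit reverse-order comparator to make the difference visible:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;

public class TreeSetCopies {
    public static void main(String[] args) {
        SortedSet<Integer> original = new TreeSet<>(Comparator.reverseOrder());
        original.addAll(Arrays.asList(1, 2, 3));
        System.out.println(original); // [3, 2, 1]

        // TreeSet(SortedSet) copies the comparator: reverse ordering preserved
        System.out.println(new TreeSet<>(original)); // [3, 2, 1]

        // TreeSet(Collection) does not: natural ordering is restored
        System.out.println(new TreeSet<>((Collection<Integer>) original)); // [1, 2, 3]
    }
}
```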
Rule #4: Consistent argument ordering
Be sure that you consistently order the arguments of your methods. This is an obvious thing to do for overloaded methods, as you can immediately see that it is better to always put the array first and the int after, as in the previous example from the Arrays utility class:
- copyOf(T[], int), which is an incompatible overload of…
- copyOf(boolean[], int)
- copyOf(int[], int)
- … and all the others
Here are some of the many methods in Arrays that follow this rule:
binarySearch(Object[], Object)
copyOfRange(T[], int, int)
fill(Object[], Object)
sort(T[], Comparator<? super T>)
The rule is violated, however, by fill(Object[], int, int, Object), which moves the value argument to the end. This is a “subtle” rule violation, as you may also argue that those methods in Arrays that restrict an argument array to a range will always put the array and the range argument together. In that way, the fill() method would again follow the rule, as it provides the same argument order as copyOfRange(): the array first, immediately followed by the range.
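The consistent array-first ordering can be seen at the call site:

```java
import java.util.Arrays;

public class ArgumentOrder {
    public static void main(String[] args) {
        int[] a = {5, 3, 1, 4, 2};
        Arrays.sort(a);                            // array first
        int[] range = Arrays.copyOfRange(a, 1, 4); // array, from, to
        System.out.println(Arrays.toString(range)); // [2, 3, 4]

        int[] filled = new int[5];
        Arrays.fill(filled, 1, 4, 9);              // array, from, to, value
        System.out.println(Arrays.toString(filled)); // [0, 9, 9, 9, 0]
    }
}
```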
You will never be able to escape this problem entirely if you heavily overload your API. Unfortunately, Java doesn’t support named parameters, which would help formally distinguish arguments in a large argument list, as sometimes, large argument lists cannot be avoided.
Rule violation: String
Another case of a rule violation is the String class, for instance with these two overloads:
regionMatches(int, String, int, int)
regionMatches(boolean, int, String, int, int)
The problems here are:
- It is hard to immediately understand the difference between the two methods, as the optional boolean argument is inserted at the beginning of the argument list
- It is hard to immediately understand the purpose of every int argument, as there are many arguments in a single method
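String’s regionMatches() overloads fit this description; for illustration:

```java
public class RegionMatchesDemo {
    public static void main(String[] args) {
        String s = "Hello World";

        // regionMatches(toffset, other, ooffset, len)
        System.out.println(s.regionMatches(6, "World", 0, 5)); // true

        // regionMatches(ignoreCase, toffset, other, ooffset, len):
        // the optional boolean is prepended, shifting every other argument
        System.out.println(s.regionMatches(true, 6, "world", 0, 5)); // true
    }
}
```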
Rule #5: Establish return value types
This may be a bit controversial, as people may have different views on this topic. No matter what your opinion is, however, you should create a consistent, regular API when it comes to defining return value types. An example rule set (on which you may disagree):
- Methods returning a single object should return null when no object was found
- Methods returning several objects should return an empty List, Set, Map, array, etc. when no object was found (never null)
- Methods should only throw exceptions in case of an … well, an exception
With such a rule set, it is bad practice to:
- … throw ObjectNotFoundExceptions when no object was found
- … return null instead of empty Lists
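Sketched in code, the rule set might look like this (UserRepository and its methods are invented for illustration):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UserRepository {

    private final Map<Integer, String> users = new HashMap<>();

    public UserRepository() {
        users.put(1, "Alice");
    }

    // Single object: null when nothing was found
    public String findUser(int id) {
        return users.get(id);
    }

    // Several objects: an empty List, never null
    public List<String> findUsersNamed(String prefix) {
        List<String> result = new ArrayList<>();
        for (String name : users.values())
            if (name.startsWith(prefix))
                result.add(name);
        return result;
    }

    public static void main(String[] args) {
        UserRepository repo = new UserRepository();
        System.out.println(repo.findUser(42));        // null
        System.out.println(repo.findUsersNamed("Z")); // []
    }
}
```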
Rule violation: File
An example of a method violating these rules is File.list(), whose Javadoc reads:
An array of strings naming the files and directories in the directory denoted by this abstract pathname. The array will be empty if the directory is empty. Returns null if this abstract pathname does not denote a directory, or if an I/O error occurs.
So, the correct way to iterate over file names (if you’re doing defensive programming) is:
String[] files = file.list();
// You should never forget this null check!
if (files != null) {
for (String fileName : files) {
// Do things with your fileName
}
}
Arguably, returning an empty array in these cases would have silently swallowed errors (much like an empty default: case in a switch statement). They’ve probably preferred the “fail early” approach in this case.
The point here is that File already has sufficient means of checking whether file is really a directory (File.isDirectory()). And list() should throw an IOException if something went wrong, instead of returning null. This is a very strong violation of this rule, causing lots of pain at the call site… Hence:
NEVER return null when returning arrays or collections!
Rule violation: JPA
An example of how JPA violates this rule is the way entities are retrieved from the EntityManager or from a Query:
- EntityManager.find() methods return null if no entity could be found
- Query.getSingleResult() throws a NoResultException if no entity could be found
Since NoResultException is a RuntimeException, this flaw heavily violates the Principle of Least Astonishment, as you might stay unaware of this difference until runtime!
If you insist on throwing NoResultExceptions, make them checked exceptions, as client code MUST handle them.
Conclusion and further reading
… or rather, further watching. Have a look at Josh Bloch’s presentation on API design. He agrees with most of my claims, around 0:30:30.
Another useful example of such a web page is the “Java API Design Checklist” by The Amiable API:
Java API Design Checklist
Wow, this is a great article. One of those that you have to read several times and meditate on to get the most out of them. I was pondering the case of method overloading, and I think that besides all the convenience that you mention, this feature is sometimes used to overcome the problem of multiple dispatch. Since Java does not have multimethods, the use of overloading with a visitor pattern helps to implement a similar idiom in terms of functionality. Do you think this could be classified as a convenience idiom as well?
Now that you mention it, I am intrigued about whether Ceylon has multimethods, since they got rid of overloading. I’ll have to take a look, that’ll be interesting :-)
Also, on point #5 about return types, I started thinking about how useful the new Optional class in Java 8 will be to overcome the problems inherent in returning null. Do you think as well that this is a good way to improve API designs dealing with this kind of return values?
Thanks for the nice words! I hope you didn’t meditate too long ;-)
This article has been on my mind for a long time. Finally, when I had recently discovered Ceylon’s regularity in language syntax, I had to write it down. Stay tuned for an analysis / rant about language regularity and how Java fails miserably in this.
About your feedback:
Yes, that is another use case for overloading. However, I personally think that the visitor pattern is one of the biggest anti-patterns to ever apply. If this topic interests you, I can suggest this article I had written recently. Apart from personal taste, that idiom doesn’t seem to match what I’ve written, i.e. multimethods are probably more than mere convenience, as each method is expected to behave very differently. Looking forward to a blog post of yours regarding multimethods and the Ceylon language, though!
I suspect you mean the new java.util.Optional type that Oracle has stolen from languages like Scala? I’m not quite sure yet how this will be adopted. Many objects are really “optional” in a way that null references are perfectly meaningful. I think that the extra boilerplate code could prevent a broad adoption of this type. Ceylon, again, has an interesting solution for this, allowing you to decorate a type with a question mark to indicate that it is optional. You can then make null-safe method calls on such a type. But again, I’m not quite sure if these things pull their own weight… With a good API, you hardly ever run into a NullPointerException.
I will be looking forward to reading your article on Ceylon. I had been hearing about it, but haven’t had time to look into it in detail, so it’ll be great to get some perspective on the subject.
Also it will definitely be interesting to write an article on multimethods. That’s a great suggestion and I will definitely look into that in the coming days.
Regarding the use of Optional I think it even predates Scala (2003). In functional languages like ML (1973) there is no way to express the concept of null, instead, one must use a datatype constructor that represents the absence of a value. In SML this is called an Option, whose value could be SOME ‘a or NONE. I believe Haskell (1991) has something similar with a datatype called Maybe, which has two constructors Just ‘a or Nothing.
I have also read of a similar idiom in a book named Refactoring to Patterns by Joshua Kerievsky, where he suggests a pattern that he calls the Null Object pattern. Not exactly the same as that from the functional world, but quite similar in its final purpose.
But I think you’re right, even with the new Optional object we still have to check whether it contains something or not. It does not prevent writing the check, quite the contrary, it enforces it, and I think this is what allegedly should prevent the NPE, but at the cost of having to write these checks over and over again (I assume this is what you meant when you mentioned the boilerplate code). Perhaps in the future we can get some syntactic sugar that allows us to write the check using a ? (question mark) and the compiler can expand it to check whether the optional contains something or not. That’d be great.
I see, you’ve done your research about “Optional” :-) It looks like this has been an eternal debate between language designers…
I like SQL’s NULL, which is an actual “unknown” instance of the target type. I.e. a CAST(NULL AS INTEGER) is really an INTEGER instance, not the absence of such an instance. This is similar to Java’s Double.NaN. On the other hand, having an “unknown” instance for every type makes things quite complicated, specifically for boolean algebra, as all truth tables now have to deal with TRUE, FALSE, and UNKNOWN.
Yes. We’ll be writing stuff like:
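Presumably something along these lines (a small sketch of the Java 8 Optional API):

```java
import java.util.Optional;

public class OptionalSketch {
    public static void main(String[] args) {
        Optional<String> present = Optional.of("value");
        Optional<String> absent = Optional.empty();

        // Provide a fallback instead of null-checking:
        System.out.println(present.orElse("fallback")); // value
        System.out.println(absent.orElse("fallback"));  // fallback

        // Or turn absence into an exception explicitly:
        try {
            absent.orElseThrow(() -> new IllegalStateException("no value"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // no value
        }
    }
}
```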
It can be useful if the expert group adds enough convenience to the Optional type. Right now, it looks pretty scarce. But being able to do things like orElse() or orElseThrow() doesn’t look so wrong in some circumstances.
I’m sure that was one of the driving forces for actually adding that type. Another driving force was certainly the projected (but postponed) value type improvement, which should bring primitives and wrappers closer together. Note the existence of OptionalDouble, OptionalInt, and OptionalLong. Having a “standard” API for reference and primitive types may certainly help in the future.
Nice post! I would argue that every developer who is designing a Java library should have read “Effective Java” from Joshua Bloch before. It is a seminal book which covers these topics in detail. If all Sun/Oracle developers had read it, then the Java APIs would be a much nicer tool to use.
Thanks! I agree that reading “Effective Java” or also Erich Gamma’s “Design Patterns” is a good thing. However, it takes time and experience to really appreciate their assembled knowledge and findings, and to truly understand the rationale behind it. It is always easier to criticise and recognise anti-patterns than to put knowledge into action. As far as I’m concerned, I feel like I’m still not quite there yet…
… forgot to mention: another thing that really worries me time and again is the unjustified success of Spring, whose popularity makes it really hard for Java newbies to assess what a good API really is.
Spring had filled a gap, yes. But it filled it with mediocrity, in my opinion.