Every now and then, I miss SQL’s three-valued BOOLEAN semantics in Java. In SQL, we have:
TRUE
FALSE
UNKNOWN (also known as NULL)
Every now and then, I find myself in a situation where I wish I could also express these UNKNOWN or UNINITIALISED semantics in Java, when plain true and false aren’t enough.
Implementing a ResultSetIterator
For instance, when implementing a ResultSetIterator for jOOλ, a simple library modelling SQL streams for Java 8:
SQL.stream(stmt, Unchecked.function(r ->
    new SQLGoodies.Schema(
        r.getString("FIELD_1"),
        r.getBoolean("FIELD_2")
    )
))
.forEach(System.out::println);
In order to implement a Java 8 Stream, we need to construct an Iterator, which we can then pass to the new Spliterators.spliteratorUnknownSize() method:
StreamSupport.stream(
    Spliterators.spliteratorUnknownSize(iterator, 0),
    false
);
Another example for this can be seen here on Stack Overflow.
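To make the wrapping step concrete, here is a minimal, self-contained sketch that turns any Iterator into a sequential Stream. The class name IteratorStreams and its stream() method are made up for illustration; only Spliterators.spliteratorUnknownSize() and StreamSupport.stream() are JDK APIs. Passing Spliterator.ORDERED rather than 0 is an assumption about what a caller iterating over rows would want.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical helper, not part of jOOλ or the JDK
class IteratorStreams {

    // Wraps an Iterator of unknown size in a sequential Stream
    static <T> Stream<T> stream(Iterator<T> iterator) {
        return StreamSupport.stream(
            // ORDERED is an assumption: elements keep their encounter order
            Spliterators.spliteratorUnknownSize(iterator, Spliterator.ORDERED),
            false // sequential, not parallel
        );
    }

    public static void main(String[] args) {
        Iterator<String> it = Arrays.asList("a", "b", "c").iterator();

        List<String> result = stream(it)
            .map(String::toUpperCase)
            .collect(Collectors.toList());

        System.out.println(result); // prints [A, B, C]
    }
}
```

The stream pulls from the iterator lazily, so hasNext() and next() are only called once a terminal operation such as collect() runs.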
When implementing the Iterator interface, we must implement hasNext() and next(). Note that with Java 8, remove() now has a default implementation, so we don’t need to implement it any longer.
While most of the time, a call to next() is preceded by a call to hasNext() exactly once, nothing in the Iterator contract requires this. It is perfectly fine to write:
if (it.hasNext()) {
    // Some stuff
    // Double-check again to be sure
    if (it.hasNext() && it.hasNext()) {
        // Yes, we're paranoid
        if (it.hasNext())
            it.next();
    }
}
How do we translate the Iterator calls to backing calls on the JDBC ResultSet? We need to call ResultSet.next().
We could make the following translation:
Iterator.hasNext() == !ResultSet.isLast()
Iterator.next() == ResultSet.next()
But that translation is:
- Expensive
- Not dealing correctly with empty ResultSets
- Not implemented in all JDBC drivers (Support for the isLast method is optional for ResultSets with a result set type of TYPE_FORWARD_ONLY)
So, we’ll have to maintain a flag, internally, that tells us:
- If we had already called ResultSet.next()
- What the result of that call was
Instead of creating a second variable, why not just use a three-valued java.lang.Boolean? Here’s a possible implementation from jOOλ:
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

class ResultSetIterator<T> implements Iterator<T> {

    final Supplier<? extends ResultSet> supplier;
    final Function<ResultSet, T> rowFunction;
    final Consumer<? super SQLException> translator;

    /**
     * Whether the underlying {@link ResultSet} has
     * a next row. This boolean has three states:
     * <ul>
     * <li>null: it's not known whether there
     * is a next row</li>
     * <li>true: there is a next row, and it
     * has been pre-fetched</li>
     * <li>false: there aren't any next rows</li>
     * </ul>
     */
    Boolean hasNext;
    ResultSet rs;

    ResultSetIterator(
        Supplier<? extends ResultSet> supplier,
        Function<ResultSet, T> rowFunction,
        Consumer<? super SQLException> translator
    ) {
        this.supplier = supplier;
        this.rowFunction = rowFunction;
        this.translator = translator;
    }

    // Lazily obtains the ResultSet on first access
    private ResultSet rs() {
        return (rs == null)
             ? (rs = supplier.get())
             : rs;
    }

    @Override
    public boolean hasNext() {
        try {
            // Only advance the cursor if the state is unknown
            if (hasNext == null) {
                hasNext = rs().next();
            }

            return hasNext;
        }
        catch (SQLException e) {
            translator.accept(e);
            throw new IllegalStateException(e);
        }
    }

    @Override
    public T next() {
        try {
            // next() was called without a preceding hasNext()
            if (hasNext == null) {
                hasNext = rs().next();
            }

            // Honour the Iterator contract when no row is left
            if (!hasNext) {
                throw new NoSuchElementException();
            }

            return rowFunction.apply(rs());
        }
        catch (SQLException e) {
            translator.accept(e);
            throw new IllegalStateException(e);
        }
        finally {
            // The pre-fetched row is consumed: back to "unknown"
            hasNext = null;
        }
    }
}
As you can see, the hasNext() method locally caches the hasNext three-valued boolean state only if it was null before. This means that calling hasNext() several times will have no effect until you call next(), which resets the hasNext cached state.
Both hasNext() and next() advance the ResultSet cursor if needed.
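The caching behaviour can be observed without a database. The following sketch applies the same three-valued Boolean to a hypothetical Cursor interface, a stand-in for ResultSet in which advance() plays the role of ResultSet.next(). Cursor, CursorIterator, CountingCursor and CursorIteratorDemo are made-up names for illustration, not jOOλ or JDBC types.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a JDBC ResultSet: the cursor sits "on" a row
interface Cursor<T> {
    boolean advance(); // like ResultSet.next()
    T current();       // like reading the current row
}

// Same idea as the ResultSetIterator, minus the JDBC plumbing
class CursorIterator<T> implements Iterator<T> {
    private final Cursor<T> cursor;

    // null: unknown, true: a row has been pre-fetched, false: exhausted
    private Boolean hasNext;

    CursorIterator(Cursor<T> cursor) {
        this.cursor = cursor;
    }

    @Override
    public boolean hasNext() {
        if (hasNext == null) {
            hasNext = cursor.advance();
        }
        return hasNext;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        hasNext = null; // the pre-fetched row is consumed, state is unknown again
        return cursor.current();
    }
}

// Array-backed cursor that counts how often it was advanced
class CountingCursor implements Cursor<String> {
    private final String[] rows;
    private int index = -1;
    int advances;

    CountingCursor(String... rows) {
        this.rows = rows;
    }

    @Override
    public boolean advance() {
        advances++;
        // Short-circuit so the index never moves past the end position
        return index < rows.length && ++index < rows.length;
    }

    @Override
    public String current() {
        return rows[index];
    }
}

class CursorIteratorDemo {
    public static void main(String[] args) {
        CountingCursor cursor = new CountingCursor("a", "b");
        Iterator<String> it = new CursorIterator<>(cursor);

        it.hasNext();
        it.hasNext();
        it.hasNext();
        System.out.println(cursor.advances); // prints 1
        System.out.println(it.next());       // prints a
        System.out.println(it.next());       // prints b
        System.out.println(it.hasNext());    // prints false
    }
}
```

One deliberate difference in this sketch: once the cursor is exhausted, the false state is kept rather than reset, so the cursor is never advanced again after reporting that it has no more rows.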
Readability?
Some of you may argue that this doesn’t help readability, and would rather introduce a second variable, like this:
boolean hasNext;
boolean hasHasNextBeenCalled;
The trouble with this is that you’re still implementing three-valued boolean state, only spread across two variables, which are very hard to name in a way that is truly more readable than the actual java.lang.Boolean solution. Besides, two boolean variables actually have four possible states, so there is a slight increase in the risk of bugs.
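The four-versus-three mismatch can be spelled out in a few lines. This is only an illustrative sketch: the class TwoFlagStates and its methods are made up, and the mapping assumes that hasNext carries no meaning while hasHasNextBeenCalled is false.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the two-flag encoding
class TwoFlagStates {

    // Maps a pair of flags onto the three-valued Boolean they are meant
    // to encode. Returns null for UNKNOWN.
    static Boolean toThreeValued(boolean hasHasNextBeenCalled, boolean hasNext) {
        return hasHasNextBeenCalled ? hasNext : null;
    }

    // Lists all four flag combinations with the state each collapses to
    static List<String> describeAll() {
        List<String> result = new ArrayList<>();
        boolean[] values = { false, true };

        for (boolean called : values) {
            for (boolean hasNext : values) {
                result.add("called=" + called + ", hasNext=" + hasNext
                    + " -> " + toThreeValued(called, hasNext));
            }
        }

        return result;
    }

    public static void main(String[] args) {
        describeAll().forEach(System.out::println);
    }
}
```

Two of the four combinations collapse into the same UNKNOWN state, so every piece of code touching the flags has to remember to ignore hasNext until hasHasNextBeenCalled is true.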
Every rule has its exception. Using null for the above semantics is a very good exception to the null-is-bad hysteria that has been going on ever since the introduction of Option / Optional…
In other words: Which approach is best? There’s no TRUE or FALSE answer, only UNKNOWN ;-)
Be careful with this
However, as we’ve discussed in a previous blog post, you should avoid returning null from API methods if possible. In this case, using null explicitly as a means to model state is fine because this model is encapsulated in our ResultSetIterator. But try to avoid leaking such state to the outside of your API.
Interesting. I had previously run into this kind of problem, but I had not stated it so clearly.
I watched a talk by Erik Meijer recently, and he commented on the arguably wrong design of the Iterator protocol in Java. He compares it with .NET’s IEnumerator, in which the protocol is as follows:
IEnumerator.MoveNext(): bool -> returns a true if it successfully moved to the next element, false otherwise.
IEnumerator.Current -> points to the current element in the iterator.
That makes it simpler, because Current will always return the same thing until you try to move again. It also makes it simpler to test for nullability without having to resort to the side effects caused by MoveNext.
I guess we’ll have to use your solution instead, since this change is almost impossible now in the JDK.
Erik’s talk that I mentioned above:
http://www.infoq.com/presentations/covariance-contravariance-joy-of-coding-2014?utm_campaign=infoq_content&utm_source=infoq&utm_medium=feed&utm_term=global
Nice link. I had never thought about it this way. I had actually thought that the JDBC spec team had gotten it wrong, compared to java.util.Iterator. But you’re right. JDBC / .NET implemented better cursor semantics.
Interestingly, today I was learning the Dart programming language, created by Gilad Bracha, who was also one of the creators of the Java Language Specification. The language is heavily influenced by Java (e.g. syntax, collections, APIs, etc.). However, when it comes to iterators, we can see that Gilad got them right in Dart:
https://api.dartlang.org/apidocs/channels/stable/dartdoc-viewer/dart:core.Iterator
You will see these are like those in .Net.
Heh, I wasn’t aware of this connection between Dart and Java. It’s a small world, I guess. Thanks for pointing this out!
The main difference between a Java iterator and a ‘traditional’ iterator is this: the traditional iterator, as in the STL, is ‘on’ the element, whereas in Java it is ‘in between’ elements. A ResultSet is also ‘on’ a database row, not in between.
Optional.empty() is my third state.
Yes, your observation matches that of Edwin Dalorzo. In his comments, he also points out that this is one area where Java and .NET differ.
Guava’s or Java 8’s Optional would be a fair choice here. Too bad there is no OptionalBoolean in JDK 8 (similar to OptionalInt).