Java 8 Friday: 10 Subtle Mistakes When Using the Streams API

At Data Geekery, we love Java. And as we’re really into jOOQ’s fluent API and query DSL, we’re absolutely thrilled about what Java 8 will bring to our ecosystem.

Java 8 Friday

Every Friday, we’re showing you a couple of nice new tutorial-style Java 8 features, which take advantage of lambda expressions, extension methods, and other great stuff. You’ll find the source code on GitHub.

10 Subtle Mistakes When Using the Streams API

We’ve done all the SQL mistakes lists:

But we haven’t done a top 10 mistakes list with Java 8 yet! For today’s occasion (it’s Friday the 13th), we’ll catch up with what will go wrong in YOUR application when you’re working with Java 8. (it won’t happen to us, as we’re stuck with Java 6 for another while)

1. Accidentally reusing streams

Wanna bet, this will happen to everyone at least once. Like the existing “streams” (e.g. InputStream), you can consume streams only once. The following code won’t work:

IntStream stream = IntStream.of(1, 2);
stream.forEach(System.out::println);

// That was fun! Let's do it again!
stream.forEach(System.out::println);

You’ll get a

java.lang.IllegalStateException: 
  stream has already been operated upon or closed

So be careful when consuming your stream. It can be done only once

2. Accidentally creating “infinite” streams

You can create infinite streams quite easily without noticing. Take the following example:

// Will run indefinitely
IntStream.iterate(0, i -> i + 1)
         .forEach(System.out::println);

The whole point of streams is the fact that they can be infinite, if you design them to be. The only problem is, that you might not have wanted that. So, be sure to always put proper limits:

// That's better
IntStream.iterate(0, i -> i + 1)
         .limit(10)
         .forEach(System.out::println);

3. Accidentally creating “subtle” infinite streams

We can’t say this enough. You WILL eventually create an infinite stream, accidentally. Take the following stream, for instance:

IntStream.iterate(0, i -> ( i + 1 ) % 2)
         .distinct()
         .limit(10)
         .forEach(System.out::println);

So…

  • we generate alternating 0’s and 1’s
  • then we keep only distinct values, i.e. a single 0 and a single 1
  • then we limit the stream to a size of 10
  • then we consume it

Well… the distinct() operation doesn’t know that the function supplied to the iterate() method will produce only two distinct values. It might expect more than that. So it’ll forever consume new values from the stream, and the limit(10) will never be reached. Tough luck, your application stalls.

4. Accidentally creating “subtle” parallel infinite streams

We really need to insist that you might accidentally try to consume an infinite stream. Let’s assume you believe that the distinct() operation should be performed in parallel. You might be writing this:

IntStream.iterate(0, i -> ( i + 1 ) % 2)
         .parallel()
         .distinct()
         .limit(10)
         .forEach(System.out::println);

Now, we’ve already seen that this will turn forever. But previously, at least, you only consumed one CPU on your machine. Now, you’ll probably consume four of them, potentially occupying pretty much all of your system with an accidental infinite stream consumption. That’s pretty bad. You can probably hard-reboot your server / development machine after that. Have a last look at what my laptop looked like prior to exploding:

If I were a laptop, this is how I'd like to go.
If I were a laptop, this is how I’d like to go.

5. Mixing up the order of operations

So, why did we insist on your definitely accidentally creating infinite streams? It’s simple. Because you may just accidentally do it. The above stream can be perfectly consumed if you switch the order of limit() and distinct():

IntStream.iterate(0, i -> ( i + 1 ) % 2)
         .limit(10)
         .distinct()
         .forEach(System.out::println);

This now yields:

0
1

Why? Because we first limit the infinite stream to 10 values (0 1 0 1 0 1 0 1 0 1), before we reduce the limited stream to the distinct values contained in it (0 1).

Of course, this may no longer be semantically correct, because you really wanted the first 10 distinct values from a set of data (you just happened to have “forgotten” that the data is infinite). No one really wants 10 random values, and only then reduce them to be distinct.

If you’re coming from a SQL background, you might not expect such differences. Take SQL Server 2012, for instance. The following two SQL statements are the same:

-- Using TOP
SELECT DISTINCT TOP 10 *
FROM i
ORDER BY ..

-- Using FETCH
SELECT *
FROM i
ORDER BY ..
OFFSET 0 ROWS
FETCH NEXT 10 ROWS ONLY

So, as a SQL person, you might not be as aware of the importance of the order of streams operations.

jOOQ, the best way to write SQL in Java

6. Mixing up the order of operations (again)

Speaking of SQL, if you’re a MySQL or PostgreSQL person, you might be used to the LIMIT .. OFFSET clause. SQL is full of subtle quirks, and this is one of them. The OFFSET clause is applied FIRST, as suggested in SQL Server 2012’s (i.e. the SQL:2008 standard’s) syntax.

If you translate MySQL / PostgreSQL’s dialect directly to streams, you’ll probably get it wrong:

IntStream.iterate(0, i -> i + 1)
         .limit(10) // LIMIT
         .skip(5)   // OFFSET
         .forEach(System.out::println);

The above yields

5
6
7
8
9

Yes. It doesn’t continue after 9, because the limit() is now applied first, producing (0 1 2 3 4 5 6 7 8 9). skip() is applied after, reducing the stream to (5 6 7 8 9). Not what you may have intended.

BEWARE of the LIMIT .. OFFSET vs. "OFFSET .. LIMIT" trap!

7. Walking the file system with filters

We’ve blogged about this before. What appears to be a good idea is to walk the file system using filters:

Files.walk(Paths.get("."))
     .filter(p -> !p.toFile().getName().startsWith("."))
     .forEach(System.out::println);

The above stream appears to be walking only through non-hidden directories, i.e. directories that do not start with a dot. Unfortunately, you’ve again made mistake #5 and #6. walk() has already produced the whole stream of subdirectories of the current directory. Lazily, though, but logically containing all sub-paths. Now, the filter will correctly filter out paths whose names start with a dot “.”. E.g. .git or .idea will not be part of the resulting stream. But these paths will be: .\.git\refs, or .\.idea\libraries. Not what you intended.

Now, don’t fix this by writing the following:

Files.walk(Paths.get("."))
     .filter(p -> !p.toString().contains(File.separator + "."))
     .forEach(System.out::println);

While that will produce the correct output, it will still do so by traversing the complete directory subtree, recursing into all subdirectories of “hidden” directories.

I guess you’ll have to resort to good old JDK 1.0 File.list() again. The good news is, FilenameFilter and FileFilter are both functional interfaces.

8. Modifying the backing collection of a stream

While you’re iterating a List, you must not modify that same list in the iteration body. That was true before Java 8, but it might become more tricky with Java 8 streams. Consider the following list from 0..9:

// Of course, we create this list using streams:
List<Integer> list = 
IntStream.range(0, 10)
         .boxed()
         .collect(toCollection(ArrayList::new));

Now, let’s assume that we want to remove each element while consuming it:

list.stream()
    // remove(Object), not remove(int)!
    .peek(list::remove)
    .forEach(System.out::println);

Interestingly enough, this will work for some of the elements! The output you might get is this one:

0
2
4
6
8
null
null
null
null
null
java.util.ConcurrentModificationException

If we introspect the list after catching that exception, there’s a funny finding. We’ll get:

[1, 3, 5, 7, 9]

Heh, it “worked” for all the odd numbers. Is this a bug? No, it looks like a feature. If you’re delving into the JDK code, you’ll find this comment in ArrayList.ArraListSpliterator:

/*
 * If ArrayLists were immutable, or structurally immutable (no
 * adds, removes, etc), we could implement their spliterators
 * with Arrays.spliterator. Instead we detect as much
 * interference during traversal as practical without
 * sacrificing much performance. We rely primarily on
 * modCounts. These are not guaranteed to detect concurrency
 * violations, and are sometimes overly conservative about
 * within-thread interference, but detect enough problems to
 * be worthwhile in practice. To carry this out, we (1) lazily
 * initialize fence and expectedModCount until the latest
 * point that we need to commit to the state we are checking
 * against; thus improving precision.  (This doesn't apply to
 * SubLists, that create spliterators with current non-lazy
 * values).  (2) We perform only a single
 * ConcurrentModificationException check at the end of forEach
 * (the most performance-sensitive method). When using forEach
 * (as opposed to iterators), we can normally only detect
 * interference after actions, not before. Further
 * CME-triggering checks apply to all other possible
 * violations of assumptions for example null or too-small
 * elementData array given its size(), that could only have
 * occurred due to interference.  This allows the inner loop
 * of forEach to run without any further checks, and
 * simplifies lambda-resolution. While this does entail a
 * number of checks, note that in the common case of
 * list.stream().forEach(a), no checks or other computation
 * occur anywhere other than inside forEach itself.  The other
 * less-often-used methods cannot take advantage of most of
 * these streamlinings.
 */

Now, check out what happens when we tell the stream to produce sorted() results:

list.stream()
    .sorted()
    .peek(list::remove)
    .forEach(System.out::println);

This will now produce the following, “expected” output

0
1
2
3
4
5
6
7
8
9

And the list after stream consumption? It is empty:

[]

So, all elements are consumed, and removed correctly. The sorted() operation is a “stateful intermediate operation”, which means that subsequent operations no longer operate on the backing collection, but on an internal state. It is now “safe” to remove elements from the list!

Well… can we really? Let’s proceed with parallel(), sorted() removal:

list.stream()
    .sorted()
    .parallel()
    .peek(list::remove)
    .forEach(System.out::println);

This now yields:

7
6
2
5
8
4
1
0
9
3

And the list contains

[8]

Eek. We didn’t remove all elements!? Free beers (and jOOQ stickers) go to anyone who solves this streams puzzler!

This all appears quite random and subtle, we can only suggest that you never actually do modify a backing collection while consuming a stream. It just doesn’t work.

9. Forgetting to actually consume the stream

What do you think the following stream does?

IntStream.range(1, 5)
         .peek(System.out::println)
         .peek(i -> { 
              if (i == 5) 
                  throw new RuntimeException("bang");
          });

When you read this, you might think that it will print (1 2 3 4 5) and then throw an exception. But that’s not correct. It won’t do anything. The stream just sits there, never having been consumed.

As with any fluent API or DSL, you might actually forget to call the “terminal” operation. This might be particularly true when you use peek(), as peek() is an aweful lot similar to forEach().

This can happen with jOOQ just the same, when you forget to call execute() or fetch():

DSL.using(configuration)
   .update(TABLE)
   .set(TABLE.COL1, 1)
   .set(TABLE.COL2, "abc")
   .where(TABLE.ID.eq(3));

Oops. No execute()

jOOQ, the best way to write SQL in Java

Yes, the “best” way – with 1-2 caveats ;-)

10. Parallel stream deadlock

This is now a real goodie for the end!

All concurrent systems can run into deadlocks, if you don’t properly synchronise things. While finding a real-world example isn’t obvious, finding a forced example is. The following parallel() stream is guaranteed to run into a deadlock:

Object[] locks = { new Object(), new Object() };

IntStream
    .range(1, 5)
    .parallel()
    .peek(Unchecked.intConsumer(i -> {
        synchronized (locks[i % locks.length]) {
            Thread.sleep(100);

            synchronized (locks[(i + 1) % locks.length]) {
                Thread.sleep(50);
            }
        }
    }))
    .forEach(System.out::println);

Note the use of Unchecked.intConsumer(), which transforms the functional IntConsumer interface into a org.jooq.lambda.fi.util.function.CheckedIntConsumer, which is allowed to throw checked exceptions.

Well. Tough luck for your machine. Those threads will be blocked forever :-)

The good news is, it has never been easier to produce a schoolbook example of a deadlock in Java!

For more details, see also Brian Goetz’s answer to this question on Stack Overflow.

Conclusion

With streams and functional thinking, we’ll run into a massive amount of new, subtle bugs. Few of these bugs can be prevented, except through practice and staying focused. You have to think about how to order your operations. You have to think about whether your streams may be infinite.

Streams (and lambdas) are a very powerful tool. But a tool which we need to get a hang of, first.

Stay tuned for more exciting Java 8 articles on this blog.

Three-State Booleans in Java

Every now and then, I miss SQL’s three-valued BOOLEAN semantics in Java. In SQL, we have:

  • TRUE
  • FALSE
  • UNKNOWN (also known as NULL)

Every now and then, I find myself in a situation where I wish I could also express this UNKNOWN or UNINITIALISED semantics in Java, when plain true and false aren’t enough.

Implementing a ResultSetIterator

For instance, when implementing a ResultSetIterator for jOOλ, a simple library modelling SQL streams for Java 8:

SQL.stream(stmt, Unchecked.function(r ->
    new SQLGoodies.Schema(
        r.getString("FIELD_1"),
        r.getBoolean("FIELD_2")
    )
))
.forEach(System.out::println);

In order to implement a Java 8 Stream, we need to construct an Iterator, which we can then pass to the new Spliterators.spliteratorUnknownSize() method:

StreamSupport.stream(
  Spliterators.spliteratorUnknownSize(iterator, 0), 
  false
);

Another example for this can be seen here on Stack Overflow.

When implementing the Iterator interface, we must implement hasNext() and next(). Note that with Java 8, remove() now has a default implementation, so we don’t need to implement it any longer.

While most of the time, a call to next() is preceded by a call to hasNext() exactly once, nothing in the Iterator contract requires this. It is perfectly fine to write:

if (it.hasNext()) {
    // Some stuff

    // Double-check again to be sure
    if (it.hasNext() && it.hasNext()) {

        // Yes, we're paranoid
        if (it.hasNext())
            it.next();
    }
}

How to translate the Iterator calls to backing calls on the JDBC ResultSet? We need to call ResultSet.next().

We could make the following translation:

  • Iterator.hasNext() == !ResultSet.isLast()
  • Iterator.next() == ResultSet.next()

But that translation is:

  • Expensive
  • Not dealing correctly with empty ResultSets
  • Not implemented in all JDBC drivers (Support for the isLast method is optional for ResultSets with a result set type of TYPE_FORWARD_ONLY)

So, we’ll have to maintain a flag, internally, that tells us:

  • If we had already called ResultSet.next()
  • What the result of that call was

Instead of creating a second variable, why not just use a three-valued java.lang.Boolean. Here’s a possible implementation from jOOλ:

class ResultSetIterator<T> implements Iterator<T> {

    final Supplier<? extends ResultSet>  supplier;
    final Function<ResultSet, T>         rowFunction;
    final Consumer<? super SQLException> translator;

    /**
     * Whether the underlying {@link ResultSet} has
     * a next row. This boolean has three states:
     * <ul>
     * <li>null:  it's not known whether there 
     *            is a next row</li>
     * <li>true:  there is a next row, and it
     *            has been pre-fetched</li>
     * <li>false: there aren't any next rows</li>
     * </ul>
     */
    Boolean hasNext;
    ResultSet rs;

    ResultSetIterator(
        Supplier<? extends ResultSet> supplier, 
        Function<ResultSet, T> rowFunction, 
        Consumer<? super SQLException> translator
    ) {
        this.supplier = supplier;
        this.rowFunction = rowFunction;
        this.translator = translator;
    }

    private ResultSet rs() {
        return (rs == null) 
             ? (rs = supplier.get()) 
             :  rs;
    }

    @Override
    public boolean hasNext() {
        try {
            if (hasNext == null) {
                hasNext = rs().next();
            }

            return hasNext;
        }
        catch (SQLException e) {
            translator.accept(e);
            throw new IllegalStateException(e);
        }
    }

    @Override
    public T next() {
        try {
            if (hasNext == null) {
                rs().next();
            }

            return rowFunction.apply(rs());
        }
        catch (SQLException e) {
            translator.accept(e);
            throw new IllegalStateException(e);
        }
        finally {
            hasNext = null;
        }
    }
}

As you can see, the hasNext() method locally caches the hasNext three-valued boolean state only if it was null before. This means that calling hasNext() several times will have no effect until you call next(), which resets the hasNext cached state.

Both hasNext() and next() advance the ResultSet cursor if needed.

Readability?

Some of you may argue that this doesn’t help readability. They’d introduce a new variable like:

boolean hasNext;
boolean hasHasNextBeenCalled;

The trouble with this is the fact that you’re still implementing three-valued boolean state, but distributed to two variables, which are very hard to name in a way that is truly more readable than the actual java.lang.Boolean solution. Besides, there are actually four state values for two boolean variables, so there is a slight increase in the risk of bugs.

Every rule has its exception. Using null for the above semantics is a very good exception to the null-is-bad histeria that has been going on ever since the introduction of Option / Optional

In other words: Which approach is best? There’s no TRUE or FALSE answer, only UNKNOWN ;-)

Be careful with this

However, as we’ve discussed in a previous blog post, you should avoid returning null from API methods if possible. In this case, using null explicitly as a means to model state is fine because this model is encapsulated in our ResultSetIterator. But try to avoid leaking such state to the outside of your API.

Java 8 Friday: Optional Will Remain an Option in Java

At Data Geekery, we love Java. And as we’re really into jOOQ’s fluent API and query DSL, we’re absolutely thrilled about what Java 8 will bring to our ecosystem.

Java 8 Friday

Every Friday, we’re showing you a couple of nice new tutorial-style Java 8 features, which take advantage of lambda expressions, extension methods, and other great stuff. You’ll find the source code on GitHub.

Optional: A new Option in Java

So far, we’ve been pretty thrilled with all the additions to Java 8. All in all, this is a revolution more than anything before. But there are also one or two sore spots. One of them is how Java will never really get rid of

Null: The billion dollar mistake tweet this

In a previous blog post, we have explained the merits of NULL handling in the Ceylon language, which has found one of the best solutions to tackle this issue – at least on the JVM which is doomed to support the null pointer forever. In Ceylon, nullability is a flag that can be added to every type by appending a question mark to the type name. An example:

void hello() {
    String? name = process.arguments.first;
    String greeting;
    if (exists name) {
        greeting = "Hello, ``name``!";
    }
    else {
        greeting = "Hello, World!";
    }
    print(greeting);
}

That’s pretty slick. Combined with flow-sensitive typing, you will never run into the dreaded NullPointerException again:

Recently in the Operating Room. By Geek and Poke
Recently in the Operating Room. By Geek and Poke

Other languages have introduced the Option type. Most prominently: Scala. Java 8 now also introduced the Optional type (as well as the OptionalInt, OptionalLong, OptionalDouble types – more about those later on)

How does Optional work?

The main point behind Optional is to wrap an Object and to provide convenience API to handle nullability in a fluent manner. This goes well with Java 8 lambda expressions, which allow for lazy execution of operations. An example:

Optional<String> stringOrNot = Optional.of("123");

// This String reference will never be null
String alwaysAString =
    stringOrNot.orElse("");

// This Integer reference will be wrapped again
Optional<Integer> integerOrNot = 
    stringOrNot.map(Integer::parseInt);

// This int reference will never be null
int alwaysAnInt = stringOrNot
        .map(s -> Integer.parseInt(s))
        .orElse(0);

There are certain merits to the above in fluent APIs, specifically in the new Java 8 Streams API, which makes extensive use of Optional. For example:

Arrays.asList(1, 2, 3)
      .stream()
      .findAny()
      .ifPresent(System.out::println);

The above piece of code will print any number from the Stream onto the console, but only if such a number exists.

Old API is not retrofitted

For obvious backwards-compatibility reasons, the “old API” is not retrofitted. In other words, unlike Scala, Java 8 doesn’t use Optional all over the JDK. In fact, the only place where Optional is used is in the Streams API. As you can see in the Javadoc, usage is very scarce:

http://docs.oracle.com/javase/8/docs/api/java/util/class-use/Optional.html

This makes Optional a bit difficult to use. We’ve already blogged about this topic before. Concretely, the absence of an Optional type in the API is no guarantee of non-nullability. This is particularly nasty if you convert Streams into collections and collections into streams.

The Java 8 Optional type is treacherous tweet this

Parametric polymorphism

The worst implication of Optional on its “infected” API is parametric polymorphism, or simply: generics. When you reason about types, you will quickly understand that:

// This is a reference to a simple type:
Number s;

// This is a reference to a collection of
// the above simple type:
Collection<Number> c;

Generics are often used for what is generally accepted as composition. We have a Collection of String. With Optional, this compositional semantics is slightly abused (both in Scala and Java) to “wrap” a potentially nullable value. We now have:

// This is a reference to a nullable simple type:
Optional<Number> s;

// This is a reference to a collection of 
// possibly nullable simple types
Collection<Optional<Number>> c;

So far so good. We can substitute types to get the following:

// This is a reference to a simple type:
T s;

// This is a reference to a collection of
// the above simple type:
Collection<T> c;

But now enter wildcards and use-site variance. We can write

// No variance can be applied to simple types:
T s;

// Variance can be applied to collections of
// simple types:
Collection<? extends T> source;
Collection<? super T> target;

What do the above types mean in the context of Optional? Intuitively, we would like this to be about things like Optional<? extends Number> or Optional<? super Number>. In the above example we can write:

// Read a T-value from the source
T s = source.iterator().next();

// ... and put it into the target
target.add(s);

But this doesn’t work any longer with Optional

Collection<Optional<? extends T>> source;
Collection<Optional<? super T>> target;

// Read a value from the source
Optional<? extends T> s = source.iterator().next();

// ... cannot put it into the target
target.add(s); // Nope

… and there is no other way to reason about use-site variance when we have Optional and subtly more complex API.

If you add generic type erasure to the discussion, things get even worse. We no longer erase the component type of the above Collection, we also erase the type of virtually any reference. From a runtime / reflection perspective, this is almost like using Object all over the place!

Generic type systems are incredibly complex even for simple use-cases. Optional makes things only worse. It is quite hard to blend Optional with traditional collections API or other APIs. Compared to the ease of use of Ceylon’s flow-sensitive typing, or even Groovy’s elvis operator, Optional is like a sledge-hammer in your face.

Be careful when you apply it to your API!

Primitive types

One of the main reasons why Optional is still a very useful addition is the fact that the “object-stream” and the “primitive streams” have a “unified API” by the fact that we also have OptionalInt, OptionalLong, OptionalDouble types.

In other words, if you’re operating on primitive types, you can just switch the stream construction and reuse the rest of your stream API usage source code, in almost the same way. Compare these two chains:

// Stream and Optional
Optional<Integer> anyInteger = 
Arrays.asList(1, 2, 3)
      .stream()
      .filter(i -> i % 2 == 0)
      .findAny();
anyInteger.ifPresent(System.out::println);

// IntStream and OptionalInt
OptionalInt anyInt =
Arrays.stream(new int[] {1, 2, 3})
      .filter(i -> i % 2 == 0)
      .findAny();
anyInt.ifPresent(System.out::println);

In other words, given the scarce usage of these new types in JDK API, the dubious usefulness of such a type in general (if retrofitted into a very backwards-compatible environment) and the implications generics erasure have on Optional we dare say that

The only reason why this type was really added is to provide a more unified Streams API for both reference and primitive types tweet this

That’s tough. And makes us wonder, if we should finally get rid of primitive types altogether.

Oh, and…

Optional isn’t Serializable.

Nope. Not Serializable. Unlike ArrayList, for instance. For the usual reason:

Making something in the JDK serializable makes a dramatic increase in our maintenance costs, because it means that the representation is frozen for all time. This constrains our ability to evolve implementations in the future, and the number of cases where we are unable to easily fix a bug or provide an enhancement, which would otherwise be simple, is enormous. So, while it may look like a simple matter of “implements Serializable” to you, it is more than that. The amount of effort consumed by working around an earlier choice to make something serializable is staggering.

Citing Brian Goetz, from:

http://mail.openjdk.java.net/pipermail/jdk8-dev/2013-September/003276.html

Want to discuss Optional? Read these threads on reddit:

Stay tuned for more exciting Java 8 stuff published in this blog series.

More on Java 8

In the mean time, have a look at Eugen Paraschiv’s awesome Java 8 resources page

Java 8 Friday Goodies: SQL ResultSet Streams

At Data Geekery, we love Java. And as we’re really into jOOQ’s fluent API and query DSL, we’re absolutely thrilled about what Java 8 will bring to our ecosystem. We have blogged a couple of times about some nice Java 8 goodies, and now we feel it’s time to start a new blog series, the…

Java 8 Friday

Every Friday, we’re showing you a couple of nice new tutorial-style Java 8 features, which take advantage of lambda expressions, extension methods, and other great stuff. You’ll find the source code on GitHub.

Java 8 Goodie: SQL ResultSet Streams

Yes, the SQL subject must be dealt with again. Even if last week, we promised an article on concurrency, there is one very important aspect of Java 8 lambdas and interoperability with “legacy” APIs that we need to talk about, first.

Checked Exceptions

Yes. Unfortunately, those beasts from the past still haunt us, more than ever when we’re using Java 8’s lambda expressions. Already before Java 8’s release, there are a couple of Stack Overflow questions related to the subject.

Let’s remember how the IOExceptions caused issues when traversing the file system. Unless you write your own utility, you’ll have to resort to this beauty:

Arrays.stream(dir.listFiles()).forEach(file -> {
    try {
        System.out.println(file.getCanonicalPath());
    }
    catch (IOException e) {
        throw new RuntimeException(e);
    }

    // Ouch, my fingers hurt! All this typing!
});

We think it is safe to say:

Java 8 and checked exceptions don’t match.tweet this

A workaround is to write your own CheckedConsumer that wraps the checked exception. Such a consumer will be highly reusable, but… Did you think of all the other FunctionalInterfaces? There are quite a few of them in the java.util.function package:

Some of the many types in java.util.function
Some of the many types in java.util.function

jOOλ – Fixing lambda in Java 8

jool-logo-blackWhile writing this Java 8 blog series, we’ve constantly run into the need to wrap checked exceptions inside lambda expressions. And what do we geeks do when we frequently run into a problem? We fix it! And we have created jOOλ (also jOOL, jOO-Lambda), ASL 2.0 licensed, where we have duplicated pretty much every FunctionalInterface that is available from the JDK to support checked exceptions. Here’s how you would use jOOλ in the above example:

Arrays.stream(dir.listFiles()).forEach(
    Unchecked.consumer(file -> {
        // Throw all sorts of checked exceptions
        // here, we don't care...
        System.out.println(file.getCanonicalPath());
    })
);

The above example shows how you can simply ignore and pass checked exceptions as RuntimeExceptions. If you actually want to handle them, you can pass an exception handler lambda:

Arrays.stream(dir.listFiles())
      .forEach(Unchecked.consumer(

    file -> {
        System.out.println(file.getCanonicalPath());
    },
    e -> {
        log.info("Log stuff here", e);
        throw new MyRuntimeException(e);
    }
);

The second example now seems equally verbose, but don’t worry. You will probably reuse that exception handler and fall back to this:

Arrays.stream(dir.listFiles())
      .forEach(Unchecked.consumer(
    file -> {
        System.out.println(file.getCanonicalPath());
    },
    myExceptionHandler
);

jOOλ – Providing JDBC ResultSet Streams

Unfortunately, most efforts in the Java 8 Streams API were made in the area of correctly implementing parallelisable streams. While this is very useful for those of us actually doing parallel computing, for most others better integration with legacy APIs would have been better. One API that seriously deserves some lifting is JDBC, and we’ve blogged about this before. With jOOλ, you can now generate Streams directly from ResultSets or even from PreparedStatements. Here’s how you prepare:

Class.forName("org.h2.Driver");
try (Connection c = getConnection()) {
    String sql = "select schema_name, is_default " +
                 "from information_schema.schemata " +
                 "order by schema_name";

    try (PreparedStatement stmt = c.prepareStatement(sql)) {
        // code here
    }
}

Now, all you have to do when using jOOλ is stream your PreparedStatements as such:

SQL.stream(stmt, Unchecked.function(rs ->
    new SQLGoodies.Schema(
        rs.getString("SCHEMA_NAME"),
        rs.getBoolean("IS_DEFAULT")
    )
))
.forEach(System.out::println);

Where SQLGoodies.Schema is just an ordinary POJO. Some of the stream() method’s signatures are these ones:

public static <T> Stream<T> stream(
    PreparedStatement stmt,
    Function<ResultSet, T> rowFunction
);

public static <T> Stream<T> stream(
    PreparedStatement stmt,
    Function<ResultSet, T> rowFunction,
    Consumer<? super SQLException> exceptionHandler
);

Others are available as well.

That is awesome, isn’t it?

JDBC ResultSets should be Java 8 Streams.tweet this

Too bad, the above code didn’t make it into the JDK 8, as this would have been a chance to finally greatly improve on the JDBC API. Another, similar attempt at improving things has been done here by Julian Exenberger.

Java 8 alternatives of writing SQL

We’ve also published a couple of alternatives to jOOλ, using Java 8 with SQL here:

http://www.jooq.org/java-8-and-sql

Conclusion

While Java 8’s lambda expressions are awesome, the new Streams API is pretty incomplete. When implementing the above, we had to implement our own ResultSetIterator, and write all this mess to wrap the iterator in a Stream:

StreamSupport.stream(
    Spliterators.spliteratorUnknownSize(
        new ResultSetIterator<>(
            supplier, 
            rowFunction, 
            exceptionTranslator
        ), 0
    ), false
);

And it shouldn’t be necessary to write an Iterator in the first place, if only we were able to generate finite streams:

// Unfortunately, this method doesn't exist
Stream.generate(
    // Supplier, generating new POJOs
    () -> { 
        rs.next(); 
        return new SQLGoodies.Schema(
            rs.getString("SCHEMA_NAME"),
            rs.getBoolean("IS_DEFAULT")
        );
    },

    // Predicate, terminating the Stream
    () -> { !rs.isLast(); }
);

While jOOλ is an acceptable intermediate solution, and the Guava guys are probably already working out how to fix their library, it is really too bad, that Java 8 is lacking such utility functionality.

But we’re complaining on a high level. Next week, as promised, we’ll see a couple of examples related to concurrency, so stay tuned!

More on Java 8

In the mean time, have a look at Eugen Paraschiv’s awesome Java 8 resources page

Java Streams Preview vs .Net LINQ

I’ve started following this very promising blog by the “Geeks From Paradise”. Apart from the fact that I’m a bit envious of geeks living in Costa Rica, this comparison of the upcoming Java 8 Streams API with various of .NET’s LINQ API capabilities is a very interesting read.

A preview of what you’ll find there (just one of 19 examples):

LINQ

List<string> nameList1 = new List(){ 
  "Anders", "David", "James",
  "Jeff", "Joe", "Erik" };
nameList1.Select(c => "Hello! " + c).ToList()
         .ForEach(c => Console.WriteLine(c));

Java Streams

List<String> nameList1 = asList(
  "Anders", "David", "James",
  "Jeff", "Joe", "Erik");
nameList1.stream()
     .map(c -> "Hello! " + c)
     .forEach(System.out::println);

Read the full blog post here:
http://blog.informatech.cr/2013/03/24/java-streams-preview-vs-net-linq/

Exciting ideas in Java 8: Streams

Brian Goetz’s recent post on the State of the Lambda reveils exciting new ideas that are prone to be included in Java 8. One of them is the concept of “Streams” as opposed to “Collections”. Using the new Java 8 extension methods, the Iterable interface can be extended compatibly with a lot of “lazy” and “eager” methods for streaming behaviour:

public interface Iterable<T> {
    // Abstract methods
    Iterator<T> iterator();

    // Lazy operations
    Iterable<T> filter(Predicate<? super T> predicate) default ...
    <U> Iterable<U> map(Mapper<? super T, ? extends U> mapper) default ...
    <U> Iterable<U> flatMap(
        Mapper<? super T, ? extends Iterable<U>> mapper) default ...
    Iterable<T> cumulate(BinaryOperator<T> op) default ...

    Iterable<T> sorted(
        Comparator<? super T> comparator) default ...
    <U extends Comparable<? super U>> Iterable<T> sortedBy(
        Mapper<? super T, U> extractor) default ...
    Iterable<T> uniqueElements() default ...

    <U> Iterable<U> pipeline(
        Mapper<Iterable<T>, ? extends Iterable<U>> mapper) default ...
    <U> BiStream<T, U> mapped(
        Mapper<? super T, ? extends U> mapper) default ...
    <U> BiStream<U, Iterable<T>> groupBy(
        Mapper<? super T, ? extends U> mapper) default ...
    <U> BiStream<U, Iterable<T>> groupByMulti(
        Mapper<? super T, ? extends Iterable<U>> mapper) default ...

    // Eager operations
    boolean isEmpty() default ...;
    long count() default ...
    T getFirst() default ...
    T getOnly() default ...
    T getAny() default ...

    void forEach(Block<? super T> block) default ...
    T reduce(T base, BinaryOperator<T> reducer) default ...
    <A extends Fillable<? super T>> A into(A target) default ...

    boolean anyMatch(Predicate<? super T> filter) default ...
    boolean noneMatch(Predicate<? super T> filter) default ...
    boolean allMatch(Predicate<? super T> filter) default ...

    <U extends Comparable<? super U>> T maxBy(
        Mapper<? super T, U> extractor) default ...
    <U extends Comparable<? super U>> T minBy(
        Mapper<? super T, U> extractor) default ...
}

The above are rough and incomplete API ideas, or as Brian Goetz puts it: A “strawman writeup of the key concepts”.

The goal of this early publication is for the community to “TRY OUT THE CODE. *The single most valuable thing that the community can do to help move this effort forward, and improve the quality of the result, is to try out the code early and often.*”

Read the full article here:

http://cr.openjdk.java.net/~briangoetz/lambda/collections-overview.html