Watch Out For Recursion in Java 8’s [Primitive]Stream.iterate()


An interesting question by Tagir Valeev on Stack Overflow has recently caught my attention. To keep things short (read the question for details), while the following code works:

public static Stream<Long> longs() {
    return Stream.iterate(1L, i ->
        1L + longs().skip(i - 1L)
                    .findFirst()
                    .get());
}

longs().limit(5).forEach(System.out::println);

printing

1
2
3
4
5

The following, similar code won’t work:

public static LongStream longs() {
    return LongStream.iterate(1L, i ->
        1L + longs().skip(i - 1L)
                    .findFirst()
                    .getAsLong());
}

Causing a StackOverflowError.

Sure, this kind of recursive iteration is not optimal. It wasn’t prior to Java 8 and it certainly isn’t with the new APIs either. But one might think it should at least work, right? The reason why it doesn’t work is because of a subtle implementation difference between the two iterate() methods in Java 8. While the reference type stream’s Iterator first returns the seed and only then proceeds with iterating by applying the iteration function on the previous value:

final Iterator<T> iterator = new Iterator<T>() {
    @SuppressWarnings("unchecked")
    T t = (T) Streams.NONE;

    @Override
    public boolean hasNext() {
        return true;
    }

    @Override
    public T next() {
        return t = (t == Streams.NONE) ? seed : f.apply(t);
    }
};

This is not the case for the LongStream.iterate() version (and other primitive streams):

final PrimitiveIterator.OfLong iterator = new PrimitiveIterator.OfLong() {
    long t = seed;

    @Override
    public boolean hasNext() {
        return true;
    }

    @Override
    public long nextLong() {
        long v = t;
        t = f.applyAsLong(t);
        return v;
    }
};

The iteration function is already pre-fetched one value in advance. This is usually not a problem, but can lead to

  1. Optimisation issues when the iteration function is expensive
  2. Infinite recursions when the iterator is used recursively

As a workaround, it might be best to simply avoid recursion with this method in primitive type streams. Luckily, a fix in JDK 9 is already on its way (as a side effect for a feature enhancement):
https://bugs.openjdk.java.net/browse/JDK-8072727

A Subtle AutoCloseable Contract Change Between Java 7 and Java 8


A nice feature of the Java 7 try-with-resources statement and the AutoCloseable type that was introduced to work with this statement is the fact that static code analysis tools can detect resource leaks. For instance, Eclipse:

resource-leak

When you have the above configuration and you try running the following program, you’ll get three warnings:

public static void main(String[] args) 
throws Exception {
    Connection c = DriverManager.getConnection(
         "jdbc:h2:~/test", "sa", "");
    Statement s = c.createStatement();
    ResultSet r = s.executeQuery("SELECT 1 + 1");
    r.next();
    System.out.println(r.getInt(1));
}

The output is, trivially

2

The warnings are issued on all of c, s, r. A quick fix (don’t do this!) is to suppress the warning using an Eclipse-specific SuppressWarnings parameter:

@SuppressWarnings("resource")
public static void main(String[] args) 
throws Exception {
    ...
}

After all, WeKnowWhatWeReDoing™ and this is just a simple example, right?

Wrong!

The right way to fix this, even for simple examples (at least after Java 7) is to use the effortless try-with-resources statement.

public static void main(String[] args) 
throws Exception {
    try (Connection c = DriverManager.getConnection(
             "jdbc:h2:~/test", "sa", "");
         Statement s = c.createStatement();
         ResultSet r = s.executeQuery("SELECT 1 + 1")) {

        r.next();
        System.out.println(r.getInt(1));
    }
}

In fact, it would be great if Eclipse could auto-fix this warning and wrap all the individual statements in a try-with-resources statement. Upvote this feature request, please!

Great, we know this. What’s the deal with Java 8?

In Java 8, the contract on AutoCloseable has changed very subtly (or bluntly, depending on your point of view).

Java 7 version

A resource that must be closed when it is no longer needed.

Note the word "must".

Java 8 version

An object that may hold resources (such as file or socket handles) until it is closed. The close() method of an AutoCloseable object is called automatically when exiting a try-with-resources block for which the object has been declared in the resource specification header. This construction ensures prompt release, avoiding resource exhaustion exceptions and errors that may otherwise occur.

API Note:

It is possible, and in fact common, for a base class to implement AutoCloseable even though not all of its subclasses or instances will hold releasable resources. For code that must operate in complete generality, or when it is known that the AutoCloseable instance requires resource release, it is recommended to use try-with-resources constructions. However, when using facilities such as Stream that support both I/O-based and non-I/O-based forms, try-with-resources blocks are in general unnecessary when using non-I/O-based forms.

In short, from Java 8 onwards, AutoCloseable is more of a hint saying that you might be using a resource that needs to be closed, but this isn’t necessarily the case.

This is similar to the Iterable contract, which doesn’t say whether you can iterate only once, or several times over the Iterable, but it imposes a contract that is required for the foreach loop.

When do we have “optionally closeable” resources?

Take jOOQ for instance. Unlike in JDBC, a jOOQ Query (which was made AutoCloseable in jOOQ 3.7) may or may not represent a resource, depending on how you execute it. By default, it is not a resource:

try (Connection c = DriverManager.getConnection(
        "jdbc:h2:~/test", "sa", "")) {

    // No new resources created here:
    ResultQuery<Record> query =
        DSL.using(c).resultQuery("SELECT 1 + 1");

    // Resources created and closed immediately
    System.out.println(query.fetch());
}

The output is again:

+----+
|   2|
+----+
|   2|
+----+

But now, we have again an Eclipse warning on the query variable, saying that there is a resource that needs to be closed, even if by using jOOQ this way, we know that this isn’t true. The only resource in the above code is the JDBC Connection, and it is properly handled. The jOOQ-internal PreparedStatement and ResultSet are completely handled and eagerly closed by jOOQ.

Then, why implement AutoCloseable in the first place?

jOOQ inverses JDBC’s default behaviour.

  • In JDBC, everything is done lazily by default, and resources have to be closed explicitly.
  • In jOOQ, everything is done eagerly by default, and optionally, resources can be kept alive explicitly.

For instance, the following code will keep an open PreparedStatement and ResultSet:

try (Connection c = DriverManager.getConnection(
        "jdbc:h2:~/test", "sa", "");

     // We "keep" the statement open in the ResultQuery
     ResultQuery<Record> query =
         DSL.using(c)
            .resultQuery("SELECT 1 + 1")
            .keepStatement(true)) {

    // We keep the ResultSet open in the Cursor
    try (Cursor<Record> cursor = query.fetchLazy()) {
        System.out.println(cursor.fetchOne());
    }
}

With this version, we no longer have any warnings in Eclipse, but the above version is really the exception when using the jOOQ API.

The same thing is true for Java 8’s Stream API. Interestingly, Eclipse doesn’t issue any warnings here:

Stream<Integer> stream = Arrays.asList(1, 2, 3).stream();
stream.forEach(System.out::println);

Conclusion

Resource leak detection seems to be a nice IDE / compiler feature at first. But avoiding false positives is hard. Specifically, because Java 8 changed contracts on AutoCloseable, implementors are allowed to implement the AutoCloseable contract for mere convenience, not as a clear indicator of a resource being present that MUST be closed.

This makes it very hard, if not impossible, for an IDE to detect resource leaks of third party APIs (non-JDK APIs), where these contracts aren’t generally well-known. The solution is, as ever so often with static code analysis tools, to simply turn off potential resource leak detection:

resource-leak-solution

For more insight, see also this Stack Overflow answer by Stuart Marks, linking to the EG’s discussions on lambda-dev

Really Too Bad that Java 8 Doesn’t Have Iterable.stream()


This is one of the more interesting recent Stack Overflow questions:

Why does Iterable not provide stream() and parallelStream() methods?

At first, it might seem intuitive to make it straight-forward to convert an Iterable into a Stream, because the two are really more or less the same thing for 90% of all use-cases.

Granted, the expert group had a strong focus on making the Stream API parallel capable, but anyone who works with Java every day will notice immediately, that Stream is most useful in its sequential form. And an Iterable is just that. A sequential stream with no guarantees with respect to parallelisation. So, it would only be intuitive if we could simply write:

iterable.stream();

In fact, subtypes of Iterable do have such methods, e.g.

collection.stream();

Brian Goetz himself gave an answer to the above Stack Overflow question. The reasons for this omittance are rooted in the fact that some Iterables might prefer to return an IntStream instead of a Stream. This really seems to be a very remote reason for a design decision, but as always, omittance today doesn’t mean omittance forever. On the other hand, if they had introduced Iterable.stream() today, and it turned out to be a mistake, they couldn’t have removed it again.

Well, primitive types in Java are a pain and they did all sorts of bad things to generics in the first place, and now to Stream as well, as we have to write the following, in order to turn an Iterable into a Stream:

Stream s = StreamSupport.stream(iterable.spliterator(), false);

Brian Goetz argues that this is “easy”, but I would disagree. As an API consumer, I experience a lot of friction in productivity because of:

  • Having to remember this otherwise useless StreamSupport type. This method could very well have been put into the Stream interface, because we already have Stream construction methods in there, such as Stream.of().
  • Having to remember the subtle difference between Iterator and Spliterator in the context of what I believe has nothing to do with parallelisation. It may well be that Spliterators will become popular eventually, though, so this doubt is for the magic 8 ball to address.
  • In fact, I have to repeat the information that there is nothing to be parallelised via the boolean argument false

Parallelisation really has such a big weight in this new API, even if it will cover only around 5%-10% of all functional collection manipulation operations. While sequential processing was not the main design goal of the JDK 8 APIs, it is really the main benefit for all of us, and the friction around APIs related to sequential processing should be as low as possible.

The method above should have just been called

Stream s = Stream.stream(iterable);

It could be implemented like this:

public static<T> Stream<T> stream(Iterable<T> i) {
    return StreamSupport.stream(i.spliterator(), false);
}

Obviously with convenience overloads that allow for the additional specialisations, like parallelisation, or passing a Spliterator

But again, if Iterable had its own stream() default method, an incredible number of APIs would be so much better integrated with Java 8 out of the box, without even supporting Java 8 explicitly!

Take jOOQ for instance. jOOQ still supports Java 6, so a direct dependency is not possible. However, jOOQ’s ResultQuery type is an Iterable. This allows you to use such queries directly inline in foreach loops, as if you were writing PL/SQL:

PL/SQL

FOR book IN (
  SELECT * FROM books ORDER BY books.title
)
LOOP
  -- Do things with book
END LOOP;

Java

for (BookRecord book : 
  ctx.selectFrom(BOOKS).orderBy(BOOKS.TITLE)
) {
  // Do things with book
}

Now imagine the same thing in Java 8:

ctx.selectFrom(BOOKS).orderBy(BOOKS.TITLE)
   .stream()
   .map / reduce / findAny, etc...

Unfortunately, the above is currently not possible. You could, of course, eagerly fetch all the results into a jOOQ Result, which extends List:

ctx.selectFrom(BOOKS).orderBy(BOOKS.TITLE)
   .fetch()
   .stream()
   .map / reduce / findAny, etc...

But it’s one more method to call (every time), and the actual stream semantics is broken, because the fetch is done eagerly.

Complaining on a high level

This is, of course, complaining on a high level, but it would really be great if a future version of Java, e.g. Java 9, would add this missing method to the Iterable API. Again, 99% of all use-cases will want the Stream type to be returned, not the IntStream type. And if they do want that for whatever obscure reason (much more obscure than many evil things from old legacy Java APIs, looking at you Calendar), then why shouldn’t they just declare an intStream() method. After all, if someone is crazy enough to write Iterable<Integer> when they’re really operating on int primitive types, they’ll probably accept a little workaround.