3.15.0 Release with Support for R2DBC, Nested ROW, ARRAY, and MULTISET types, 5 new SQL dialects, CREATE PROCEDURE, FUNCTION, and TRIGGER support and Much More


What a lot of users have been waiting for: jOOQ 3.15 is reactive, thanks to the new native R2DBC integration. Recent versions already implemented the reactive streams Publisher SPI, but now we’re not cheating anymore. We’re no longer blocking. Just wrap a jOOQ query that is configured with an R2DBC ConnectionFactory in a Flux (or any reactive streams API of your choice), and see what happens.


Both blocking (via JDBC) and non-blocking (via R2DBC) execution can work side-by-side, allowing users to quickly switch a query between the two execution models without any changes to the query building logic.

Projecting ROW types, ARRAY of ROW Types, and MULTISETS

After having implemented standard SQL/XML and SQL/JSON support in jOOQ 3.14, another major milestone in taking SQL to the next level is now available as an experimental feature: Nesting collections using the standard SQL MULTISET operator.

The operator is currently emulated using SQL/XML or SQL/JSON. The resulting documents are parsed again when fetching them from JDBC. Future versions will also provide native support (Informix, Oracle), and emulations using ARRAY (various dialects, including PostgreSQL).

Imagine this query against the Sakila database (https://www.jooq.org/sakila):

var result =

You’re really going to love Java 10’s var keyword for these purposes. What’s the type of result? Exactly:

    Result<Record3<String, Result<Record2<String, String>>, Result<Record1<String>>>>

It contains:

|title                      |actors                                            |films          |
|ACE GOLDFINGER             |[(BOB, FAWCETT), (MINNIE, ZELLWEGER), (SEAN, GU...|[(Horror)]     |
|ADAPTATION HOLES           |[(NICK, WAHLBERG), (BOB, FAWCETT), (CAMERON, ST...|[(Documentary)]|

Two collections were nested in a single query without producing any unwanted cartesian products and duplication of data. And stay tuned, we’ve added more goodies! See this article on how to map the above structural type to your nominal types (e.g. Java 16 records) in a type safe way, without reflection!


New Dialects

We’ve added support for 5 (!) new SQLDialects. That’s unprecedented for any previous minor release. The new dialects are:

  • JAVA

Yes, there’s an experimental “JAVA” dialect. It’s mostly useful if you want to translate your native SQL queries to jOOQ using https://www.jooq.org/translate, and it cannot be executed. In the near future, we might add SCALA and KOTLIN as well, depending on demand.

BigQuery and Snowflake were long overdue by popular vote. The expedited EXASOL support has been sponsored by a customer, which is a great reminder that this is always an option. You need something more quickly? We can make it happen, even if the feature isn’t very popular on the roadmap.

Many other dialects have been brought up to date, including REDSHIFT, HANA, VERTICA, and two have been deprecated: INGRES and ORACLE10G, as these grow less and less popular.

Drop Java 6/7 support for Enterprise Edition, require Java 11 in OSS Edition

We’re cleaning up our support for old dependencies and features. Starting with jOOQ 3.12, we offered Java 6 and 7 support only to jOOQ Enterprise Edition customers. With jOOQ 3.15, this support has now been removed. Java 8 is the new baseline for the commercial editions, and Java 11 for the jOOQ Open Source Edition. This means the OSS Edition is now finally modular, and we get access to little things like the Flow API (see R2DBC) and @Deprecated(forRemoval, since).

Upgrading to Java 8 allows for exciting new improvements to our internals, as we can finally use default methods, lambdas, generalised target type inference, effectively final, diamond operators, try-with-resources, string switches, and what not. Improving our code base leads to dog fooding, and that again leads to new features for you. For example, we’ve put a lot of emphasis on ResultQuery.collect(), refactoring internals: https://blog.jooq.org/2021/05/17/use-resultquery-collect-to-implement-powerful-mappings/

There are new auxiliary types, like org.jooq.Rows and org.jooq.Records, for more functional transformation convenience. More functions mean fewer loops, and also fewer ArrayList allocations.

At the same time, we’ve started building a Java 17 ready distribution for the commercial editions, which unlocks better record type support.

Refactoring of ResultQuery to work with DML

With all the above goodies related to Java 8, and a more functional usage of jOOQ, we’ve also finally refactored our DML statement type hierarchy (INSERT, UPDATE, DELETE) to let their respective RETURNING clauses return an actual ResultQuery. That means you can now stream(), collect(), fetchMap() and subscribe() (via R2DBC) to your DML statements, and even put them in the WITH clause (in PostgreSQL).

Massive improvements to the parser / translator use case

jOOQ’s secondary value proposition is to use its parser and translator, instead of the DSL API, which is also available for free on our website: https://www.jooq.org/translate

With increasing demand for this product, we’ve greatly improved the experience:

  • The ParsingConnection is no longer experimental
  • Batching is now possible
  • We’ve added a cache for input/output SQL string pairs to heavily speed up the integration
  • We’re now delaying bind variable type inference to use actual PreparedStatement information. This produces more accurate results, especially when data types are not known to the parser.
  • A new ParseListener SPI allows for hooking into the parser and extending it with custom syntax support for column, table, and predicate expressions.

CREATE PROCEDURE, FUNCTION, TRIGGER and more procedural instructions

Over the recent releases, we’ve started working on procedural language extensions for the commercial distributions. In addition to creating anonymous blocks, we now also support all lifecycle management DDL for procedures, functions, and triggers, which can contain procedural language logic.

This is great news if you’re supporting multiple RDBMS and want to move some more data processing logic to the server side in a vendor agnostic way.

Explicit JDBC driver dependencies to avoid reflection

To get AOT ready, we’re slowly removing internal reflection usage, meaning we’re experimenting with an explicit JDBC driver build-time dependency. This currently affects:

  • Oracle
  • PostgreSQL
  • SQL Server

Only drivers available from Maven Central have been added as dependencies so far.

Full release notes here.

Imperative Loop or Functional Stream Pipeline? Beware of the Performance Impact!

I like weird, yet concise language constructs and API usages

Yes. I am guilty. Evil? Don’t know. But guilty. I heavily use and abuse the java.lang.Boolean type to implement three valued logic in Java:

  • Boolean.TRUE means true (duh)
  • Boolean.FALSE means false
  • null can mean anything like “unknown” or “uninitialised”, etc.
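As a minimal sketch of that idiom (the class and method names below are hypothetical and not taken from jOOQ), a nullable Boolean can act as a lazily initialised flag, where null stands for "not computed yet":

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper illustrating three valued use of java.lang.Boolean
class LazyFlag {

    // null = not computed yet, TRUE / FALSE = cached answer
    private Boolean value;

    // Counts how often the supplier was actually invoked
    private int computations;

    boolean get(BooleanSupplier expensiveCheck) {
        if (value == null) {
            value = expensiveCheck.getAsBoolean();
            computations++;
        }

        return value;
    }

    int computations() {
        return computations;
    }

    public static void main(String[] args) {
        LazyFlag flag = new LazyFlag();
        System.out.println(flag.get(() -> true));

        // The second supplier is never called, the cached value is returned
        System.out.println(flag.get(() -> false));
    }
}
```

Encoding the three states in an enum would be more explicit, at the price of more ceremony, which is exactly the tradeoff discussed here.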

I know – a lot of enterprise developers will bikeshed and cargo cult the old saying:

Code is read more often than it is written

But as with everything, there is a tradeoff. For instance, in algorithm-heavy, micro optimised library code, it is usually more important to have code that really performs well, rather than code that apparently doesn’t need comments because the author has written it in such a clear and beautiful way.

I don’t think it matters much in the case of the boolean type (where I’m just too lazy to encode every three valued situation in an enum). But here’s a more interesting example from that same twitter thread. The code is simple:

woot:
if (something) {
  for (Object o : list) 
    if (something(o))
      break woot;

  throw new E();
}

Yes. You can break out of “labeled ifs”. Because in Java, any statement can be labeled, and if the statement is a compound statement (observe the curly braces following the if), then it may make sense to break out of it. Even if you’ve never seen that idiom, I think it’s quite immediately clear what it does.
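For readers who want to try this, here is a small self-contained sketch of the same idiom. The class name, the method signature, and the threshold predicate are made up for illustration; only the labeled if with its break mirrors the snippet above:

```java
import java.util.Arrays;
import java.util.List;

class LabeledIfDemo {

    // Returns "none matched" only if the check is enabled and no element
    // exceeds the threshold. In all other cases, control reaches the end.
    static String check(boolean enabled, List<Integer> list, int threshold) {
        woot:
        if (enabled) {
            for (Integer o : list)
                if (o > threshold)
                    break woot; // leaves the labeled if, not just the loop

            return "none matched";
        }

        return "skipped or matched";
    }

    public static void main(String[] args) {
        System.out.println(check(true, Arrays.asList(1, 2), 2));
        System.out.println(check(true, Arrays.asList(1, 3), 2));
        System.out.println(check(false, Arrays.asList(1, 2), 2));
    }
}
```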


If Java were a bit more classic, it might have supported this syntax:

if (something) {
  for (Object o : list) 
    if (something(o))
      goto woot;

  throw new E();
}
woot:

Nicolai suggested that the main reason I hadn’t written the following, equivalent, and arguably more elegant logic, is because jOOQ still supports Java 6:

if (something && list.stream().noneMatch(this::something))
  throw new E();

It’s more concise! So, it’s better, right? Everything new is always better.

A third option would have been the less concise solution that essentially just replaces break by return:

if (something && noneMatchSomething(list))
  throw new E();

// And then:
private boolean noneMatchSomething(List<?> list) {
  for (Object o : list)
    if (something(o))
      return false;
  return true;
}

There’s an otherwise useless method that has been extracted. The main benefit is that people are not used to breaking out of labeled statements (other than loops, and even then it’s rare), so this is again about some subjective “readability”. I personally find this particular example less readable, because the extracted method is no longer local. I have to jump around in the class and interrupt my train of thought. But of course, YMMV with respect to the two imperative alternatives.

Back to objectivity: Performance

When I tweet about Java these days, I’m mostly tweeting about my experience writing jOOQ. A library. A library that has been tuned so much over the past years, that the big client side bottleneck (apart from the obvious database call) is the internal StringBuilder that is used to generate dynamic SQL. And compared to most database queries, you will not even notice that.

But sometimes you do. E.g. if you’re using an in-memory H2 database and run some rather trivial queries, then jOOQ’s overhead can become measurable again. Yes. There are some use-cases, which I do want to take seriously as well, where the difference between an imperative loop and a stream pipeline is measurable.

In the above examples, let’s remove the throw statement and replace it by something simpler (because exceptions have their own significant overhead).

I’ve created this JMH benchmark, which compares the 3 approaches:

  • Imperative with break
  • Imperative with return
  • Stream

Here’s the benchmark:

package org.jooq.test.benchmark;

import java.util.ArrayList;
import java.util.List;

import org.openjdk.jmh.annotations.*;

@Fork(value = 3, jvmArgsAppend = "-Djmh.stack.lines=3")
@Warmup(iterations = 5, time = 3)
@Measurement(iterations = 7, time = 3)
public class ImperativeVsStream {

    @State(Scope.Benchmark)
    public static class BenchmarkState {

        boolean something = true;

        @Param({ "2", "8" })
        int listSize;

        List<Integer> list = new ArrayList<>();

        boolean something() {
            return something;
        }

        boolean something(Integer o) {
            return o > 2;
        }

        @Setup(Level.Trial)
        public void setup() throws Exception {
            for (int i = 0; i < listSize; i++)
                list.add(i);
        }

        @TearDown(Level.Trial)
        public void teardown() throws Exception {
            list = null;
        }
    }

    @Benchmark
    public Object testImperativeWithBreak(BenchmarkState state) {
        woot:
        if (state.something()) {
            for (Integer o : state.list)
                if (state.something(o))
                    break woot;

            return 1;
        }

        return 0;
    }

    @Benchmark
    public Object testImperativeWithReturn(BenchmarkState state) {
        if (state.something() && woot(state))
            return 1;

        return 0;
    }

    private boolean woot(BenchmarkState state) {
        for (Integer o : state.list)
            if (state.something(o))
                return false;

        return true;
    }

    @Benchmark
    public Object testStreamNoneMatch(BenchmarkState state) {
        if (state.something() && state.list.stream().noneMatch(state::something))
            return 1;

        return 0;
    }

    @Benchmark
    public Object testStreamAnyMatch(BenchmarkState state) {
        if (state.something() && !state.list.stream().anyMatch(state::something))
            return 1;

        return 0;
    }

    @Benchmark
    public Object testStreamAllMatch(BenchmarkState state) {
        if (state.something() && state.list.stream().allMatch(s -> !state.something(s)))
            return 1;

        return 0;
    }
}

The results are pretty clear:

Benchmark                                    (listSize)   Mode  Cnt         Score          Error  Units
ImperativeVsStream.testImperativeWithBreak            2  thrpt   14  86513288.062 ± 11950020.875  ops/s
ImperativeVsStream.testImperativeWithBreak            8  thrpt   14  74147172.906 ± 10089521.354  ops/s
ImperativeVsStream.testImperativeWithReturn           2  thrpt   14  97740974.281 ± 14593214.683  ops/s
ImperativeVsStream.testImperativeWithReturn           8  thrpt   14  81457864.875 ±  7376337.062  ops/s
ImperativeVsStream.testStreamAllMatch                 2  thrpt   14  14924513.929 ±  5446744.593  ops/s
ImperativeVsStream.testStreamAllMatch                 8  thrpt   14  12325486.891 ±  1365682.871  ops/s
ImperativeVsStream.testStreamAnyMatch                 2  thrpt   14  15729363.399 ±  2295020.470  ops/s
ImperativeVsStream.testStreamAnyMatch                 8  thrpt   14  13696297.091 ±   829121.255  ops/s
ImperativeVsStream.testStreamNoneMatch                2  thrpt   14  18991796.562 ±   147748.129  ops/s
ImperativeVsStream.testStreamNoneMatch                8  thrpt   14  15131005.381 ±   389830.419  ops/s

With this simple example, break or return don’t matter. At some point, adding additional methods might start getting in the way of inlining (because of stacks getting too deep), but not creating additional methods might be getting in the way of inlining as well (because of method bodies getting too large). I don’t want to bet on either approach here at this level, nor is jOOQ tuned that much. Like most similar libraries, the traversal of the jOOQ expression tree generates stacks that are too deep to completely inline anyway.

But the very obvious loser here is the Stream approach, which is roughly 6.5x slower in this benchmark than the imperative approaches. This isn’t surprising. The stream pipeline has to be set up every single time to represent something as trivial as the above imperative loop. I’ve already blogged about this in the past, where I compared replacing simple for loops by Stream.forEach().

Meh, does it matter?

In your business logic? Probably not. Your business logic is I/O bound, mostly because of the database. Wasting a few CPU cycles on a client side loop is not the main issue. Even if it is, the waste probably happens because your loop shouldn’t even be at the client side in the first place, but moved into the database as well. I’m currently touring conferences with a talk about that topic:

In your infrastructure logic? Maybe! If you’re writing a library, or if you’re using a library like jOOQ, then yes. Chances are that a lot of your logic is CPU bound. You should occasionally profile your application and spot such bottlenecks, both in your code and in third party libraries. E.g. in most of jOOQ’s internals, using a stream pipeline might be a very bad choice, because ultimately, jOOQ is something that might be invoked from within your loops, thus adding significant overhead to your application, if your queries are not heavy (e.g. again when run against an H2 in-memory database).

So, given that you’re clearly “micro-losing” on the performance side by using the Stream API, you may need to evaluate the readability tradeoff more carefully. When business logic is complex, readability is very important compared to micro optimisations. With infrastructure logic, it is much less likely so, in my opinion. And I’m not alone:

Note: there’s that other cargo cult of premature optimisation going around. Yes, you shouldn’t worry about these details too early in your application implementation. But you should still know when to worry about them, and be aware of the tradeoffs.

And while you’re still debating what name to give to that extracted method, I’ve written 5 new labeled if statements! ;-)

Writing Custom Aggregate Functions in SQL Just Like a Java 8 Stream Collector

All SQL databases support the standard aggregate functions COUNT(), SUM(), AVG(), MIN(), MAX().

Some databases support other aggregate functions, like:

  • EVERY()
  • VAR_POP()
  • VAR_SAMP()

But what if you want to roll your own?

Java 8 Stream Collector

When using Java 8 streams, we can easily roll our own aggregate function (i.e. a Collector). Let’s assume we want to find the second highest value in a stream. The highest value can be obtained like this:

System.out.println(
    Stream.of(1, 2, 3, 4)
          .max(Comparator.naturalOrder())
) ;



Now, what about the second highest value? We can write the following collector:

System.out.println(
    Stream.of(1, 6, 2, 3, 4, 4, 5).parallel()
          .collect(Collector.of(
              () -> new int[] { 
                  Integer.MIN_VALUE, 
                  Integer.MIN_VALUE 
              },
              (a, i) -> {
                  if (a[0] < i) {
                      a[1] = a[0];
                      a[0] = i;
                  }
                  else if (a[1] < i)
                      a[1] = i;
              },
              (a1, a2) -> {
                  if (a2[0] > a1[0]) {
                      a1[1] = a1[0];
                      a1[0] = a2[0];

                      if (a2[1] > a1[1])
                          a1[1] = a2[1];
                  }
                  else if (a2[0] > a1[1])
                      a1[1] = a2[0];

                  return a1;
              },
              a -> a[1]
          ))
) ;

It doesn’t do anything fancy. It has these 4 functions:

  • Supplier<int[]>: A supplier that provides an intermediary int[] of length 2, initialised with Integer.MIN_VALUE, each. This array will remember the MAX() value in the stream at position 0 and the SECOND_MAX() value in the stream at position 1
  • BiConsumer<int[], Integer>: An accumulator that accumulates new values from the stream into our intermediary data structure.
  • BinaryOperator<int[]>: A combiner that combines two intermediary data structures. This is used for parallel streams only.
  • Function<int[], Integer>: The finisher function that extracts the SECOND_MAX() value from the second position in our intermediary array.
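If the collector is needed in more than one place, the four functions can be packaged behind a factory method. The following is only a sketch: the class name SecondMax and the method name secondMax() are hypothetical, and the logic is the same as in the inline collector above. Note that, with this initialisation, a stream of fewer than two elements yields Integer.MIN_VALUE.

```java
import java.util.stream.Collector;
import java.util.stream.Stream;

class SecondMax {

    // Hypothetical factory wrapping supplier, accumulator, combiner, finisher
    static Collector<Integer, int[], Integer> secondMax() {
        return Collector.of(

            // Supplier: [ max, second max ]
            () -> new int[] { Integer.MIN_VALUE, Integer.MIN_VALUE },

            // Accumulator: push a new value into the intermediary array
            (a, i) -> {
                if (a[0] < i) {
                    a[1] = a[0];
                    a[0] = i;
                }
                else if (a[1] < i)
                    a[1] = i;
            },

            // Combiner: merge two intermediary arrays (parallel streams)
            (a1, a2) -> {
                if (a2[0] > a1[0]) {
                    a1[1] = a1[0];
                    a1[0] = a2[0];

                    if (a2[1] > a1[1])
                        a1[1] = a2[1];
                }
                else if (a2[0] > a1[1])
                    a1[1] = a2[0];

                return a1;
            },

            // Finisher: extract the second max
            a -> a[1]
        );
    }

    public static void main(String[] args) {
        System.out.println(
            Stream.of(1, 6, 2, 3, 4, 4, 5).parallel().collect(secondMax()));
    }
}
```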

The output is now:

5

How to do the same thing with SQL?

Many SQL databases offer a very similar way of calculating custom aggregate functions. Here’s how to do the exact same thing with…

Oracle

With the usual syntactic ceremony…

CREATE TYPE u_second_max AS OBJECT (

  -- Intermediary data structure
  MAX NUMBER,
  SECMAX NUMBER,

  -- Corresponds to the Collector.supplier() function
  STATIC FUNCTION ODCIAggregateInitialize(sctx IN OUT u_second_max) RETURN NUMBER,

  -- Corresponds to the Collector.accumulator() function
  MEMBER FUNCTION ODCIAggregateIterate(self IN OUT u_second_max, value IN NUMBER) RETURN NUMBER,

  -- Corresponds to the Collector.combiner() function
  MEMBER FUNCTION ODCIAggregateMerge(self IN OUT u_second_max, ctx2 IN u_second_max) RETURN NUMBER,

  -- Corresponds to the Collector.finisher() function
  MEMBER FUNCTION ODCIAggregateTerminate(self IN u_second_max, returnValue OUT NUMBER, flags IN NUMBER) RETURN NUMBER
);
/

-- This is our "collector" implementation
CREATE OR REPLACE TYPE BODY u_second_max IS
  STATIC FUNCTION ODCIAggregateInitialize(sctx IN OUT u_second_max) RETURN NUMBER IS
  BEGIN
    SCTX := U_SECOND_MAX(0, 0);
    RETURN ODCIConst.Success;
  END;

  MEMBER FUNCTION ODCIAggregateIterate(self IN OUT u_second_max, value IN NUMBER) RETURN NUMBER IS
  BEGIN
    IF VALUE > SELF.MAX THEN
      SELF.SECMAX := SELF.MAX;
      SELF.MAX := VALUE;
    ELSIF VALUE > SELF.SECMAX THEN
      SELF.SECMAX := VALUE;
    END IF;
    RETURN ODCIConst.Success;
  END;

  MEMBER FUNCTION ODCIAggregateTerminate(self IN u_second_max, returnValue OUT NUMBER, flags IN NUMBER) RETURN NUMBER IS
  BEGIN
    RETURNVALUE := SELF.SECMAX;
    RETURN ODCIConst.Success;
  END;

  MEMBER FUNCTION ODCIAggregateMerge(self IN OUT u_second_max, ctx2 IN u_second_max) RETURN NUMBER IS
  BEGIN
    IF CTX2.MAX > SELF.MAX THEN
      SELF.SECMAX := SELF.MAX;
      SELF.MAX := CTX2.MAX;

      IF CTX2.SECMAX > SELF.SECMAX THEN
        SELF.SECMAX := CTX2.SECMAX;
      END IF;
    ELSIF CTX2.MAX > SELF.SECMAX THEN
      SELF.SECMAX := CTX2.MAX;
    END IF;
    RETURN ODCIConst.Success;
  END;
END;
/

-- Finally, we have to give this aggregate function a name
CREATE FUNCTION second_max (input NUMBER) RETURN NUMBER
PARALLEL_ENABLE AGGREGATE USING u_second_max;
/

We can now run the above on the Sakila database:

SELECT 
  max(film_id), 
  second_max(film_id) 
FROM film;

To get:

1000    999

And what’s even better, we can use the aggregate function as a window function for free!

SELECT 
  film_id,
  length,
  max(film_id) OVER (PARTITION BY length), 
  second_max(film_id) OVER (PARTITION BY length)
FROM film
ORDER BY length, film_id;

The above yields:

15       46      730   505
469      46      730   505
504      46      730   505
505      46      730   505
730      46      730   505
237      47      869   784
247      47      869   784
393      47      869   784
398      47      869   784
407      47      869   784
784      47      869   784
869      47      869   784
2        48      931   866
410      48      931   866
575      48      931   866
630      48      931   866
634      48      931   866
657      48      931   866
670      48      931   866
753      48      931   866
845      48      931   866
866      48      931   866
931      48      931   866

Beautiful, right?

PostgreSQL


PostgreSQL supports a slightly more concise syntax in the CREATE AGGREGATE statement. If we don’t allow for parallelism, we can write this minimal implementation:

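CREATE FUNCTION second_max_sfunc (
  state INTEGER[], data INTEGER
) RETURNS INTEGER[] AS
$$
BEGIN
  IF state IS NULL THEN
    RETURN ARRAY[data, NULL];
  ELSE
    RETURN CASE 
      WHEN state[1] > data
      THEN CASE 
        WHEN state[2] > data
        THEN state
        ELSE ARRAY[state[1], data]
      END
      ELSE ARRAY[data, state[1]]
    END;
  END IF;
END;
$$ LANGUAGE plpgsql;

CREATE FUNCTION second_max_ffunc (
  state INTEGER[]
) RETURNS INTEGER AS
$$
BEGIN
  RETURN state[2];
END;
$$ LANGUAGE plpgsql;

CREATE AGGREGATE second_max (INTEGER) (
  SFUNC     = second_max_sfunc,
  STYPE     = INTEGER[],
  FINALFUNC = second_max_ffunc
);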

Here, we use the STYPE (Collector.supplier()), the SFUNC (Collector.accumulator()), and the FINALFUNC (Collector.finisher()) specifications.

Other databases

Many other databases allow for specifying user defined aggregate functions. Look up your database manual’s details to learn more. They always work in the same way as a Java 8 Collector.

How to Compile a Class at Runtime with Java 8 and 9

In some cases, it’s really useful to be able to compile a class at runtime using the java.compiler module. You can e.g. load a Java source file from the database, compile it on the fly, and execute its code as if it were part of your application.

In the upcoming jOOR 0.9.8, this will be made possible through https://github.com/jOOQ/jOOR/issues/51. As always with jOOR (and our other projects), we’re wrapping existing JDK API, simplifying the little details that you often don’t want to worry about. Using jOOR API, you can now write:

// Run this code from within the com.example package

Supplier<String> supplier = Reflect.compile(
    "com.example.CompileTest",
    "package com.example;\n" +
    "class CompileTest\n" +
    "implements java.util.function.Supplier<String> {\n" +
    "  public String get() {\n" +
    "    return \"Hello World!\";\n" +
    "  }\n" +
    "}\n"
).create().get();

System.out.println(supplier.get());

And the result is, of course:

Hello World!

If we already had JEP-326, this would be even cooler!

Supplier<String> supplier = Reflect.compile(
    "org.joor.test.CompileTest",
    `package org.joor.test;
     class CompileTest
     implements java.util.function.Supplier<String> {
       public String get() {
         return "Hello World!";
       }
     }`
).create().get();

What happens behind the scenes?

Again, as in our previous blog post, we need to ship two different versions of our code. One that works in Java 8 (where reflecting and accessing JDK internal API was possible), and one that works in Java 9+ (where this is forbidden). The full annotated API is here:

package org.joor;

import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodHandles.Lookup;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

import javax.tools.*;

import static java.lang.StackWalker.Option.RETAIN_CLASS_REFERENCE;

class Compile {

    static Class<?> compile(String className, String content) 
    throws Exception {
        Lookup lookup = MethodHandles.lookup();

        // If we have already compiled our class, simply load it
        try {
            return lookup.lookupClass()
                         .getClassLoader()
                         .loadClass(className);
        }

        // Otherwise, let's try to compile it
        catch (ClassNotFoundException ignore) {
            return compile0(className, content, lookup);
        }
    }

    static Class<?> compile0(
        String className, String content, Lookup lookup)
    throws Exception {
        JavaCompiler compiler = 
            ToolProvider.getSystemJavaCompiler();

        ClassFileManager manager = new ClassFileManager(
            compiler.getStandardFileManager(null, null, null));

        List<CharSequenceJavaFileObject> files = new ArrayList<>();
        files.add(new CharSequenceJavaFileObject(
            className, content));

        compiler.getTask(null, manager, null, null, null, files)
                .call();
        Class<?> result = null;

        // Implement a check whether we're on JDK 8. If so, use
        // protected ClassLoader API, reflectively
        if (onJava8()) {
            ClassLoader cl = lookup.lookupClass().getClassLoader();
            byte[] b = manager.o.getBytes();
            result = Reflect.on(cl).call("defineClass", 
                className, b, 0, b.length).get();
        }

        // Lookup.defineClass() has only been introduced in Java 9.
        // It is required to get private-access to interfaces in
        // the class hierarchy
        else {

            // This method is called by client code from two levels
            // up the current stack frame. We need a private-access
            // lookup from the class in that stack frame in order
            // to get private-access to any local interfaces at
            // that location.
            Class<?> caller = StackWalker
                .getInstance(RETAIN_CLASS_REFERENCE)
                .walk(s -> s
                    .skip(2)
                    .findFirst()
                    .get()
                    .getDeclaringClass());

            // If the compiled class is in the same package as the
            // caller class, then we can use the private-access 
            // Lookup of the caller class
            if (className.startsWith(caller.getPackageName() )) {
                result = MethodHandles
                    .privateLookupIn(caller, lookup)
                    .defineClass(manager.o.getBytes());
            }

            // Otherwise, use an arbitrary class loader. This
            // approach doesn't allow for loading private-access 
            // interfaces in the compiled class's type hierarchy
            else {
                result = new ClassLoader() {
                    @Override
                    protected Class<?> findClass(String name) 
                    throws ClassNotFoundException {
                        byte[] b = manager.o.getBytes();
                        int len = b.length;
                        return defineClass(className, b, 0, len);
                    }
                }.loadClass(className);
            }
        }

        return result;
    }

    // These are some utility classes needed for the JavaCompiler
    // ----------------------------------------------------------

    static final class JavaFileObject 
    extends SimpleJavaFileObject {
        final ByteArrayOutputStream os = 
            new ByteArrayOutputStream();

        JavaFileObject(String name, JavaFileObject.Kind kind) {
            super(URI.create(
                "string:///" 
              + name.replace('.', '/') 
              + kind.extension), 
                kind);
        }

        byte[] getBytes() {
            return os.toByteArray();
        }

        @Override
        public OutputStream openOutputStream() {
            return os;
        }
    }

    static final class ClassFileManager 
    extends ForwardingJavaFileManager<StandardJavaFileManager> {
        JavaFileObject o;

        ClassFileManager(StandardJavaFileManager m) {
            super(m);
        }

        @Override
        public JavaFileObject getJavaFileForOutput(
            JavaFileManager.Location location,
            String className,
            JavaFileObject.Kind kind,
            FileObject sibling
        ) {
            return o = new JavaFileObject(className, kind);
        }
    }

    static final class CharSequenceJavaFileObject 
    extends SimpleJavaFileObject {
        final CharSequence content;

        public CharSequenceJavaFileObject(
            String className, 
            CharSequence content
        ) {
            super(URI.create(
                "string:///" 
              + className.replace('.', '/') 
              + JavaFileObject.Kind.SOURCE.extension), 
                JavaFileObject.Kind.SOURCE);
            this.content = content;
        }

        @Override
        public CharSequence getCharContent(
            boolean ignoreEncodingErrors
        ) {
            return content;
        }
    }
}
Notice how the JDK 9 version is a bit more complicated, as we have to:

  • Find the caller class of our method
  • Get a private method handle lookup for that class if the class being compiled is in the same package as the class calling the compilation
  • Otherwise, use an arbitrary class loader to define the class
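The first of these steps, finding the caller class, can be tried in isolation on Java 9+. The sketch below is not jOOR code: the CallerLookup and Client classes are hypothetical, and it skips only one frame, because the helper is called directly rather than through an intermediate API method.

```java
import static java.lang.StackWalker.Option.RETAIN_CLASS_REFERENCE;

class CallerLookup {

    // Returns the class whose method called callerClass()
    static Class<?> callerClass() {
        return StackWalker
            .getInstance(RETAIN_CLASS_REFERENCE)
            .walk(s -> s

                // The first frame is callerClass() itself, so skip it
                .skip(1)
                .findFirst()
                .get()
                .getDeclaringClass());
    }

    // A hypothetical client that wants to know who it is
    static class Client {
        static Class<?> whoAmI() {
            return callerClass();
        }
    }

    public static void main(String[] args) {
        System.out.println(Client.whoAmI().getName());
    }
}
```

Without RETAIN_CLASS_REFERENCE, getDeclaringClass() throws an UnsupportedOperationException, which is why the option is passed when obtaining the StackWalker.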

Reflection definitely hasn’t become simpler with Java 9!

How to Ensure Your Code Works With Older JDKs

jOOQ is a very backwards compatible product. This doesn’t only mean that we keep our own API backwards compatible as well as possible, but we also still support Java 6 in our commercial distributions.

In a previous blog post, I’ve shown how we manage to support Java 6 while at the same time not missing out on cool Java 8 language and API features, such as Stream and Optional support. For instance, you can do this with jOOQ’s ordinary distribution:

// Fetching 0 or 1 actors
Optional<Record2<String, String>> actor =

// Fetching a stream of actors
try (Stream<Record2<String, String>> actor = ctx
       .fetchStream()) {

This API is present in jOOQ’s ordinary distribution and it is stripped from that distribution prior to building the Java 6 distribution.

But what about the JDK’s more subtle APIs?

It is relatively easy to remember not to use Streams, Optionals, lambdas, method references, or default methods lightheartedly in your library’s code. After all, those were all major changes in Java 8 and we can easily add our API removal markers around those parts. And even if we forgot, building the Java 6 distribution would quite probably fail, because Streams are very often used with lambdas, in which case a compiler that is configured for Java version 1.6 will not compile the code.

But recently, we’ve had a more subtle bug, #6860. jOOQ API was calling java.lang.reflect.Method.getParameterCount(). Since we compile jOOQ’s Java 6 distribution with Java 8, this didn’t fail. The sources were kept Java 6 language compatible, but not JDK 6 API compatible, and unfortunately, there’s no option in javac, nor in the Maven compiler plugin to do such a check.
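To illustrate how small such an API difference can be: Method.getParameterCount() was only added in JDK 8, while the same number has always been available through getParameterTypes(). The helper class below is a hypothetical sketch, not the actual jOOQ fix:

```java
import java.lang.reflect.Method;

class ParameterCounts {

    // Uses only API that already existed in JDK 6, unlike
    // Method.getParameterCount(), which was introduced in JDK 8
    static int parameterCount(Method method) {
        return method.getParameterTypes().length;
    }

    // Convenience overload that looks up a public method by name
    static int parameterCount(
        Class<?> type, String name, Class<?>... parameterTypes
    ) {
        try {
            return parameterCount(type.getMethod(name, parameterTypes));
        }
        catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(
            parameterCount(String.class, "substring", int.class, int.class));
    }
}
```

The array returned by getParameterTypes() is a fresh copy on every call, so this variant allocates a little more, which is one reason the JDK 8 method is tempting in the first place.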

Why not use Java 6 to compile the Java 6 distribution?

The reason why we’re using Java 8 to build jOOQ’s Java 6 distribution is the fact that Java 8 “fixed” a lot (and I mean a lot) of very old and weird edge cases related to generics, overloading, varargs, and all that stuff. While this might be irrelevant for ordinary APIs, for jOOQ it is not. We really push the limits of what’s possible with the Java language.

So, we’re paying a price for building jOOQ’s Java 6 distribution with Java 8. We’re flying in “stealth mode”, not 100% sure whether our JDK API usage is compliant.

Luckily, the JDK doesn’t change much between releases, so a lot of stuff from JDK 8 was already there in JDK 6. Also, our integration tests would fail, if we did accidentally use a method like the above. Unfortunately, that particular method call simply slipped by the integration tests (there will never be enough tests for every scenario).

The solution

Apart from fixing the trivial bug and avoiding that particular method, we’ve now added the cool “animal sniffer” Maven plugin to our Java 6 build, whose usage you can see here:

All we needed to add to our Java 6 distribution profile was this little snippet:


This will then produce a validation error like the following:

[INFO] --- animal-sniffer-maven-plugin:1.16:check (default) @ jooq-codegen ---
[INFO] Checking unresolved references to org.codehaus.mojo.signature:java16:1.0
[ERROR] C:\..\JavaGenerator.java:232: Undefined reference: int java.lang.reflect.Method.getParameterCount()
[ERROR] C:\..\JavaGenerator.java:239: Undefined reference: int java.lang.reflect.Method.getParameterCount()


Squeezing Another 10% Speed Increase out of jOOQ using JMC and JMH

In this post, we’re going to discuss a couple of recent efforts to squeeze roughly 10% in terms of speed out of jOOQ by iterating on hotspots that were detected using JMC (Java Mission Control) and then validated using JMH (Java Microbenchmark Harness). This post shows how to apply micro optimisations to algorithms where the smallest improvement can have a significant effect.

While JMH is probably without competition, JMC could easily be replaced by JProfiler, YourKit, or even your own manual jstack sampling. I’ll just use JMC because it ships with the JDK and is free for use for development as of JDK 8 and 9 (if you’re unsure whether you’re “developing”, better ask Oracle). Rumours have it that JMC might be contributed to the OpenJDK in the near future.

Micro optimisations

Micro optimisations are a cool technique to squeeze a very small improvement out of a local algorithm (e.g. a loop) that has a significant effect on the entire application / library, because the local algorithm is called many times. This is absolutely the case in jOOQ, which is essentially a library that always runs 4 nested loops:

  1. S: A “loop” over all possible SQL statements
  2. E: A “loop” over all executions of such a statement
  3. R: A loop over all rows in the result
  4. C: A loop over all columns in a row

Such four level nested loops result in what we could call a polynomial complexity of our algorithms. Even if we cannot call the complexity O(N^4) (as the 4 “N” are not all the same), it is certainly of O(S x E x R x C) (I’ll call this “S-E-R-C loops” further down). Even to the untrained eye, it becomes evident that anything that happens in the inner-most “C-loop” can have devastating effects. We had better not be opening any files here that could be opened outside of, e.g. the “S-loop”.
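To get a feeling for the numbers, here is a minimal sketch of the four loops. It is not jOOQ code; the class and method names are made up for illustration. It only counts how often the body of the inner-most “C-loop” runs. With one statement, 3 million executions, 4 rows and 4 columns, that body runs 1 x 3,000,000 x 4 x 4 = 48,000,000 times.

```java
// Hypothetical illustration only: none of these names exist in jOOQ.
final class SercLoops {

    /** Counts how often the body of the inner-most "C-loop" is executed. */
    static long innerIterations(int statements, int executions, int rows, int columns) {
        long count = 0;

        for (int s = 0; s < statements; s++)          // S: all distinct statements
            for (int e = 0; e < executions; e++)      // E: executions per statement
                for (int r = 0; r < rows; r++)        // R: rows per result
                    for (int c = 0; c < columns; c++) // C: columns per row
                        count++;                      // any work here is multiplied S x E x R x C times

        return count;
    }

    public static void main(String[] args) {
        // One statement, executed 1000 times, returning 4 rows of 4 columns
        System.out.println(innerIterations(1, 1000, 4, 4));
    }
}
```

Anything that can be hoisted from the “C-loop” into the “E-loop” or the “S-loop” is paid for far less often.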

In a previous blog post, we’ve discussed common techniques of optimising such situations. In this blog post, we’ll look into a couple of concrete examples.

How to discover flaws in these loops?

We’re looking for the problems that affect all users, the kind of problem that, once fixed, will improve jOOQ’s performance for everyone by e.g. 10%. This is similar to what the JIT does, by performing things like stack allocation, inlining, which don’t drastically improve things locally, but do so globally, and for everyone. Here’s an interesting guest post by Tagir Valeev on JIT optimisation, and how good it is.

Getting a large “S-loop”

The first option is to run profiling sessions on benchmarks. We could, for example, run the entire “S-E-R-C loops” in a JMC profiling session, where the “S-loop” is a loop over all our statements, or in other words, over all our integration tests. Unfortunately, with this approach, our “E-loop” (in the case of jOOQ’s integration tests) is a single execution per statement. We’d have to run the integration tests many, many times in order to get meaningful results.

Also, while the jOOQ integration tests run thousands of distinct queries, most queries are still rather simple, each one focusing on an individual SQL feature (e.g. lateral join). In an end user application, queries might use less specific features, but are much more complex, i.e. they have a lot of ordinary joins.

This technique is useful to find problems that appear in all queries, deep down inside of jOOQ – e.g. at the JDBC interface. But we cannot use this approach to test individual features.

Getting a large “E-loop”

Another option is to write a single test that runs a few statements (small “S-loop”) many times in an explicit loop (large “E-loop”). This has the advantage that a specific bottleneck can be found with a high confidence, but the drawback is: It’s specific. For instance, if we find a small bottleneck in the string concatenation function, well, that is certainly worth fixing, but doesn’t affect most users.

This approach is useful to test individual features. It can also be useful for finding issues that affect all queries, but with a lower confidence than the previous case, where the “S-loop” is maximised.

Getting large “R-loops” and “C-loops”

Creating large result sets is easy and should definitely be part of such benchmarks, because in the case of a large result set, any flaw will multiply drastically, so fixing these things is worthwhile. However, these problems only affect actual result sets, not the query building process or the execution process. Sure, most statements are probably queries, not insertions / updates, etc. But this needs to be kept in mind.

Optimising for problems in large “E-loops”

All of the above scenarios are different optimisation sessions and deserve their own blog posts. In this post, I’m describing what has been discovered and fixed when running a single query 3 million times on an H2 database. The H2 database is chosen here, because it can run in memory of the same process and thus has the least extra overhead compared to jOOQ – so jOOQ’s overhead contributions become significant in a profiling session / benchmark. In fact, it can be shown that in such a benchmark, jOOQ (or Hibernate, etc.) appears to perform quite poorly compared to a JDBC only solution, as many have done before.

This is an important moment to remind ourselves:

Benchmarks do not reflect real-world use cases! You will never run the exact same query 3 million times on a production system, and your production system doesn’t run on H2.

A benchmark profits from so much caching, buffering, you would never perform as fast as in a benchmark.

Always be careful not to draw any wrong conclusions from a benchmark!

This needs to be said, so take every benchmark you find on the web with a grain of salt. This includes our own!

The query being profiled is:


The trivial query returns a ridiculous 4 rows and 4 columns, so the “R-loop” and “C-loops” are negligible. This benchmark is really testing the overhead of jOOQ query execution in a case where the database does not contribute much to the execution time. Again, in a real world scenario, you will get much more overhead from your database.

In the following sections, I’ll show a few minor bottlenecks that could be found when drilling down into such execution scenarios. As I’ve switched between JMC versions, the screenshots will not always be the same, I’m afraid.

1. Instance allocation of constant values

A very silly mistake was easily discovered right away:

The mistake didn’t contribute a whole lot of overhead, only 1.1% to the sampled time spent, but it made me curious. In version 3.10 of jOOQ, the SelectQueryImpl’s Limit class, which encodes the jOOQ OFFSET / LIMIT behaviour, kept allocating this DSL.val() thingy, which is a bind variable. Sure, limits do work with bind variables, but this happened when SelectQueryImpl was initialised, not when the LIMIT clause is added by the jOOQ API user.

As can be seen in the sources, the following logic was there:

private static final Field<Integer> ZERO              = zero();
private static final Field<Integer> ONE               = one();
private Field<Integer>              numberOfRowsOrMax = DSL.inline(Integer.MAX_VALUE);

While the “special limits” ZERO and ONE were static members, the numberOfRowsOrMax value wasn’t. That’s the instantiation we were measuring in JMC. The member is not a constant, but the default value is. It is always initialised with Integer.MAX_VALUE wrapped in a DSL.inline() call. The solution is really simple:

private static final Param<Integer> MAX               = DSL.inline(Integer.MAX_VALUE);
private Field<Integer>              numberOfRowsOrMax = MAX;

This is obviously better! Not only does it avoid the allocation of the bind variable, it also avoids the boxing of Integer.MAX_VALUE (which can also be seen in the sampling screenshot).

Note, a similar optimisation is available in the JDK’s ArrayList. When you look at the sources, you’ll see:

/**
 * Shared empty array instance used for empty instances.
 */
private static final Object[] EMPTY_ELEMENTDATA = {};

When you initialise an ArrayList without initial capacity, it will reference this shared instance, instead of creating a new, empty (or even non-empty) array. This delays the allocation of such an array until we actually add things to the ArrayList, just in case it stays empty.

jOOQ’s LIMIT is the same. Most queries might not have a LIMIT, so better not allocate that MAX_VALUE afresh!
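The pattern can be sketched without any jOOQ types. In the following example, Value is a hypothetical stand-in for an immutable jOOQ Param, and LimitSketch is a made-up name, not jOOQ’s actual Limit class. The point is only that every new instance references one shared default, and that an allocation happens only when a user actually sets a limit. This is safe only as long as the shared value is immutable.

```java
// Hypothetical sketch: LimitSketch and Value are not jOOQ classes.
final class LimitSketch {

    /** Immutable stand-in for a jOOQ Param (assumption for this example). */
    static final class Value {
        final int value;

        Value(int value) {
            this.value = value;
        }
    }

    // Allocated exactly once, when the class is initialised
    private static final Value MAX = new Value(Integer.MAX_VALUE);

    // Every instance starts out referencing the shared default
    private Value numberOfRowsOrMax = MAX;

    void setNumberOfRows(int rows) {
        // Only an explicit limit pays for an allocation
        numberOfRowsOrMax = new Value(rows);
    }

    Value numberOfRowsOrMax() {
        return numberOfRowsOrMax;
    }

    public static void main(String[] args) {
        LimitSketch a = new LimitSketch();
        LimitSketch b = new LimitSketch();

        // Both instances share the same default object
        System.out.println(a.numberOfRowsOrMax() == b.numberOfRowsOrMax());

        a.setNumberOfRows(10);
        System.out.println(a.numberOfRowsOrMax().value);
    }
}
```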

This is done once per “E-loop” iteration

One issue down: https://github.com/jOOQ/jOOQ/issues/6635

2. Copying lists in internals

This is really a micro optimisation that you probably shouldn’t do in ordinary business logic. But it might be worthwhile in infrastructure logic, e.g. when you’re also in an “S-E-R-C loop”:

jOOQ (unfortunately) occasionally copies data around between arrays, e.g. wrapping Strings in jOOQ wrapper types, transforming numbers to strings, etc. These loops aren’t bad per se, but remember, we’re inside some level of the “S-E-R-C loop”, so these copying operations might be run hundreds of millions of times when we run a statement 3 million times.

The above loop didn’t contribute a lot of overhead, and possibly the cloned object was stack allocated or the clone call eliminated by the JIT. But maybe it wasn’t. The QualifiedName class cloned its argument prior to returning it to make sure that no accidental modifications will have any side effect:

private static final String[] nonEmpty(String[] qualifiedName) {
    String[] result;
    // ...
    if (nulls > 0) {
        result = new String[qualifiedName.length - nulls];
        // ...
    }
    else {
        result = qualifiedName.clone();
    }
    return result;
}

So, the implementation of the method guaranteed a new array as a result.

After a bit of analysis, it could be seen that there is only a single consumer of this method, and it doesn’t leave that consumer. So, it’s safe to remove the clone call. Probably, the utility was refactored from a more general purpose method into this local usage.
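Here is a rough sketch of what such a method could look like once the clone is gone. This is not the actual jOOQ implementation: the class name is made up, and I am assuming that “nulls” counts the null or empty parts of the name. When there is nothing to filter out, the argument itself is returned, which is only acceptable because the single caller never modifies or leaks the array.

```java
import java.util.Arrays;

// Hypothetical sketch, not jOOQ's QualifiedName implementation.
final class NonEmptySketch {

    /**
     * Drops null or empty parts (an assumption about what "nulls" counts).
     * Returns the argument itself when nothing has to be dropped.
     */
    static String[] nonEmpty(String[] qualifiedName) {
        int nulls = 0;

        for (String part : qualifiedName)
            if (part == null || part.isEmpty())
                nulls++;

        // No defensive clone(): the caller is trusted not to modify the array
        if (nulls == 0)
            return qualifiedName;

        String[] result = new String[qualifiedName.length - nulls];
        int i = 0;

        for (String part : qualifiedName)
            if (part != null && !part.isEmpty())
                result[i++] = part;

        return result;
    }

    public static void main(String[] args) {
        String[] name = { "PUBLIC", "BOOK", "TITLE" };

        System.out.println(nonEmpty(name) == name);
        System.out.println(Arrays.toString(nonEmpty(new String[] { null, "BOOK", "" })));
    }
}
```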

This is done several times per “E-loop” iteration

One more issue down: https://github.com/jOOQ/jOOQ/issues/6640

3. Running checks in loops

This one is too silly to be true:

There’s a costly overhead in the CombinedCondition constructor (<init> method). Notice, how the samples drop from 0.47% to 0.32% between the constructor and the next method init(), that’s the time spent inside the constructor.

A tiny amount of time, but this time is spent every time someone combines two conditions / predicates with AND and OR. Every time. We can probably save this time. The problem is this:

CombinedCondition(Operator operator, Collection<? extends Condition> conditions) {
    for (Condition condition : conditions)
        if (condition == null)
            throw new IllegalArgumentException("The argument 'conditions' must not contain null");

    init(operator, conditions);
}

There’s a loop over the arguments to give some meaningful error messages. That’s a bit too defensive, I suspect. How about we simply live with the NPE when it arises, as this should be rather unexpected (for the context, jOOQ hardly ever checks on parameters like this, so this should also be removed for consistency reasons).

This is done several times per “E-loop” iteration

One more issue down: https://github.com/jOOQ/jOOQ/issues/6666 (nice number)

4. Lazy initialisation of lists

The nature of the JDBC API forces us to work with ThreadLocal variables, very unfortunately, as it is not possible to pass arguments from parent SQLData objects to children, especially when we combine nesting of Oracle TABLE/VARRAY and OBJECT types.

In this analysis, we’re combining the profiler’s CPU sampling with its memory sampling:

In the CPU sampling view above, we can see some overhead in the DefaultExecuteContext, which is instantiated once per “E-loop” iteration. Again, not a huge overhead, but let’s look at what this constructor does. It contributes to the overall allocations of ArrayList:

When we select the type in JMC, the other view will then display all the stack traces where ArrayList instances were allocated, among which, again, our dear DefaultExecuteContext constructor:

Where are those ArrayLists allocated? Right here:

BLOBS.set(new ArrayList<Blob>());
CLOBS.set(new ArrayList<Clob>());
SQLXMLS.set(new ArrayList<SQLXML>());
ARRAYS.set(new ArrayList<Array>());

Every time we start executing a query, we initialise a list for each ones of these types. All of our variable binding logic will then register any possibly allocated BLOB or CLOB, etc. such that we can clean these up at the end of the execution (a JDBC 4.0 feature that not everyone knows of!):

static final void register(Blob blob) {
    // ...
}

static final void clean() {
    List<Blob> blobs = BLOBS.get();

    if (blobs != null) {
        for (Blob blob : blobs) {
            // ...
        }
        // ...
    }
}

Don’t forget calling Blob.free() et al, if you’re working with JDBC directly!

But the truth is, in most cases, we don’t really need these things. We need them only in Oracle, and only if we’re using TABLE / VARRAY or OBJECT types, due to some JDBC restrictions. Why punish all the users of other databases with this overhead? Instead of a sophisticated refactoring, which risks introducing regressions (https://github.com/jOOQ/jOOQ/issues/4205), we can simply initialise these lists lazily. We leave the clean() method as it is, remove the initialisation in the constructor, and replace the register() logic by this:

static final void register(Blob blob) {
    List<Blob> list = BLOBS.get();

    if (list == null) {
        list = new ArrayList<Blob>();
        BLOBS.set(list);
    }

    list.add(blob);
}

That was easy. And significant. Check out the new allocation measurements:

Note that every allocation, apart from the overhead of allocating things, also incurs additional overhead when the object is garbage collected. That’s a bit trickier to measure and correlate. In general, less allocations is almost always a good thing, except if the allocation is super short lived, in case of which stack allocation can happen, or the logic can even be eliminated by the JIT.
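The lazy registration idea can be tried out in isolation. The following is a minimal sketch, not jOOQ’s DefaultExecuteContext: LazyRegistry is a made-up name, and AutoCloseable stands in for Blob, Clob, etc. (so close() plays the role of Blob.free()). The ThreadLocal has no initial value, so nothing is allocated for executions that never register a resource.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: AutoCloseable stands in for java.sql.Blob and friends.
final class LazyRegistry {

    // No initial value: get() returns null until something is registered
    private static final ThreadLocal<List<AutoCloseable>> RESOURCES = new ThreadLocal<>();

    static void register(AutoCloseable resource) {
        List<AutoCloseable> list = RESOURCES.get();

        // Allocate the list only when the first resource shows up
        if (list == null) {
            list = new ArrayList<>();
            RESOURCES.set(list);
        }

        list.add(resource);
    }

    /** Closes all resources registered by the current thread and returns how many were closed. */
    static int clean() {
        List<AutoCloseable> list = RESOURCES.get();
        int closed = 0;

        if (list != null) {
            for (AutoCloseable resource : list) {
                try {
                    resource.close();
                    closed++;
                }
                catch (Exception ignore) {
                    // Best effort clean up, nothing more to do here
                }
            }

            RESOURCES.remove();
        }

        return closed;
    }

    /** Whether a list has been allocated for the current thread. */
    static boolean allocated() {
        return RESOURCES.get() != null;
    }
}
```

A typical execution would call clean() in a finally block. If register() was never called, clean() finds null and does nothing.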

This is done several times per “E-loop” iteration

One more issue down: https://github.com/jOOQ/jOOQ/issues/6669

6. Using String.replace()

This is mostly a problem in JDK 8 only, JDK 9 fixed string replacing by no longer relying on regular expressions internally. In JDK 8, however (and jOOQ still supports Java 6, so this is relevant), string replacement works through regular expressions as can be seen here:

The Pattern implementation allocates quite a few int[] instances, even if that’s probably not strictly needed for non-regex patterns as those of String.replace():

I’ve already analysed this in a previous blog post, which can be seen here:


This is done several times per “E-loop” iteration

One more issue down: https://github.com/jOOQ/jOOQ/issues/6672
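For illustration, a literal replacement can be written with nothing but indexOf() and a StringBuilder, avoiding java.util.regex altogether. This is a minimal sketch under my own assumptions, not necessarily what jOOQ does internally: the Replace class is hypothetical, and unlike String.replace(), it simply returns the input unchanged for an empty search string.

```java
// Hypothetical sketch of a regex-free literal replacement.
final class Replace {

    /** Replaces all occurrences of search by replacement without using regular expressions. */
    static String replace(String text, String search, String replacement) {

        // Assumption of this sketch: an empty search string leaves the text untouched
        if (search.isEmpty())
            return text;

        int start = text.indexOf(search);

        // Nothing to replace means nothing to allocate
        if (start < 0)
            return text;

        StringBuilder sb = new StringBuilder(text.length());
        int end = 0;

        do {
            // Copy the text between the previous match and this one, then the replacement
            sb.append(text, end, start).append(replacement);
            end = start + search.length();
            start = text.indexOf(search, end);
        }
        while (start >= 0);

        // Copy whatever follows the last match
        return sb.append(text, end, text.length()).toString();
    }

    public static void main(String[] args) {
        System.out.println(replace("a.b.c", ".", ".."));
    }
}
```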

7. Registering an SPI that is going to be inactive

This one was a bit more tricky to solve as it relies on a deeper analysis. Unfortunately, I have no profiling screenshots available anymore, but it is easy to explain with code. There’s an internal ExecuteListeners utility, which abstracts over the ExecuteListener SPIs. Users can register such a listener and listen to query rendering, variable binding, query execution, and other lifecycle events. By default, there is no such ExecuteListener by the users, but there’s always one internal ExecuteListener:

private static ExecuteListener[] listeners(ExecuteContext ctx) {
    List<ExecuteListener> result = new ArrayList<ExecuteListener>();

    for (ExecuteListenerProvider provider : ctx.configuration()
        if (provider != null)
            result.add(provider.provide());

    if (!FALSE.equals(ctx.settings().isExecuteLogging()))
        result.add(new LoggerListener());

    return result.toArray(EMPTY_EXECUTE_LISTENER);
}

The LoggerListener is added by default, unless users turn off that feature. Which means:

  • We’ll pretty much always get this ArrayList
  • We’ll pretty much always loop over this list
  • We’ll pretty much always call this LoggerListener

But what does it do? It logs stuff on DEBUG and TRACE level. For instance:

public void executeEnd(ExecuteContext ctx) {
    if (ctx.rows() >= 0)
        if (log.isDebugEnabled())
            log.debug("Affected row(s)", ctx.rows());
}

That’s what it does by definition. It’s a debug logger. So, the improved logic for initialising this thing is the following:

private static final ExecuteListener[] listeners(ExecuteContext ctx) {
    List<ExecuteListener> result = null;

    for (ExecuteListenerProvider provider : ctx.configuration()
        if (provider != null)
            (result = init(result)).add(provider.provide());

    if (!FALSE.equals(ctx.settings().isExecuteLogging())) {
        if (LOGGER_LISTENER_LOGGER.isDebugEnabled())
            (result = init(result)).add(new LoggerListener());
    }

    return result == null ? null : result.toArray(EMPTY_EXECUTE_LISTENER);
}

We’re no longer allocating the ArrayList (that might be premature, the JIT might have rewritten this allocation to not happen, but OK), and we’re only adding the LoggerListener if DEBUG or TRACE logging is enabled for it, i.e. if it would do any work at all.
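The init(result) helper isn’t shown in the listing, so here is a guess at what such a helper could look like, together with a simplified version of the surrounding logic. Everything in this sketch is an assumption made for illustration: LazyListeners is a made-up class, Runnable stands in for ExecuteListener, and a plain boolean replaces the logger check.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: Runnable stands in for ExecuteListener.
final class LazyListeners {

    /** Returns the list itself, or a new list if it has not been created yet. */
    static <T> List<T> init(List<T> list) {
        return list == null ? new ArrayList<T>() : list;
    }

    static Runnable[] listeners(List<Runnable> provided, boolean debugEnabled) {
        List<Runnable> result = null;

        for (Runnable listener : provided)
            if (listener != null)
                (result = init(result)).add(listener);

        // The "logger listener" is only added if it would do any work at all
        if (debugEnabled)
            (result = init(result)).add(() -> System.out.println("debug"));

        // No listeners: no list and no array were ever allocated
        return result == null ? null : result.toArray(new Runnable[0]);
    }
}
```

The price is that callers now have to handle a null return value, which is fine for an internal utility but would be a questionable choice in public API.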

That’s just a couple of CPU cycles we can save on every execution. Again, I don’t have the profiling measurements anymore, but trust me. It helped.

This is done several times per “E-loop” iteration

One more issue down: https://github.com/jOOQ/jOOQ/issues/6747

8. Eager allocation where lazy allocation works

Sometimes, we need two different representations of the same information. The “raw” representation, and a more useful, pre-processed representation for some purposes. This was done, for instance, in QualifiedField:

private final Name          name;
private final Table<Record> table;

QualifiedField(Name name, DataType<T> type) {
    super(name, type);

    this.name = name;
    this.table = name.qualified()
        ? DSL.table(name.qualifier())
        : null;
}

public final void accept(Context<?> ctx) {
    // ...
}

public final Table<Record> getTable() {
    return table;
}

As can be seen, the name is really the beef of this class. It’s a qualified name that generates itself on the SQL string. The Table representation is useful when navigating the meta model, but this is hardly ever done by jOOQ’s internals and/or user facing code.

However, this eager initialisation is costly:

Quite a few UnqualifiedName[] arrays are allocated by the call to Name.qualifier(). We can easily make that table reference non-final and calculate it lazily:

private final Name              name;
private Table<Record>           table;

QualifiedField(Name name, DataType<T> type) {
    super(name, type);

    this.name = name;
}

public final Table<Record> getTable() {
    if (table == null)
        table = name.qualified() ? DSL.table(name.qualifier()) : null;

    return table;
}

Because name is final, we could call table “effectively final” (in a different meaning than the Java language’s) – we won’t have any thread safety issues because these particular types are immutable inside of jOOQ.
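The same idea, stripped of all jOOQ types, might look like the sketch below. QualifiedSketch is a hypothetical class: a plain dotted String plays the role of Name, and the derived qualifier plays the role of the Table. The computations counter exists only to show that the derived value is calculated once, and only on demand. Two threads racing on getQualifier() may both compute the value, but since they derive equal, immutable results from the same final field, that race should be harmless.

```java
// Hypothetical sketch: a dotted String stands in for jOOQ's Name.
final class QualifiedSketch {

    private final String name;      // the "raw" representation, e.g. "PUBLIC.BOOK.TITLE"
    private String       qualifier; // derived lazily, null until first requested

    int computations;               // for demonstration only

    QualifiedSketch(String name) {
        this.name = name;
    }

    /** Returns everything before the last dot, or null for an unqualified name. */
    String getQualifier() {
        if (qualifier == null) {
            int dot = name.lastIndexOf('.');

            if (dot >= 0) {
                qualifier = name.substring(0, dot);
                computations++;
            }
        }

        return qualifier;
    }

    public static void main(String[] args) {
        QualifiedSketch field = new QualifiedSketch("PUBLIC.BOOK.TITLE");

        System.out.println(field.getQualifier());
        System.out.println(field.getQualifier());
        System.out.println(field.computations);
    }
}
```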

This is done several times per “E-loop” iteration

One more issue down: https://github.com/jOOQ/jOOQ/issues/6755


Now, thus far, we’ve “improved” many low hanging fruit based on a profiler session (that was run, ahem, from outside of Eclipse on a rather busy machine). This wasn’t very scientific. Just tracking down “bottlenecks” which triggered my interest by having high enough numbers to even notice. This is called “micro optimisation”, and it is only worth the trouble if you’re in a “S-E-R-C loop”, meaning that the code you’re optimising is executed many, many times. For me, developing jOOQ, this is almost always the case, because jOOQ is a library used by a lot of people who all profit from these optimisations. In many other cases, this might be called “premature optimisation”.

But once we’ve optimised, we shouldn’t stop. I’ve done a couple of individual JMH benchmarks for many of the above problems, to see if they were really an improvement. But sometimes, in a JMH benchmark, something that doesn’t look like an improvement will still be an improvement in the bigger picture. The JVM doesn’t inline all methods 100 levels deep. If your algorithm is complex, perhaps a micro optimisation will still have an effect that would not have any effect on a JMH benchmark.

Unfortunately, this isn’t a very exact science, but with enough intuition, you’ll find the right spots to optimise.

In my case, I verified progress over two patch releases: 3.10.0 -> 3.10.1 -> 3.10.2 (not yet released) by running a JMH benchmark over the entire query execution (including H2’s part). The result of applying roughly 15 of the above and similar optimisations (~2 days’ worth of effort) is:

JDK 9 (9+181)

jOOQ 3.10.0 Open Source Edition

Benchmark                          Mode   Cnt       Score      Error  Units
ExecutionBenchmark.testExecution   thrpt   21  101891.108 ± 7283.832  ops/s

jOOQ 3.10.2 Open Source Edition

Benchmark                          Mode   Cnt       Score      Error  Units
ExecutionBenchmark.testExecution   thrpt   21  110982.940 ± 2374.504  ops/s

JDK 8 (1.8.0_145)

jOOQ 3.10.0 Open Source Edition

Benchmark                          Mode   Cnt       Score      Error  Units
ExecutionBenchmark.testExecution   thrpt   21  110178.873 ± 2134.894  ops/s

jOOQ 3.10.2 Open Source Edition

Benchmark                          Mode   Cnt       Score      Error  Units
ExecutionBenchmark.testExecution   thrpt   21  118795.922 ± 2661.653  ops/s

As can be seen, in both JDK versions, we’ve gotten roughly a 10% speed increase. What’s also interesting is that JDK 8 seems to have been roughly 10% faster than JDK 9 in this benchmark, although this can be due to a variety of things that I haven’t considered yet, and which are out of scope for this discussion.
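For those who like to check the arithmetic, the relative gain can be computed from the JMH scores as (after - before) / before. The SpeedUp class below is just a throwaway helper for this post. Plugging in the scores from above yields about 8.9% on JDK 9 and about 7.8% on JDK 8, so “roughly 10%” is on the generous side, and given the error margins reported by JMH (especially the ± 7283 ops/s on the JDK 9 baseline), these figures should be read as approximations.

```java
// Throwaway helper for checking the numbers, not part of the benchmark itself.
final class SpeedUp {

    /** Relative throughput gain in percent, e.g. going from 100 to 110 ops/s is a gain of 10. */
    static double gainPercent(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // Scores in ops/s, copied from the JMH output above
        System.out.printf("JDK 9: %.1f%%%n", gainPercent(101891.108, 110982.940));
        System.out.printf("JDK 8: %.1f%%%n", gainPercent(110178.873, 118795.922));
    }
}
```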


This iterative approach to tackling performance is definitely worth it for library authors:

  • run a representative benchmark (repeat a task millions of times)
  • profile it
  • track down “bottlenecks”
  • if they’re easy to fix without regression risk, do it
  • repeat
  • after a while, verify with JMH

Individual improvements are quite hard to measure, or measure correctly. But when you do 10-15 of them, they start adding up and become significant. 10% can make a difference.

Looking forward to your comments, alternative techniques, alternative tools, etc.!

If you liked this article, you will also like Top 10 Easy Performance Optimisations in Java

jOOQ Tuesdays: Nicolai Parlog Talks About Java 9

Welcome to the jOOQ Tuesdays series. In this series, we’ll publish an article on the third Tuesday every other month where we interview someone we find exciting in our industry from a jOOQ perspective. This includes people who work with SQL, Java, Open Source, and a variety of other related topics.

I’m very excited to feature today Nicolai Parlog, author of The Java Module System

Nicolai, your blog is an “archeological” treasure trove for everyone who wants to learn about why Java expert group decisions were made. What made you dig out all these interesting discussions on the mailing lists?

Ha, thank you, didn’t know I was sitting on a treasure.

It all started with everyone’s favorite bikeshed: Optional. After using it for a few months, I was curious to learn more about the reason behind its introduction to Java and why it was designed the way it was, so I started digging and learned a few things:

  • Pipermail, the JDK mailing list archive, is a horrible place to peruse and search.
  • Mailing list discussions are often lengthy, fragmented, and thus hard to revisit.
  • Brian Goetz was absolutely right: Everything related to Optional seems to take 300 messages.

Consequently, researching that post about Optional’s design took a week or so. But as you say, it’s interesting to peek behind the curtain and once a discussion is condensed to its most relevant positions and peppered with some context it really appeals to the wider Java community.

I actually think there’s a niche to be filled, here. Imagine there were a site that did regularly (at least once a week) what I did with a few selected topics: Follow the JDK mailing list, summarize ongoing discussions, and make them accessible to a wide audience. That would be a great service to the Java community as it would make it much easier to follow what is going on and to chime in with an informed opinion when you feel you have something to contribute. Now we just need to find someone with a lot of free time on their hands.

By the way, I think it’s awesome that the comparatively open development of the JDK makes that possible.

I had followed your blog after Java 8 came out, where you explained expert group decisions in retrospect. Now, you’re mostly covering what’s new in Java 9. What are your favourite “hidden” (i.e. non-Jigsaw) Java 9 features and why?

From the few language changes, it’s easy pickings: definitely private interface methods. I’ve been in the situation more than once that I wanted to share code between default methods but found no good place to put it without making it part of the public API. With private methods in interfaces, that’s a thing of the past.

When it comes to API changes, the decision is much harder as there is more to choose from. People definitely like collection factory methods and I do, too, but I think I’ll go with the changes to Stream and Optional. I really enjoy using those Java 8 features and think it’s great that they’ve been improved in 9.

A JVM feature I really like are multi-release JARs. The ability to ship a JAR that uses the newest APIs, but degrades gracefully on older JVMs will come in very handy. Some projects, Spring for example, already do this, but without JVM support it’s not exactly pleasant.

Can I go on? Because there’s so much more! Just two: Unified logging makes it much easier to tease out JVM log messages without having to configure logging for different subsystems and compact strings and indified string concatenation make working with strings faster, reduce garbage and conserve heap space (on average, 10% to 15% less memory!). Ok, that were three, but there you go.

You’re writing a book on the Java 9 module system that can already be pre-ordered on Manning. What will readers get out of your book?

All they need to become module system experts. Of course it explains all the basics (declaring, compiling, packaging, and running modular applications) and advanced features (services, implied readability, optional dependencies, etc), but it goes far beyond that. More than how to use a feature it also explains when and why to use it, which nuances to consider, and what are good defaults if you’re not sure which way to go.

It’s also full of practical advice. I migrated two large applications to Java 9 (compiling and running on the new release, not turning them into modules) and that experience as well as the many discussions on the mailing list informed a big chapter on migration. If readers are interested in a preview, I condensed it into a post on the most common Java 9 migration challenges. I also show how to debug modules and the module system with various tools (JDeps for example) and logging (that’s when I started using unified logging). Last but not least, I plan to include a chapter that simply lists error messages and what to do about them.

In your opinion, what are the good parts and the bad parts about Jigsaw? Do you think Jigsaw will be adopted quickly?

The good, the bad, and the ugly, eh? My favorite feature (of all of Java 9 actually) is strong encapsulation. The ability to have types that are public only within a module is incredibly valuable! This adds another option to the private-to-public-axis and once people internalize that feature we will wonder how we ever lived without it. Can you imagine giving up private? We will think the same about exported.

I hope the worst aspect of the module system will be the compatibility challenges. That’s a weird way to phrase it, but let me explain. These challenges definitely exist and they will require a non-negligible investment from the Java community as a whole to get everything working on Java 9, in the long run as modules. (As an aside: This is well invested time – much of it pays back technical debt.)

My hope is that no other aspect of the module system turns out to be worse. One thing I’m a little concerned about is the strictness of reliable configuration. I like the general principle and I’m definitely one for enforcing good practices, but just think about all those POMs that busily exclude transitive dependencies. Once all those JARs are modules, that won’t work – the module system will not let you launch without all dependencies present.

Generally speaking, the module system makes it harder to go against the maintainers’ decisions. Making internal APIs available via reflection or altering dependencies now goes against the grain of a mechanism that is built deeply into the compiler and JVM. There are of course a number of command line flags to affect the module system but they don’t cover everything. To come back to excluding dependencies, maybe --ignore-missing-modules ${modules} would be a good idea…

Regarding adoption rate, I expect it to be slower than Java 8. But leaving those projects aside that see every new version as insurmountable and are still on Java 6, I’m sure the vast majority will migrate eventually. If not for Java 9’s features then surely for future ones. As a friend and colleague once said: “I’ll do everything to get to value types.”

Now that Java 9 is out and “legacy”, what Java projects will you cover next in your blog and your work?

Oh boy, I’m still busy with Java 9. First I have to finish the book (November hopefully) and then I want to do a few more migrations because I actually like doing that for some weird and maybe not entirely healthy reason (the things you see…). FYI, I’m for hire, so if readers are stuck with their migration they should reach out.

Beyond that, I’m already looking forward to primitive specialization, e.g. ArrayList<int>, and value types (both from Project Valhalla) as well as the changes Project Amber will bring to Java. I’m sure I’ll start discussing those in 2018.

Another thing I’ll keep myself busy with and which I would love your readers to check out is my YouTube channel. It’s still very young and until the book’s done I won’t do a lot of videos (hope to record one next week), but I’m really thrilled about the whole endeavour!

Are Java 8 Streams Truly Lazy? Not Completely!

Notice, this issue has been fixed in Java 8 (8u222), thanks for the comment Zheka Kozlov

In a recent article, I’ve shown that programmers should always apply a filter first, map later strategy with streams. The example I made there was this one:

    .map(e -> superExpensiveMapping(e))

In this case, the limit() operation implements the filtering, which should take place before the mapping.

Several readers correctly mentioned that in this case, it doesn’t matter what order we’re putting the limit() and map() operations, because most operations are evaluated lazily in the Java 8 Stream API.

Or rather: The collect() terminal operation pulls values from the stream lazily, and as the limit(5) operation reaches the end, it will no longer produce new values, regardless of whether map() came before or after. This can be proven easily as follows:

import java.util.stream.Stream;

public class LazyStream {
    public static void main(String[] args) {

        // limit() after map()
        Stream.iterate(0, i -> i + 1)
              .map(i -> i + 1)
              .peek(i -> System.out.println("Map: " + i))
              .limit(5)
              .forEach(i -> {});

        System.out.println();

        // limit() before map()
        Stream.iterate(0, i -> i + 1)
              .limit(5)
              .map(i -> i + 1)
              .peek(i -> System.out.println("Map: " + i))
              .forEach(i -> {});
    }
}

The output of the above is:

Map: 1
Map: 2
Map: 3
Map: 4
Map: 5

Map: 1
Map: 2
Map: 3
Map: 4
Map: 5

But this isn’t always the case!

This optimisation is an implementation detail, and in general, it is not unwise to really apply the filter first, map later rule thoroughly, not relying on such an optimisation. In particular, the Java 8 implementation of flatMap() is not lazy. Consider the following logic, where we put a flatMap() operation in the middle of the stream:

import java.util.stream.Stream;

public class LazyStream {
    public static void main(String[] args) {

        // limit() after map()
        Stream.iterate(0, i -> i + 1)
              .flatMap(i -> Stream.of(i, i, i, i))
              .map(i -> i + 1)
              .peek(i -> System.out.println("Map: " + i))
              .limit(5)
              .forEach(i -> {});

        System.out.println();

        // limit() before map()
        Stream.iterate(0, i -> i + 1)
              .flatMap(i -> Stream.of(i, i, i, i))
              .limit(5)
              .map(i -> i + 1)
              .peek(i -> System.out.println("Map: " + i))
              .forEach(i -> {});
    }
}

The result is now:

Map: 1
Map: 1
Map: 1
Map: 1
Map: 2
Map: 2
Map: 2
Map: 2

Map: 1
Map: 1
Map: 1
Map: 1
Map: 2

So, the first Stream pipeline will map all the 8 flatmapped values prior to applying the limit, whereas the second Stream pipeline really limits the stream to 5 elements first, and then maps only those.

The reason for this is in the flatMap() implementation:

// In ReferencePipeline.flatMap()
try (Stream<? extends R> result = mapper.apply(u)) {
    if (result != null)
        result.sequential().forEach(downstream);
}

As you can see, the result of the flatMap() operation is consumed eagerly with a terminal forEach() operation, which will always produce all the four values in our case and send them to the next operation. So, flatMap() isn’t lazy, and thus the next operation after it will get all of its results. This is true for Java 8. Future Java versions might improve this, of course.

We better filter them first. And map later.
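
The difference can also be measured rather than read off the console. The following is a minimal sketch, assuming a hypothetical helper class named MappingCounter that counts how often the mapping function runs in each ordering. On an affected Java 8 build, the first count is expected to be 8, on fixed versions it should be 5. The second count is 5 either way, which is the point of the rule:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Hypothetical helper, for illustration only
final class MappingCounter {

    // Counts map() calls when limit(5) comes after flatMap() and map()
    static int mapThenLimit() {
        AtomicInteger calls = new AtomicInteger();
        Stream.iterate(0, i -> i + 1)
              .flatMap(i -> Stream.of(i, i, i, i))
              .map(i -> { calls.incrementAndGet(); return i + 1; })
              .limit(5)
              .forEach(i -> {});
        return calls.get();
    }

    // Counts map() calls when limit(5) comes before map()
    static int limitThenMap() {
        AtomicInteger calls = new AtomicInteger();
        Stream.iterate(0, i -> i + 1)
              .flatMap(i -> Stream.of(i, i, i, i))
              .limit(5)
              .map(i -> { calls.incrementAndGet(); return i + 1; })
              .forEach(i -> {});
        return calls.get();
    }

    public static void main(String[] args) {
        // The first number depends on the JDK version in use
        System.out.println("map, then limit: " + mapThenLimit());
        System.out.println("limit, then map: " + limitThenMap());
    }
}
```

Limiting first makes the number of mapping calls independent of how a given JDK implements flatMap().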

Update: flatMap() gets fixed in JDK 10

Thanks, Tagir Valeev, for pointing out that there’s a fix coming up.

However, there’s still a bug when using Stream.iterator(): https://bugs.openjdk.java.net/browse/JDK-8267359

A Nice API Design Gem: Strategy Pattern With Lambdas

With Java 8 lambdas being available to us as a programming tool, there is a “new” and elegant way of constructing objects. I put “new” in quotes, because it’s not new. It used to be called the strategy pattern, but as I’ve written on this blog before, many GoF patterns will no longer be implemented in their classic OO way, now that we have lambdas.

A simple example from jOOQ

jOOQ knows a simple type called Converter. It’s a simple SPI, which allows users to implement custom data types and inject data type conversion into jOOQ’s type system. The interface looks like this:

public interface Converter<T, U> {
    U from(T databaseObject);
    T to(U userObject);
    Class<T> fromType();
    Class<U> toType();
}

Users will have to implement 4 methods:

  • Conversion from a database (JDBC) type T to the user type U
  • Conversion from the user type U to the database (JDBC) type T
  • Two methods providing a Class reference, to work around generic type erasure

Now, an implementation that converts hex strings (database) to integers (user type):

public class HexConverter implements Converter<String, Integer> {

    @Override
    public Integer from(String hexString) {
        return hexString == null 
            ? null 
            : Integer.parseInt(hexString, 16);
    }

    @Override
    public String to(Integer number) {
        return number == null 
            ? null 
            : Integer.toHexString(number);
    }

    @Override
    public Class<String> fromType() {
        return String.class;
    }

    @Override
    public Class<Integer> toType() {
        return Integer.class;
    }
}

That wasn’t difficult to write, but it’s quite boring to write this much boilerplate:

  • Why do we need to give this class a name?
  • Why do we need to override methods?
  • Why do we need to handle nulls ourselves?

Now, we could write some object oriented libraries, e.g. abstract base classes that take care at least of the fromType() and toType() methods, but much better: The API designer can provide a “constructor API”, which allows users to provide “strategies”, which is just a fancy name for “function”. One function (i.e. lambda) for each of the four methods. For example:

public interface Converter<T, U> {

    static <T, U> Converter<T, U> of(
        Class<T> fromType,
        Class<U> toType,
        Function<? super T, ? extends U> from,
        Function<? super U, ? extends T> to
    ) {
        return new Converter<T, U>() { ... boring code here ... }
    }

    static <T, U> Converter<T, U> ofNullable(
        Class<T> fromType,
        Class<U> toType,
        Function<? super T, ? extends U> from,
        Function<? super U, ? extends T> to
    ) {
        return of(
            fromType,
            toType,

            // Boring null handling code here
            t -> t == null ? null : from.apply(t),
            u -> u == null ? null : to.apply(u)
        );
    }
}

From now on, we can easily write converters in a functional way. For example, our HexConverter would become:

Converter<String, Integer> converter = Converter.ofNullable(
    String.class, Integer.class,
    s -> Integer.parseInt(s, 16),
    Integer::toHexString);

Wow! This is really nice, isn’t it? This is the pure essence of what it means to write a Converter. No more overriding, null handling, type juggling, just the bidirectional conversion logic.
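
To make the “boring code” concrete, here is one way such a constructor API could be filled in. This is a minimal sketch, assuming a stand-in interface named SketchConverter (a hypothetical name, not jOOQ’s own type) with the same four methods as the interface above. The HexConverterDemo class is likewise only for illustration:

```java
import java.util.function.Function;

// Hypothetical stand-in for the Converter interface discussed above
interface SketchConverter<T, U> {
    U from(T databaseObject);
    T to(U userObject);
    Class<T> fromType();
    Class<U> toType();

    // The "boring" part: delegate each method to the supplied strategy
    static <T, U> SketchConverter<T, U> of(
        Class<T> fromType,
        Class<U> toType,
        Function<? super T, ? extends U> from,
        Function<? super U, ? extends T> to
    ) {
        return new SketchConverter<T, U>() {
            @Override public U from(T t) { return from.apply(t); }
            @Override public T to(U u) { return to.apply(u); }
            @Override public Class<T> fromType() { return fromType; }
            @Override public Class<U> toType() { return toType; }
        };
    }

    // Same as of(), but nulls never reach the user-supplied functions
    static <T, U> SketchConverter<T, U> ofNullable(
        Class<T> fromType,
        Class<U> toType,
        Function<? super T, ? extends U> from,
        Function<? super U, ? extends T> to
    ) {
        return of(
            fromType,
            toType,
            t -> t == null ? null : from.apply(t),
            u -> u == null ? null : to.apply(u)
        );
    }
}

final class HexConverterDemo {
    static final SketchConverter<String, Integer> HEX = SketchConverter.ofNullable(
        String.class, Integer.class,
        s -> Integer.parseInt(s, 16),
        Integer::toHexString
    );

    public static void main(String[] args) {
        System.out.println(HEX.from("ff"));  // 255
        System.out.println(HEX.to(255));     // ff
        System.out.println(HEX.from(null));  // null
    }
}
```

The anonymous class is written exactly once by the API designer. Every converter after that is just two functions and two class literals.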

Other examples

A more famous example is the JDK 8 Collector.of() constructor, without which it would be much more tedious to implement a collector. For example, if we want to find the second largest element in a stream… easy!

for (int i : Stream.of(1, 8, 3, 5, 6, 2, 4, 7)
                   .collect(Collector.of(
    () -> new int[] { Integer.MIN_VALUE, Integer.MIN_VALUE },
    (a, t) -> {
        if (a[0] < t) {
            a[1] = a[0];
            a[0] = t;
        }
        else if (a[1] < t)
            a[1] = t;
    },
    (a1, a2) -> {
        throw new UnsupportedOperationException(
            "Say no to parallel streams");
    }
)))
    System.out.println(i);

Run this, and you get:

8
7

Bonus exercise: Make the collector parallel capable by implementing the combiner correctly. In a sequential-only scenario, we don’t need it (until we do, of course…).
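
One possible take on the bonus exercise, as a sketch rather than the definitive solution: the combiner can simply feed the two best values of one partial result into the other, reusing the accumulator logic. The class name SecondLargest and its helper methods are hypothetical names introduced for this example:

```java
import java.util.Arrays;
import java.util.stream.Collector;
import java.util.stream.Stream;

// Hypothetical helper, for illustration only
final class SecondLargest {

    // Keeps the two largest values seen so far in a[0] >= a[1]
    static void accumulate(int[] a, int t) {
        if (a[0] < t) {
            a[1] = a[0];
            a[0] = t;
        }
        else if (a[1] < t)
            a[1] = t;
    }

    static Collector<Integer, int[], int[]> topTwo() {
        return Collector.of(
            () -> new int[] { Integer.MIN_VALUE, Integer.MIN_VALUE },
            SecondLargest::accumulate,

            // Merge two partial results by accumulating the second into the first
            (a1, a2) -> {
                accumulate(a1, a2[0]);
                accumulate(a1, a2[1]);
                return a1;
            }
        );
    }

    public static void main(String[] args) {
        int[] result = Stream.of(1, 8, 3, 5, 6, 2, 4, 7)
                             .parallel()
                             .collect(topTwo());
        System.out.println(Arrays.toString(result));
    }
}
```

Because Integer.MIN_VALUE is used as the "empty" marker, merging an untouched partial result is a no-op, so the combiner needs no special cases.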


The concrete examples are nice examples of API usage, but the key message is this:

If you have an interface of the form:

interface MyInterface {
    void myMethod1();
    String myMethod2();
    void myMethod3(String value);
    String myMethod4(String value);
}

Then, just add a convenience constructor to the interface, accepting Java 8 functional interfaces like this:

// You write this boring stuff
interface MyInterface {
    static MyInterface of(
        Runnable function1,
        Supplier<String> function2,
        Consumer<String> function3,
        Function<String, String> function4
    ) {
        return new MyInterface() {
            public void myMethod1() {
                function1.run();
            }

            public String myMethod2() {
                return function2.get();
            }

            public void myMethod3(String value) {
                function3.accept(value);
            }

            public String myMethod4(String value) {
                return function4.apply(value);
            }
        };
    }
}

As an API designer, you write this boilerplate only once. And your users can then easily write things like these:

// Your users write this awesome stuff
MyInterface.of(
    () -> { ... },
    () -> "hello",
    v -> { ... },
    v -> "world"
);

Easy! And your users will love you forever for this.

jOOQ Tuesdays: Mario Fusco Talks About Functional and Declarative Programming

Welcome to the jOOQ Tuesdays series. In this series, we’ll publish an article on the third Tuesday every other month where we interview someone we find exciting in our industry from a jOOQ perspective. This includes people who work with SQL, Java, Open Source, and a variety of other related topics.


Today, I’m very excited to feature Mario Fusco, author of Lambdaj, developer of Red Hat’s drools, Java Champion, and frequent speaker at Java conferences on all things functional programming.

Mario, I first stumbled upon your name a long time ago, when looking up the author of Lambdaj – a library that went to the extreme to bring lambdas to Java 5 or earlier. How does it work? And what’s the most peculiar hack you implemented to make it work?

When I started developing Lambdaj in 2007, I thought of it just as a proof of concept to check how far I could push Java 5. I never expected that it could become something that anybody other than myself might actually want to use. In reality, given the limited, or I should say non-existent, capabilities of Java 5 as a functional language, Lambdaj was entirely a big hack. Despite this, people started using and somewhat loving it, and this made me (and possibly somebody else) realize that Java developers, or at least part of them, were tired of the pure imperative paradigm imposed by the language and ready to experiment with something more functional.

The main feature of Lambdaj, and what made its DSL quite nice to use, was the possibility to reference the method of a class in a static and type safe way and pass it to another method. In this way you could for example sort a list of persons by their age doing something like:

sort(persons, on(Person.class).getAge());

As anticipated what happened under the hood was a big hack: the on() method created a proxy of the Person class so you could safely call the getAge() method on it. The proxy didn’t do anything useful other than registering the method call. However it had to return something of the same type of the value returned by the actual method to avoid a ClassCastException. To this purpose it had a mechanism to generate a reasonably unique instance of that type, an int in my example. Before returning that value it also associated it, using a WeakHashMap, to the invoked method. In this way the sort() method was actually invoked with a list and the value generated by my proxy. It then retrieved from the map the Java method associated with that value and invoked it on all the items of the list performing the operation, a sorting in this case, that it was supposed to execute.
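
The mechanism described above can be approximated in a few lines of modern Java. The following is only a rough sketch under simplifying assumptions, not Lambdaj’s actual code: it uses a JDK dynamic proxy (so Person has to be an interface here), it remembers the recorded method in a ThreadLocal instead of a WeakHashMap keyed by a unique placeholder value, and it only supports int getters. All names in it (MiniLambdaj, Person, on(), sort()) are hypothetical:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical miniature of the proxy trick, for illustration only
final class MiniLambdaj {

    // Hypothetical domain type; an interface, because JDK proxies require one
    interface Person {
        int getAge();
    }

    // Simplification: remember only the last recorded call per thread
    private static final ThreadLocal<Method> LAST_CALL = new ThreadLocal<>();

    // Creates a proxy that does nothing except record which method was called
    @SuppressWarnings("unchecked")
    static <T> T on(Class<T> type) {
        InvocationHandler recorder = (proxy, method, args) -> {
            LAST_CALL.set(method);
            return 0; // placeholder value; this sketch only supports int getters
        };
        return (T) Proxy.newProxyInstance(
            type.getClassLoader(), new Class<?>[] { type }, recorder);
    }

    // The placeholder itself is ignored; evaluating it has recorded the getter
    static <T> List<T> sort(List<T> items, Object placeholder) {
        Method getter = LAST_CALL.get();
        LAST_CALL.remove();
        List<T> sorted = new ArrayList<>(items);
        sorted.sort((a, b) -> call(getter, a).compareTo(call(getter, b)));
        return sorted;
    }

    // Invokes the recorded getter reflectively on a real item
    @SuppressWarnings("unchecked")
    private static Comparable<Object> call(Method getter, Object target) {
        try {
            return (Comparable<Object>) getter.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Sorts three persons by age through the recording proxy
    static List<Integer> demo() {
        List<Person> persons = Arrays.asList(() -> 42, () -> 7, () -> 23);
        return sort(persons, on(Person.class).getAge())
            .stream().map(Person::getAge).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [7, 23, 42]
    }
}
```

The call on(Person.class).getAge() is evaluated before sort() runs, which is what lets sort() find out which getter the caller meant.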

That’s crazy :) I’m sure you’re happy that a lot of Lambdaj features are now deprecated. You’re now touring the world with your functional programming talks. What makes you so excited about this topic?

The whole Lambdaj project is now deprecated and abandoned. The new functional features introduced with Java 8 just made it obsolete. Nevertheless it not only had the merit to make developers become curious and interested about functional programming, but also to experiment with new patterns and ideas that in the end also influenced the Java 8 syntax. Take for instance how you can sort a Stream of persons by age using a method reference


It seems evident that method references were at least inspired by Lambdaj’s on() method.
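
For comparison with the Lambdaj call, sorting by age with a Java 8 method reference could look like the following. This is a minimal sketch with a hypothetical Person class and helper names, not the snippet from the interview:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class, for illustration only
final class SortByAge {

    // Hypothetical Person type
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    // Person::getAge plays the role that on(Person.class).getAge() played in Lambdaj
    static List<String> namesByAge(List<Person> persons) {
        return persons.stream()
                      .sorted(Comparator.comparingInt(Person::getAge))
                      .map(Person::getName)
                      .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
            new Person("Ann", 42), new Person("Bob", 7), new Person("Cy", 23));
        System.out.println(namesByAge(persons)); // [Bob, Cy, Ann]
    }
}
```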

There is a number of things that I love of functional programming:

  1. The readability: a snippet of code written in functional style looks like a story while too often the equivalent code in imperative style resembles a puzzle.
  2. The declarative nature: in functional programming, it is enough to declare the result that you want to achieve rather than specifying the steps to obtain it. You only care about the what without getting lost in the details of the how.
  3. The possibility of treating data and behaviors uniformly: functional programming allows you to pass to a method both data (the list of persons to be sorted) and computation (the function to be applied to each person in the list). This idea is fundamental for many algorithms, for example map/reduce: since data and computation are the same thing and the second is typically orders of magnitude smaller, you are free to send the computation to the machine holding the data instead of the opposite.
  4. The higher level of abstraction: the possibility of encapsulating computations in functions and passing them around to other functions allows both a dramatic reduction of code duplication and the design of more generic and expressive APIs.
  5. Immutability and referential transparency: using immutable values and having side-effect-free programs makes it far easier to reason about your code, test it, and ensure its correctness.
  6. The parallelism friendliness: all the features listed above also enable the parallelization of your software in a simpler and more reliable way. It is no coincidence that functional programming started becoming more popular around 10 years ago, which is also when multicore CPUs began to be available on commodity hardware.

Our readers love SQL (or at least, they use it frequently). How does functional programming compare to SQL?

The most evident thing that FP and SQL have in common is their declarative paradigm. To some extent SQL, or at least the data selection part, can be seen as a functional language specialized to manipulate data in tabular format.

The data modification part is a totally different story though. The biggest part of SQL users normally change data in a destructive way, overwriting or even deleting the existing data. This is clearly in contrast with the immutability mantra of functional programming. However this is only how SQL is most commonly used, but nothing dictates that it couldn’t be also employed in a non-destructive append-only way. I wish to see SQL used more often in this way in future.

In your day job, you’re working for Red Hat, on drools. Business rules sound enterprisey. How does that get along with your fondness of functional programming?

From a user’s point of view, a rule engine in general, and drools in particular, is the extreme form of declarative programming, second only to Prolog. For this reason, developers who are only familiar with the imperative paradigm struggle to use it, because they try to force it to work in an imperative way. Conversely, programmers more used to thinking in functional (and thus declarative) terms are more often able to use it correctly when they approach it for the first time.

As for me, my work as a developer of both the core engine and the compiler of drools allows me to experiment every day in both language design and algorithmic invention and optimization. To cut it short, it’s a challenging job and there’s lots of fun in it: don’t tell this to my employer, but I cannot stop being surprised that they allow me to play with this every day and they also pay me for that.

You’re also on the board of VoxxedDays Ticino, Zurich, and CERN (wow, how geeky is that? A large hadron collider Java conference!). Why is Voxxed such a big success for you?

I must admit that, before being involved in this, I didn’t imagine the amount of work that organizing a conference requires. However this effort is totally rewarded. In particular the great advantage of VoxxedDays is the fact of being local 1-day events made by developers for developers that practically anybody can afford.

I remember that the most common feedback I received after the first VoxxedDays Ticino that we did 2 years ago was something like: “This has been the very first conference I attended in my life and I didn’t imagine it could have been a so amazing experience both under a technical and even more a social point of view. Thanks a lot for that, I eagerly wait to attend even next year”. Can you imagine something more rewarding for a conference organizer?

The other important thing for me is giving speakers who aren’t rock stars (yet) the possibility to talk in public and share their experience with a competent audience. I know that for at least some of them this is only the first step to let themselves and others discover their capabilities as public speakers and launch them toward bigger conferences like Devoxx.

Thank you very much, Mario

If you want to learn more about Mario’s insights on functional programming, please do visit his interesting talks at Devoxx from the recent past: