Imperative Loop or Functional Stream Pipeline? Beware of the Performance Impact!

I like weird, yet concise language constructs and API usages

Yes. I am guilty. Evil? Don’t know. But guilty. I heavily use and abuse the java.lang.Boolean type to implement three valued logic in Java:

  • Boolean.TRUE means true (duh)
  • Boolean.FALSE means false
  • null can mean anything like “unknown” or “uninitialised”, etc.

I know – a lot of enterprise developers will bikeshed and cargo cult the old saying:

Code is read more often than it is written

But as with everything, there is a tradeoff. For instance, in algorithm-heavy, micro optimised library code, it is usually more important to have code that really performs well, rather than code that apparently doesn’t need comments because the author has written it in such a clear and beautiful way.

I don’t think it matters much in the case of the boolean type (where I’m just too lazy to encode every three valued situation in an enum). But here’s a more interesting example from that same twitter thread. The code is simple:

woot:
if (something) {
  for (Object o : list) 
    if (something(o))
      break woot;

  throw new E();
}

Yes. You can break out of “labeled ifs”. Because in Java, any statement can be labeled, and if the statement is a compound statement (observe the curly braces following the if), then it may make sense to break out of it. Even if you’ve never seen that idiom, I think it’s quite immediately clear what it does.

Ghasp!

If Java were a bit more classic, it might have supported this syntax:

if (something) {
  for (Object o : list) 
    if (something(o))
      goto woot;

  throw new E();
}
woot:

Nicolai suggested that the main reason I hadn’t written the following, equivalent, and arguably more elegant logic, is because jOOQ still supports Java 6:

if (something && list.stream().noneMatch(this::something))
  throw new E();

It’s more concise! So, it’s better, right? Everything new is always better.

A third option would have been the less concise solution that essentially just replaces break by return:

if (something && noneMatchSomething(list)
  throw new E();

// And then:
private boolean noneMatchSomething(List<?> list) {
  for (Object o : list)
    if (something(o))
      return false;
  return true;
}

There’s an otherwise useless method that has been extracted. The main benefit is that people are not used to breaking out of labeled statements (other than loops, and even then it’s rare), so this is again about some subjective “readability”. I personally find this particular example less readable, because the extracted method is no longer local. I have to jump around in the class and interrupt my train of thoughts. But of course, YMMV with respect to the two imperative alternatives.

Back to objectivity: Performance

When I tweet about Java these days, I’m mostly tweeting about my experience writing jOOQ. A library. A library that has been tuned so much over the past years, that the big client side bottleneck (apart from the obvious database call) is the internal StringBuilder that is used to generate dynamic SQL. And compared to most database queries, you will not even notice that.

But sometimes you do. E.g. if you’re using an in-memory H2 database and run some rather trivial queries, then jOOQ’s overhead can become measurable again. Yes. There are some use-cases, which I do want to take seriously as well, where the difference between an imperative loop and a stream pipeline is measurable.

In the above examples, let’s remove the throw statement and replace it by something simpler (because exceptions have their own significant overhead).

I’ve created this JMH benchmark, which compares the 3 approaches:

  • Imperative with break
  • Imperative with return
  • Stream

Here’s the benchmark

package org.jooq.test.benchmark;

import java.util.ArrayList;
import java.util.List;

import org.openjdk.jmh.annotations.*;

@Fork(value = 3, jvmArgsAppend = "-Djmh.stack.lines=3")
@Warmup(iterations = 5, time = 3)
@Measurement(iterations = 7, time = 3)
public class ImperativeVsStream {

    @State(Scope.Benchmark)
    public static class BenchmarkState {

        boolean something = true;

        @Param({ "2", "8" })
        int listSize;

        List<Integer> list = new ArrayList<>();

        boolean something() {
            return something;
        }

        boolean something(Integer o) {
            return o > 2;
        }

        @Setup(Level.Trial)
        public void setup() throws Exception {
            for (int i = 0; i < listSize; i++)
                list.add(i);
        }

        @TearDown(Level.Trial)
        public void teardown() throws Exception {
            list = null;
        }
    }

    @Benchmark
    public Object testImperativeWithBreak(BenchmarkState state) {
        woot:
        if (state.something()) {
            for (Integer o : state.list)
                if (state.something(o))
                    break woot;

            return 1;
        }

        return 0;
    }

    @Benchmark
    public Object testImperativeWithReturn(BenchmarkState state) {
        if (state.something() && woot(state))
            return 1;

        return 0;
    }

    private boolean woot(BenchmarkState state) {
        for (Integer o : state.list)
            if (state.something(o))
                return false;

        return true;
    }

    @Benchmark
    public Object testStreamNoneMatch(BenchmarkState state) {
        if (state.something() && state.list.stream().noneMatch(state::something))
            return 1;

        return 0;
    }

    @Benchmark
    public Object testStreamAnyMatch(BenchmarkState state) {
        if (state.something() && !state.list.stream().anyMatch(state::something))
            return 1;

        return 0;
    }

    @Benchmark
    public Object testStreamAllMatch(BenchmarkState state) {
        if (state.something() && state.list.stream().allMatch(s -> !state.something(s)))
            return 1;

        return 0;
    }
}

The results are pretty clear:

Benchmark                                    (listSize)   Mode  Cnt         Score          Error  Units
ImperativeVsStream.testImperativeWithBreak            2  thrpt   14  86513288.062 ± 11950020.875  ops/s
ImperativeVsStream.testImperativeWithBreak            8  thrpt   14  74147172.906 ± 10089521.354  ops/s
ImperativeVsStream.testImperativeWithReturn           2  thrpt   14  97740974.281 ± 14593214.683  ops/s
ImperativeVsStream.testImperativeWithReturn           8  thrpt   14  81457864.875 ±  7376337.062  ops/s
ImperativeVsStream.testStreamAllMatch                 2  thrpt   14  14924513.929 ±  5446744.593  ops/s
ImperativeVsStream.testStreamAllMatch                 8  thrpt   14  12325486.891 ±  1365682.871  ops/s
ImperativeVsStream.testStreamAnyMatch                 2  thrpt   14  15729363.399 ±  2295020.470  ops/s
ImperativeVsStream.testStreamAnyMatch                 8  thrpt   14  13696297.091 ±   829121.255  ops/s
ImperativeVsStream.testStreamNoneMatch                2  thrpt   14  18991796.562 ±   147748.129  ops/s
ImperativeVsStream.testStreamNoneMatch                8  thrpt   14  15131005.381 ±   389830.419  ops/s

With this simple example, break or return don’t matter. At some point, adding additional methods might start getting in the way of inlining (because of stacks getting too deep), but not creating additional methods might be getting in the way of inlining as well (because of method bodies getting too large). I don’t want to bet on either approach here at this level, nor is jOOQ tuned that much. Like most similar libraries, the traversal of the jOOQ expression tree generates stack that are too deep to completely inline anyway.

But the very obvious loser here is the Stream approach, which is roughly 6.5x slower in this benchmark than the imperative approaches. This isn’t surprising. The stream pipeline has to be set up every single time to represent something as trivial as the above imperative loop. I’ve already blogged about this in the past, where I compared replacing simple for loops by Stream.forEach()

Meh, does it matter?

In your business logic? Probably not. Your business logic is I/O bound, mostly because of the database. Wasting a few CPU cycles on a client side loop is not the main issue. Even if it is, the waste probably happens because your loop shouldn’t even be at the client side in the first place, but moved into the database as well. I’m currently touring conferences with a call about that topic:

In your infrastructure logic? Maybe! If you’re writing a library, or if you’re using a library like jOOQ, then yes. Chances are that a lot of your logic is CPU bound. You should occasionally profile your application and spot such bottlenecks, both in your code and in third party libraries. E.g. in most of jOOQ’s internals, using a stream pipeline might be a very bad choice, because ultimately, jOOQ is something that might be invoked from within your loops, thus adding significant overhead to your application, if your queries are not heavy (e.g. again when run against an H2 in-memory database).

So, given that you’re clearly “micro-losing” on the performance side by using the Stream API, you may need to evaluate the readability tradeoff more carefully. When business logic is complex, readability is very important compared to micro optimisations. With infrastructure logic, it is much less likely so, in my opinion. And I’m not alone:

Note: there’s that other cargo cult of premature optimisation going around. Yes, you shouldn’t worry about these details too early in your application implementation. But you should still know when to worry about them, and be aware of the tradeoffs.

And while you’re still debating what name to give to that extracted method, I’ve written 5 new labeled if statements! ;-)

Writing Custom Aggregate Functions in SQL Just Like a Java 8 Stream Collector

All SQL databases support the standard aggregate functions COUNT(), SUM(), AVG(), MIN(), MAX().

Some databases support other aggregate functions, like:

  • EVERY()
  • STDDEV_POP()
  • STDDEV_SAMP()
  • VAR_POP()
  • VAR_SAMP()
  • ARRAY_AGG()
  • STRING_AGG()

But what if you want to roll your own?

Java 8 Stream Collector

When using Java 8 streams, we can easily roll our own aggregate function (i.e. a Collector). Let’s assume we want to find the second highest value in a stream. The highest value can be obtained like this:

System.out.println(
    Stream.of(1, 2, 3, 4)
          .collect(Collectors.maxBy(Integer::compareTo))
) ;

Yielding:

Optional[4]

Now, what about the second highest value? We can write the following collector:

System.out.println(
    Stream.of(1, 6, 2, 3, 4, 4, 5).parallel()
          .collect(Collector.of(
              () -> new int[] { 
                  Integer.MIN_VALUE, 
                  Integer.MIN_VALUE 
              },
              (a, i) -> {
                  if (a[0] < i) {
                      a[1] = a[0];
                      a[0] = i;
                  }
                  else if (a[1] < i)
                      a[1] = i;
              },
              (a1, a2) -> {
                  if (a2[0] > a1[0]) {
                      a1[1] = a1[0];
                      a1[0] = a2[0];

                      if (a2[1] > a1[1])
                          a1[1] = a2[1];
                  }
                  else if (a2[0] > a1[1])
                      a1[1] = a2[0];

                  return a1;
              },
              a -> a[1]
          ))
) ;

It doesn’t do anything fancy. It has these 4 functions:

  • Supplier<int[]>: A supplier that provides an intermediary int[] of length 2, initialised with Integer.MIN_VALUE, each. This array will remember the MAX() value in the stream at position 0 and the SECOND_MAX() value in the stream at position 1
  • BiConsumer<int[], Integer>: A accumulator that accumulates new values from the stream into our intermediary data structure.
  • BinaryOperator<int[]>: A combiner that combines two intermediary data structures. This is used for parallel streams only.
  • Function<int[], Integer>: The finisher function that extracts the SECOND_MAX() function from the second position in our intermediary array.

The output is now:

5

How to do the same thing with SQL?

Many SQL databases offer a very similar way of calculating custom aggregate functions. Here’s how to do the exact same thing with…

Oracle:

With the usual syntactic ceremony…

CREATE TYPE u_second_max AS OBJECT (

  -- Intermediary data structure
  MAX NUMBER,
  SECMAX NUMBER,

  -- Corresponds to the Collector.supplier() function
  STATIC FUNCTION ODCIAggregateInitialize(sctx IN OUT u_second_max) RETURN NUMBER,

  -- Corresponds to the Collector.accumulate() function
  MEMBER FUNCTION ODCIAggregateIterate(self IN OUT u_second_max, value IN NUMBER) RETURN NUMBER,

  -- Corresponds to the Collector.combineer() function
  MEMBER FUNCTION ODCIAggregateMerge(self IN OUT u_second_max, ctx2 IN u_second_max) RETURN NUMBER,

  -- Correspodns to the Collector.finisher() function
  MEMBER FUNCTION ODCIAggregateTerminate(self IN u_second_max, returnValue OUT NUMBER, flags IN NUMBER) RETURN NUMBER
)
/

-- This is our "colletor" implementation
CREATE OR REPLACE TYPE BODY u_second_max IS
  STATIC FUNCTION ODCIAggregateInitialize(sctx IN OUT u_second_max)
  RETURN NUMBER IS
  BEGIN
    SCTX := U_SECOND_MAX(0, 0);
    RETURN ODCIConst.Success;
  END;

  MEMBER FUNCTION ODCIAggregateIterate(self IN OUT u_second_max, value IN NUMBER) RETURN NUMBER IS
  BEGIN
    IF VALUE > SELF.MAX THEN
      SELF.SECMAX := SELF.MAX;
      SELF.MAX := VALUE;
    ELSIF VALUE > SELF.SECMAX THEN
      SELF.SECMAX := VALUE;
    END IF;
    RETURN ODCIConst.Success;
  END;

  MEMBER FUNCTION ODCIAggregateTerminate(self IN u_second_max, returnValue OUT NUMBER, flags IN NUMBER) RETURN NUMBER IS
  BEGIN
    RETURNVALUE := SELF.SECMAX;
    RETURN ODCIConst.Success;
  END;

  MEMBER FUNCTION ODCIAggregateMerge(self IN OUT u_second_max, ctx2 IN u_second_max) RETURN NUMBER IS
  BEGIN
    IF CTX2.MAX > SELF.MAX THEN
      SELF.SECMAX := SELF.MAX;
      SELF.MAX := CTX2.MAX;
    
      IF CTX2.SECMAX > SELF.SECMAX THEN
        SELF.SECMAX := CTX2.SECMAX;
      END IF;
    ELSIF CTX2.MAX > SELF.SECMAX THEN
      SELF.SECMAX := CTX2.MAX;
    END IF;
  
    RETURN ODCIConst.Success;
  END;
END;
/

-- Finally, we have to give this aggregate function a name
CREATE FUNCTION SECOND_MAX (input NUMBER) RETURN NUMBER
PARALLEL_ENABLE AGGREGATE USING u_second_max;
/

We can now run the above on the Sakila database:

SELECT 
  max(film_id), 
  second_max(film_id) 
FROM film;

To get:

MAX     SECOND_MAX
------------------
1000    999

And what’s even better, we can use the aggregate function as a window function for free!

SELECT 
  film_id,
  length,
  max(film_id) OVER (PARTITION BY length), 
  second_max(film_id) OVER (PARTITION BY length)
FROM film
ORDER BY length, film_id;

The above yields:

FILM_ID  LENGTH  MAX   SECOND_MAX
---------------------------------
15       46      730   505
469      46      730   505
504      46      730   505
505      46      730   505
730      46      730   505
237      47      869   784
247      47      869   784
393      47      869   784
398      47      869   784
407      47      869   784
784      47      869   784
869      47      869   784
2        48      931   866
410      48      931   866
575      48      931   866
630      48      931   866
634      48      931   866
657      48      931   866
670      48      931   866
753      48      931   866
845      48      931   866
866      48      931   866
931      48      931   866

Beautiful, right?

PostgreSQL

PostgreSQL supports a slightly more concise syntax in the CREATE AGGREGATE statement. If we don’t allow for parallelism, we can write this minimal implementation:

CREATE FUNCTION second_max_sfunc (
  state INTEGER[], data INTEGER
) RETURNS INTEGER[] AS
$$
BEGIN
  IF state IS NULL THEN
    RETURN ARRAY[data, NULL];
  ELSE
    RETURN CASE 
      WHEN state[1] > data
      THEN CASE 
        WHEN state[2] > data
        THEN state
        ELSE ARRAY[state[1], data]
      END
      ELSE ARRAY[data, state[1]]
    END;
  END IF;
END;
$$ LANGUAGE plpgsql;
/

CREATE FUNCTION second_max_ffunc (
  state INTEGER[]
) RETURNS INTEGER AS
$$
BEGIN
  RETURN state[2];
END;
$$ LANGUAGE plpgsql;

CREATE AGGREGATE second_max (INTEGER) (
  SFUNC     = second_max_sfunc,
  STYPE     = INTEGER[],
  FINALFUNC = second_max_ffunc
);

Here, we use the STYPE (Collector.supplier()), the SFUNC (Collector.accumulator()), and the FINALFUNC (Collector.finisher()) specifications.

Other databases

Many other databases allow for specifying user defined aggregate functions. Look up your database manual’s details to learn more. They always work in the same way as a Java 8 Collector.

How to Compile a Class at Runtime with Java 8 and 9

In some cases, it’s really useful to be able to compile a class at runtime using the java.compiler module. You can e.g. load a Java source file from the database, compile it on the fly, and execute its code as if it were part of your application.

In the upcoming jOOR 0.9.8, this will be made possible through https://github.com/jOOQ/jOOR/issues/51. As always with jOOR (and our other projects), we’re wrapping existing JDK API, simplifying the little details that you often don’t want to worry about. Using jOOR API, you can now write:

// Run this code from within the com.example package

Supplier<String> supplier = Reflect.compile(
    "com.example.CompileTest",
    "package com.example;\n" +
    "class CompileTest\n" +
    "implements java.util.function.Supplier<String> {\n" +
    "  public String get() {\n" +
    "    return \"Hello World!\";\n" +
    "  }\n" +
    "}\n"
).create().get();

System.out.println(supplier.get());

And the result is, of course:

Hello World!

If we already had JEP-326, this would be even cooler!

Supplier<String> supplier = Reflect.compile(
    `org.joor.test.CompileTest`,
    `package org.joor.test;
     class CompileTest
     implements java.util.function.Supplier<String> {
       public String get() {
         return "Hello World!"
       }
     }`
).create().get();

System.out.println(supplier.get());

What happens behind the scenes?

Again, as in our previous blog post, we need to ship two different versions of our code. One that works in Java 8 (where reflecting and accessing JDK internal API was possible), and one that works in Java 9+ (where this is forbidden). The full annotated API is here:

package org.joor;

import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodHandles.Lookup;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

import javax.tools.*;

import static java.lang.StackWalker.Option.RETAIN_CLASS_REFERENCE;

class Compile {

    static Class<?> compile(String className, String content) 
    throws Exception {
        Lookup lookup = MethodHandles.lookup();

        // If we have already compiled our class, simply load it
        try {
            return lookup.lookupClass()
                         .getClassLoader()
                         .loadClass(className);
        }

        // Otherwise, let's try to compile it
        catch (ClassNotFoundException ignore) {
            return compile0(className, content, lookup);
        }
    }

    static Class<?> compile0(
        String className, String content, Lookup lookup)
    throws Exception {
        JavaCompiler compiler = 
            ToolProvider.getSystemJavaCompiler();

        ClassFileManager manager = new ClassFileManager(
            compiler.getStandardFileManager(null, null, null));

        List<CharSequenceJavaFileObject> files = new ArrayList<>();
        files.add(new CharSequenceJavaFileObject(
            className, content));

        compiler.getTask(null, manager, null, null, null, files)
                .call();
        Class<?> result = null;

        // Implement a check whether we're on JDK 8. If so, use
        // protected ClassLoader API, reflectively
        if (onJava8()) {
            ClassLoader cl = lookup.lookupClass().getClassLoader();
            byte[] b = manager.o.getBytes();
            result = Reflect.on(cl).call("defineClass", 
                className, b, 0, b.length).get();
        }

        // Lookup.defineClass() has only been introduced in Java 9.
        // It is required to get private-access to interfaces in
        // the class hierarchy
        else {

            // This method is called by client code from two levels
            // up the current stack frame. We need a private-access
            // lookup from the class in that stack frame in order
            // to get private-access to any local interfaces at
            // that location.
            Class<?> caller = StackWalker
                .getInstance(RETAIN_CLASS_REFERENCE)
                .walk(s -> s
                    .skip(2)
                    .findFirst()
                    .get()
                    .getDeclaringClass());

            // If the compiled class is in the same package as the
            // caller class, then we can use the private-access 
            // Lookup of the caller class
            if (className.startsWith(caller.getPackageName() )) {
                result = MethodHandles
                    .privateLookupIn(caller, lookup)
                    .defineClass(fileManager.o.getBytes());
            }

            // Otherwise, use an arbitrary class loader. This
            // approach doesn't allow for loading private-access 
            // interfaces in the compiled class's type hierarchy
            else {
                result = new ClassLoader() {
                    @Override
                    protected Class<?> findClass(String name) 
                    throws ClassNotFoundException {
                        byte[] b = fileManager.o.getBytes();
                        int len = b.length;
                        return defineClass(className, b, 0, len);
                    }
                }.loadClass(className);
            }
        }

        return result;
    }

    // These are some utility classes needed for the JavaCompiler
    // ----------------------------------------------------------

    static final class JavaFileObject 
    extends SimpleJavaFileObject {
        final ByteArrayOutputStream os = 
            new ByteArrayOutputStream();

        JavaFileObject(String name, JavaFileObject.Kind kind) {
            super(URI.create(
                "string:///" 
              + name.replace('.', '/') 
              + kind.extension), 
                kind);
        }

        byte[] getBytes() {
            return os.toByteArray();
        }

        @Override
        public OutputStream openOutputStream() {
            return os;
        }
    }

    static final class ClassFileManager 
    extends ForwardingJavaFileManager<StandardJavaFileManager> {
        JavaFileObject o;

        ClassFileManager(StandardJavaFileManager m) {
            super(m);
        }

        @Override
        public JavaFileObject getJavaFileForOutput(
            JavaFileManager.Location location,
            String className,
            JavaFileObject.Kind kind,
            FileObject sibling
        ) {
            return o = new JavaFileObject(className, kind);
        }
    }

    static final class CharSequenceJavaFileObject 
    extends SimpleJavaFileObject {
        final CharSequence content;

        public CharSequenceJavaFileObject(
            String className, 
            CharSequence content
        ) {
            super(URI.create(
                "string:///" 
              + className.replace('.', '/') 
              + JavaFileObject.Kind.SOURCE.extension), 
                JavaFileObject.Kind.SOURCE);
            this.content = content;
        }

        @Override
        public CharSequence getCharContent(
            boolean ignoreEncodingErrors
        ) {
            return content;
        }
    }
}

Notice how the JDK 9 version is a bit more complicated, as we have to:

  • Find the caller class of our method
  • Get a private method handle lookup for that class if the class being compiled is in the same package as the class calling the compilation
  • Otherwise, use an arbitrary class loader to define the class

Reflection definitely hasn’t become simpler with Java 9!

Correct Reflective Access to Interface Default Methods in Java 8, 9, 10

When performing reflective access to default methods in Java, Google seems to fail us. The solutions presented on Stack Overflow, for instance, seem to work only in a certain set of cases, and not on all Java versions.

This article will illustrate different approaches to calling interface default methods through reflection, as may be required by a proxy, for instance.

TL;DR If you’re impatient, all the access methods exposed in this blog are available in this gist, and the problem is also fixed in our library jOOR.

Proxying interfaces with default methods

The useful java.lang.reflect.Proxy API has been around for a while. We can do cool things like:

import java.lang.reflect.Proxy;

public class ProxyDemo {
    interface Duck {
        void quack();
    }

    public static void main(String[] a) {
        Duck duck = (Duck) Proxy.newProxyInstance(
            Thread.currentThread().getContextClassLoader(),
            new Class[] { Duck.class },
            (proxy, method, args) -> {
                System.out.println("Quack");
                return null;
            }
        );

        duck.quack();
    }
}

This just yields:

Quack

In this example, we create a proxy instance that implements the Duck API through an InvocationHandler, which is essentially just a lambda that gets called for each method call on Duck.

The interesting bit is when we want to have a default method on Duck and delegate the call to that default method:

interface Duck {
    default void quack() {
        System.out.println("Quack");
    }
}

We might be inclined to write this:

import java.lang.reflect.Proxy;

public class ProxyDemo {
    interface Duck {
        default void quack() {
            System.out.println("Quack");
        }
    }

    public static void main(String[] a) {
        Duck duck = (Duck) Proxy.newProxyInstance(
            Thread.currentThread().getContextClassLoader(),
            new Class[] { Duck.class },
            (proxy, method, args) -> {
                method.invoke(proxy);
                return null;
            }
        );

        duck.quack();
    }
}

But this will just generate a long long stack trace of nested exceptions (this isn’t specific to the method being a default method. You simply cannot do this):

Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
	at $Proxy0.quack(Unknown Source)
	at ProxyDemo.main(ProxyDemo.java:20)
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at ProxyDemo.lambda$0(ProxyDemo.java:15)
	... 2 more
Caused by: java.lang.reflect.UndeclaredThrowableException
	at $Proxy0.quack(Unknown Source)
	... 7 more
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at ProxyDemo.lambda$0(ProxyDemo.java:15)
	... 8 more
Caused by: java.lang.reflect.UndeclaredThrowableException
	at $Proxy0.quack(Unknown Source)
	... 13 more
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at ProxyDemo.lambda$0(ProxyDemo.java:15)
	... 14 more
Caused by: java.lang.reflect.UndeclaredThrowableException
	at $Proxy0.quack(Unknown Source)
	... 19 more
...
...
... goes on forever

Not very helpful.

Using method handles

So, the original Google search turned up results that indicate we need to use the MethodHandles API. Let’s try that, then!

import java.lang.invoke.MethodHandles;
import java.lang.reflect.Proxy;

public class ProxyDemo {
    interface Duck {
        default void quack() {
            System.out.println("Quack");
        }
    }

    public static void main(String[] a) {
        Duck duck = (Duck) Proxy.newProxyInstance(
            Thread.currentThread().getContextClassLoader(),
            new Class[] { Duck.class },
            (proxy, method, args) -> {
                MethodHandles
                    .lookup()
                    .in(Duck.class)
                    .unreflectSpecial(method, Duck.class)
                    .bindTo(proxy)
                    .invokeWithArguments();
                return null;
            }
        );

        duck.quack();
    }
}

That seems to work, cool!

Quack

… until it doesn’t.

Calling a default method on a non-private-accessible interface

The interface in the above example was carefully chosen to be “private-accessible” by the caller, i.e. the interface is nested in the caller’s class. What if we had a top-level interface?

import java.lang.invoke.MethodHandles;
import java.lang.reflect.Proxy;

interface Duck {
    default void quack() {
        System.out.println("Quack");
    }
}

public class ProxyDemo {
    public static void main(String[] a) {
        Duck duck = (Duck) Proxy.newProxyInstance(
            Thread.currentThread().getContextClassLoader(),
            new Class[] { Duck.class },
            (proxy, method, args) -> {
                MethodHandles
                    .lookup()
                    .in(Duck.class)
                    .unreflectSpecial(method, Duck.class)
                    .bindTo(proxy)
                    .invokeWithArguments();
                return null;
            }
        );

        duck.quack();
    }
}

The almost same code snippet no longer works. We get the following IllegalAccessException:

Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
	at $Proxy0.quack(Unknown Source)
	at ProxyDemo.main(ProxyDemo.java:26)
Caused by: java.lang.IllegalAccessException: no private access for invokespecial: interface Duck, from Duck/package
	at java.lang.invoke.MemberName.makeAccessException(MemberName.java:850)
	at java.lang.invoke.MethodHandles$Lookup.checkSpecialCaller(MethodHandles.java:1572)
	at java.lang.invoke.MethodHandles$Lookup.unreflectSpecial(MethodHandles.java:1231)
	at ProxyDemo.lambda$0(ProxyDemo.java:19)
	... 2 more

Bummer. When googling further, we might find the following solution, which accesses MethodHandles.Lookup‘s internals through reflection:

import java.lang.invoke.MethodHandles.Lookup;
import java.lang.reflect.Constructor;
import java.lang.reflect.Proxy;

interface Duck {
    default void quack() {
        System.out.println("Quack");
    }
}

public class ProxyDemo {
    public static void main(String[] a) {
        Duck duck = (Duck) Proxy.newProxyInstance(
            Thread.currentThread().getContextClassLoader(),
            new Class[] { Duck.class },
            (proxy, method, args) -> {
                Constructor<Lookup> constructor = Lookup.class
                    .getDeclaredConstructor(Class.class);
                constructor.setAccessible(true);
                constructor.newInstance(Duck.class)
                    .in(Duck.class)
                    .unreflectSpecial(method, Duck.class)
                    .bindTo(proxy)
                    .invokeWithArguments();
                return null;
            }
        );

        duck.quack();
    }
}

And yay, we get:

Quack

We get that on JDK 8. What about JDK 9 or 10?

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by ProxyDemo (file:/C:/Users/lukas/workspace/playground/target/classes/) to constructor java.lang.invoke.MethodHandles$Lookup(java.lang.Class)
WARNING: Please consider reporting this to the maintainers of ProxyDemo
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Quack

Oops. That’s what happens by default. If we run the program with the --illegal-access=deny flag:

java --illegal-access=deny ProxyDemo

Then, we’re getting (and rightfully so):

Exception in thread "main" java.lang.reflect.InaccessibleObjectException: Unable to make java.lang.invoke.MethodHandles$Lookup(java.lang.Class) accessible: module java.base does not "opens java.lang.invoke" to unnamed module @357246de
        at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:337)
        at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:281)
        at java.base/java.lang.reflect.Constructor.checkCanSetAccessible(Constructor.java:192)
        at java.base/java.lang.reflect.Constructor.setAccessible(Constructor.java:185)
        at ProxyDemo.lambda$0(ProxyDemo.java:18)
        at $Proxy0.quack(Unknown Source)
        at ProxyDemo.main(ProxyDemo.java:28)

One of the Jigsaw project’s goals is to precisely not allow such hacks to persist. So, what’s a better solution? This?

import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Proxy;

interface Duck {
    default void quack() {
        System.out.println("Quack");
    }
}

public class ProxyDemo {
    public static void main(String[] a) {
        Duck duck = (Duck) Proxy.newProxyInstance(
            Thread.currentThread().getContextClassLoader(),
            new Class[] { Duck.class },
            (proxy, method, args) -> {
                MethodHandles.lookup()
                    .findSpecial( 
                         Duck.class, 
                         "quack",  
                         MethodType.methodType( 
                             void.class, 
                             new Class[0]),  
                         Duck.class)
                    .bindTo(proxy)
                    .invokeWithArguments();
                return null;
            }
        );

        duck.quack();
    }
}
Quack

Great, it works in Java 9 and 10, what about Java 8?

Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
	at $Proxy0.quack(Unknown Source)
	at ProxyDemo.main(ProxyDemo.java:25)
Caused by: java.lang.IllegalAccessException: no private access for invokespecial: interface Duck, from ProxyDemo
	at java.lang.invoke.MemberName.makeAccessException(MemberName.java:850)
	at java.lang.invoke.MethodHandles$Lookup.checkSpecialCaller(MethodHandles.java:1572)
	at java.lang.invoke.MethodHandles$Lookup.findSpecial(MethodHandles.java:1002)
	at ProxyDemo.lambda$0(ProxyDemo.java:18)
	... 2 more

You’re kidding, right?

So, there’s a solution (hack) that works on Java 8 but not on 9 or 10, and there’s a solution that works on Java 9 and 10, but not on Java 8.

A more thorough examination

So far, I’ve just been trying to run different things on different JDKs. The following class tries all combinations. It’s also available in this gist here.

Compile it with JDK 9 or 10 (because it also tries using JDK 9+ API: MethodHandles.privateLookupIn()), but compile it using this command, so you can also run the class on JDK 8:

javac -source 1.8 -target 1.8 CallDefaultMethodThroughReflection.java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodHandles.Lookup;
import java.lang.invoke.MethodType;
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;


interface PrivateInaccessible {
    default void quack() {
        System.out.println(" -> PrivateInaccessible.quack()");
    }
}

public class CallDefaultMethodThroughReflection {
    interface PrivateAccessible {
        default void quack() {
            System.out.println(" -> PrivateAccessible.quack()");
        }
    }

    public static void main(String[] args) {
        System.out.println("PrivateAccessible");
        System.out.println("-----------------");
        System.out.println();
        proxy(PrivateAccessible.class).quack();

        System.out.println();
        System.out.println("PrivateInaccessible");
        System.out.println("-------------------");
        System.out.println();
        proxy(PrivateInaccessible.class).quack();
    }

    private static void quack(Lookup lookup, Class<?> type, Object proxy) {
        System.out.println("Lookup.in(type).unreflectSpecial(...)");

        try {
            lookup.in(type)
                  .unreflectSpecial(type.getMethod("quack"), type)
                  .bindTo(proxy)
                  .invokeWithArguments();
        }
        catch (Throwable e) {
            System.out.println(" -> " + e.getClass() + ": " + e.getMessage());
        }

        System.out.println("Lookup.findSpecial(...)");
        try {
            lookup.findSpecial(type, "quack", MethodType.methodType(void.class, new Class[0]), type)
                  .bindTo(proxy)
                  .invokeWithArguments();
        }
        catch (Throwable e) {
            System.out.println(" -> " + e.getClass() + ": " + e.getMessage());
        }
    }

    @SuppressWarnings("unchecked")
    private static <T> T proxy(Class<T> type) {
        return (T) Proxy.newProxyInstance(
            Thread.currentThread().getContextClassLoader(),
            new Class[] { type },
            (Object proxy, Method method, Object[] arguments) -> {
                System.out.println("MethodHandles.lookup()");
                quack(MethodHandles.lookup(), type, proxy);

                try {
                    System.out.println();
                    System.out.println("Lookup(Class)");
                    Constructor<Lookup> constructor = Lookup.class.getDeclaredConstructor(Class.class);
                    constructor.setAccessible(true);
                    constructor.newInstance(type);
                    quack(constructor.newInstance(type), type, proxy);
                }
                catch (Exception e) {
                    System.out.println(" -> " + e.getClass() + ": " + e.getMessage());
                }

                try {
                    System.out.println();
                    System.out.println("MethodHandles.privateLookupIn()");
                    quack(MethodHandles.privateLookupIn(type, MethodHandles.lookup()), type, proxy);
                }
                catch (Error e) {
                    System.out.println(" -> " + e.getClass() + ": " + e.getMessage());
                }

                return null;
            }
        );
    }
}

The output of the above program is:

Java 8

$ java -version
java version "1.8.0_141"
Java(TM) SE Runtime Environment (build 1.8.0_141-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.141-b15, mixed mode)

$ java CallDefaultMethodThroughReflection
PrivateAccessible
-----------------

MethodHandles.lookup()
Lookup.in(type).unreflectSpecial(...)
 -> PrivateAccessible.quack()
Lookup.findSpecial(...)
 -> class java.lang.IllegalAccessException: no private access for invokespecial: interface CallDefaultMethodThroughReflection$PrivateAccessible, from CallDefaultMethodThroughReflection

Lookup(Class)
Lookup.in(type).unreflectSpecial(...)
 -> PrivateAccessible.quack()
Lookup.findSpecial(...)
 -> PrivateAccessible.quack()

MethodHandles.privateLookupIn()
 -> class java.lang.NoSuchMethodError: java.lang.invoke.MethodHandles.privateLookupIn(Ljava/lang/Class;Ljava/lang/invoke/MethodHandles$Lookup;)Ljava/lang/invoke/MethodHandles$Lookup;

PrivateInaccessible
-------------------

MethodHandles.lookup()
Lookup.in(type).unreflectSpecial(...)
 -> class java.lang.IllegalAccessException: no private access for invokespecial: interface PrivateInaccessible, from PrivateInaccessible/package
Lookup.findSpecial(...)
 -> class java.lang.IllegalAccessException: no private access for invokespecial: interface PrivateInaccessible, from CallDefaultMethodThroughReflection

Lookup(Class)
Lookup.in(type).unreflectSpecial(...)
 -> PrivateInaccessible.quack()
Lookup.findSpecial(...)
 -> PrivateInaccessible.quack()

MethodHandles.privateLookupIn()
 -> class java.lang.NoSuchMethodError: java.lang.invoke.MethodHandles.privateLookupIn(Ljava/lang/Class;Ljava/lang/invoke/MethodHandles$Lookup;)Ljava/lang/invoke/MethodHandles$Lookup;

Java 9

$ java -version
java version "9.0.4"
Java(TM) SE Runtime Environment (build 9.0.4+11)
Java HotSpot(TM) 64-Bit Server VM (build 9.0.4+11, mixed mode)

$ java --illegal-access=deny CallDefaultMethodThroughReflection
PrivateAccessible
-----------------

MethodHandles.lookup()
Lookup.in(type).unreflectSpecial(...)
 -> PrivateAccessible.quack()
Lookup.findSpecial(...)
 -> PrivateAccessible.quack()

Lookup(Class)
 -> class java.lang.reflect.InaccessibleObjectException: Unable to make java.lang.invoke.MethodHandles$Lookup(java.lang.Class) accessible: module java.base does not "opens java.lang.invoke" to unnamed module @30c7da1e

MethodHandles.privateLookupIn()
Lookup.in(type).unreflectSpecial(...)
 -> PrivateAccessible.quack()
Lookup.findSpecial(...)
 -> PrivateAccessible.quack()

PrivateInaccessible
-------------------

MethodHandles.lookup()
Lookup.in(type).unreflectSpecial(...)
 -> class java.lang.IllegalAccessException: no private access for invokespecial: interface PrivateInaccessible, from PrivateInaccessible/package (unnamed module @30c7da1e)
Lookup.findSpecial(...)
 -> PrivateInaccessible.quack()

Lookup(Class)
 -> class java.lang.reflect.InaccessibleObjectException: Unable to make java.lang.invoke.MethodHandles$Lookup(java.lang.Class) accessible: module java.base does not "opens java.lang.invoke" to unnamed module @30c7da1e

MethodHandles.privateLookupIn()
Lookup.in(type).unreflectSpecial(...)
 -> PrivateInaccessible.quack()
Lookup.findSpecial(...)
 -> PrivateInaccessible.quack()

Java 10

$ java -version
java version "10" 2018-03-20
Java(TM) SE Runtime Environment 18.3 (build 10+46)
Java HotSpot(TM) 64-Bit Server VM 18.3 (build 10+46, mixed mode)

$ java --illegal-access=deny CallDefaultMethodThroughReflection
... same result as in Java 9

Conclusion

Getting this right is a bit tricky.

  • In Java 8, the best working approach is the hack that opens up the JDK’s internals by accessing a package-private Lookup constructor. This is the only way to consistently call default methods on both private-accessible and private-inaccessible interfaces from any location.
  • In Java 9 and 10, the best working approaches are Lookup.findSpecial() (didn’t work in Java 8) or the new MethodHandles.privateLookupIn() (didn’t exist in in Java 8). The latter is required in case the interfaced is located in another module. That module will still need to open the interface’s package to the caller.

It’s fair to say that this is a bit of a mess. The appropriate meme here is:

According to Rafael Winterhalter (author of ByteBuddy), the “real” fix should go into a revised Proxy API:

I’m not sure if that would solve all the problems, but it should definitely be the case that an implementor shouldn’t worry about all of the above.

Also, clearly, this article didn’t do the complete work, e.g. of testing whether the approaches still work if Duck is imported from another module:

… which will be a topic of another blog post.

Using jOOR

If you’re using jOOR (our reflection library, check it out here), the upcoming version 0.9.8 will include a fix for this:
https://github.com/jOOQ/jOOR/issues/49

The fix simply uses the unsafe reflection approach in Java 8, or the MethodHandles.privateLookupIn() approach in Java 9+. You can then write:

Reflect.on(new Object()).as(PrivateAccessible.class).quack();
Reflect.on(new Object()).as(PrivateInaccessible.class).quack();

Map Reducing a Set of Values Into a Dynamic SQL UNION Query

Sounds fancy, right? But it’s a really nice and reasonable approach to doing dynamic SQL with jOOQ.

This blog post is inspired by a Stack Overflow question, where a user wanted to turn a set of values into a dynamic UNION query like this:

SELECT T.COL1
FROM T
WHERE T.COL2 = 'V1'
UNION
SELECT T.COL1
FROM T
WHERE T.COL2 = 'V2'
...
UNION
SELECT T.COL1
FROM T
WHERE T.COL2 = 'VN'

Note, both the Stack Overflow user and I are well aware of the possibility of using IN predicates :-), let’s just assume for the sake of argument, that the UNION query indeed outperforms the IN predicate in the user’s particular MySQL version and database. If this cannot be accepted, just imagine a more complex use case.

The solution in Java is really very simple:

import static org.jooq.impl.DSL.*;
import java.util.*;
import org.jooq.*;

public class Unions {
    public static void main(String[] args) {
        List<String> list = Arrays.asList("V1", "V2", "V3", "V4");

        System.out.println(
            list.stream()
                .map(Unions::query)
                .reduce(Select::union));
    }

    // Dynamically construct a query from an input string
    private static Select<Record1<String>> query(String s) {
        return select(T.COL1).from(T).where(T.COL2.eq(s));
    }
}

The output is:

Optional[(
  select T.COL1
  from T
  where T.COL2 = 'V1'
)
union (
  select T.COL1
  from T
  where T.COL2 = 'V2'
)
union (
  select T.COL1
  from T
  where T.COL2 = 'V3'
)
union (
  select T.COL1
  from T
  where T.COL2 = 'V4'
)]

If you’re using JDK 9+ (which has Optional.stream()), you can further proceed to running the query fluently as follows:

List<String> list = Arrays.asList("V1", "V2", "V3", "V4");

try (Stream<Record1<String>> stream = list.stream()
    .map(Unions::query)
    .reduce(Select::union))
    .stream() // Optional.stream()!
    .flatMap(Select::fetchStream)) {
    ...
}

This way, if the list is empty, reduce will return an empty optional. Streaming that empty optional will result in not fetching any results from the database.

Benchmarking JDK String.replace() vs Apache Commons StringUtils.replace()

What’s better? Using the JDK’s String.replace() or something like Apache Commons Lang’s Apache Commons Lang’s StringUtils.replace()?

In this article, I’ll compare the two, first in a profiling session using Java Mission Control (JMC), then in a benchmark using JMH, and we’ll see that Java 9 heavily improved things in this area.

Profiling using JMC

In a recent profiling session where I checked for any “obvious” bottlenecks in jOOQ, I’ve discovered this nasty regular expression pattern instantiation:

Tons of int[] instances were allocated by a regular expression pattern. That’s weird, because in general, inside of jOOQ’s internals, special care is always taken to pre-compile any regular expressions that are needed in static members, e.g.:

private static final Pattern TYPE_NAME_PATTERN = 
  Pattern.compile("\\([^\\)]*\\)");

This allows for using the Pattern in a far more optimal way, than e.g. by using String.replaceAll():

// Much better, pattern is pre-compiled
TYPE_NAME_PATTERN.matcher(castTypeName).replaceAll("")

// Much worse, pattern is compiled *every time*
castTypeName.replaceAll("\\([^\\)]*\\)", "")

That should be clear to everyone. The price to pay for this is the fact that the pattern is stored “far away” in some static member, rather than being visible right where it is used, which is a bit less readable. At least in my opinion.

SIDENOTE: People tend to get all angry about premature optimisation and such. Yes, these optimisations are micro optimisations and aren’t always worth the trouble. But this article is about jOOQ, a library that does a lot of expression tree transformations, and it is important for jOOQ to eliminate even 1% “bottlenecks”, as they make a difference. So, please read this article in this context.

Consider also our previous post about this subject: Top 10 Easy Performance Optimisations in Java

What was the problem in jOOQ?

Now, what appears to be obvious when using regular expressions seems less obvious when using ordinary, constant string replacements, such as when calling String.replace(CharSequence), as was done in the linked jOOQ issue #6672. The relevant piece of code was escaping all inline strings that are sent to the SQL database, to prevent syntax errors and, of course, SQL injection:

static final String escape(Object val, Context<?> context) {
    String result = val.toString();

    if (needsBackslashEscaping(context.configuration()))
        result = result.replace("\\", "\\\\");

    return result.replace("'", "''");
}

We’re always escaping apostrophes by doubling them, and in some databases (e.g. MySQL), we often have to escape backslashes as well (unfortunately, not all ORMs seem to do this or even be aware of this MySQL “feature”).

Unfortunately as well, despite heavy use of Apache Commons Lang’s StringUtils.replace() in jOOQ’s internals, every now and then a String.replace(CharSequence) sneaks in, because it’s just so convenient to write.

Meh, does it matter?

Usually, in ordinary business logic, it shouldn’t (again – don’t optimise prematurely), but in jOOQ, which is essentially a SQL string manipulation library, it can get quite costly if a single replace call is done excessively (for good reasons, of course), and it is slower than it should be. And it is, prior to Java 9, when this method was optimised. I’ve done the profiling with Java 8, where internally, String.replace() uses a literal regex pattern (i.e. a pattern with a “literal” flag that is faster, but it is a pattern, nonetheless).

Not only does the method appear as a major offender in the GC allocation view, it also triggers quite some action in the “hot methods” view of JMC:

Those are quite a few Pattern methods. The percentages have to be understood in the context of a benchmark, running millions of queries against an H2 in-memory database, so the overhead is significant!

Using Apache Commons Lang’s StringUtils

A simple fix is to use Apache Commons Lang’s StringUtils instead:

static final String escape(Object val, Context<?> context) {
    String result = val.toString();

    if (needsBackslashEscaping(context.configuration()))
        result = StringUtils.replace(result, "\\", "\\\\");

    return StringUtils.replace(result, "'", "''");
}

Now, the pressure has changed significantly. The int[] allocation is barely noticeable in comparison:

And much fewer Pattern calls are made, overall.

Benchmarking using JMH

Profiling can be very useful to spot bottlenecks, but it needs to be read with care. It introduces some artefacts and slight overheads and it is not 100% accurate when sampling call stacks, which might lead the wrong conclusions at times. This is why it is sometimes important to back claims by running an actual benchmark. And when benchmarking, please, don’t just loop 1 million times in a main() method. That will be very very inaccurate, except for very obvious, order-of-magnitude scale differences.

I’m using JMH here, running the following simple benchmark:

package org.jooq.test.benchmark;

import org.apache.commons.lang3.StringUtils;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.infra.Blackhole;

@Fork(value = 3, jvmArgsAppend = "-Djmh.stack.lines=3")
@Warmup(iterations = 5)
@Measurement(iterations = 7)
public class StringReplaceBenchmark {

    private static final String SHORT_STRING_NO_MATCH = "abc";
    private static final String SHORT_STRING_ONE_MATCH = "a'bc";
    private static final String SHORT_STRING_SEVERAL_MATCHES = "'a'b'c'";
    private static final String LONG_STRING_NO_MATCH = 
      "abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc";
    private static final String LONG_STRING_ONE_MATCH = 
      "abcabcabcabcabcabcabcabcabcabcabca'bcabcabcabcabcabcabcabcabcabcabcabcabc";
    private static final String LONG_STRING_SEVERAL_MATCHES = 
      "abcabca'bcabcabcabcabcabc'abcabcabca'bcabcabcabcabcabca'bcabcabcabcabcabcabc";

    @Benchmark
    public void testStringReplaceShortStringNoMatch(Blackhole blackhole) {
        blackhole.consume(SHORT_STRING_NO_MATCH.replace("'", "''"));
    }

    @Benchmark
    public void testStringReplaceLongStringNoMatch(Blackhole blackhole) {
        blackhole.consume(LONG_STRING_NO_MATCH.replace("'", "''"));
    }

    @Benchmark
    public void testStringReplaceShortStringOneMatch(Blackhole blackhole) {
        blackhole.consume(SHORT_STRING_ONE_MATCH.replace("'", "''"));
    }

    @Benchmark
    public void testStringReplaceLongStringOneMatch(Blackhole blackhole) {
        blackhole.consume(LONG_STRING_ONE_MATCH.replace("'", "''"));
    }

    @Benchmark
    public void testStringReplaceShortStringSeveralMatches(Blackhole blackhole) {
        blackhole.consume(SHORT_STRING_SEVERAL_MATCHES.replace("'", "''"));
    }

    @Benchmark
    public void testStringReplaceLongStringSeveralMatches(Blackhole blackhole) {
        blackhole.consume(LONG_STRING_SEVERAL_MATCHES.replace("'", "''"));
    }

    @Benchmark
    public void testStringUtilsReplaceShortStringNoMatch(Blackhole blackhole) {
        blackhole.consume(StringUtils.replace(SHORT_STRING_NO_MATCH, "'", "''"));
    }

    @Benchmark
    public void testStringUtilsReplaceLongStringNoMatch(Blackhole blackhole) {
        blackhole.consume(StringUtils.replace(LONG_STRING_NO_MATCH, "'", "''"));
    }

    @Benchmark
    public void testStringUtilsReplaceShortStringOneMatch(Blackhole blackhole) {
        blackhole.consume(StringUtils.replace(SHORT_STRING_ONE_MATCH, "'", "''"));
    }

    @Benchmark
    public void testStringUtilsReplaceLongStringOneMatch(Blackhole blackhole) {
        blackhole.consume(StringUtils.replace(LONG_STRING_ONE_MATCH, "'", "''"));
    }

    @Benchmark
    public void testStringUtilsReplaceShortStringSeveralMatches(Blackhole blackhole) {
        blackhole.consume(StringUtils.replace(SHORT_STRING_SEVERAL_MATCHES, "'", "''"));
    }

    @Benchmark
    public void testStringUtilsReplaceLongStringSeveralMatches(Blackhole blackhole) {
        blackhole.consume(StringUtils.replace(LONG_STRING_SEVERAL_MATCHES, "'", "''"));
    }
}

Notice that I tried to run 2 x 3 different string replacement scenarios:

  • The string is “short”
  • The string is “long”

Cross joining (there, finally some SQL in this post!) the above with:

  • No match is found
  • One match is found
  • Several matches are found

That’s important because different optimisations can be implemented for those different cases, and probably, in jOOQ’s case, there is mostly no match in this particular case.

I ran this benchmark once on Java 8:

$ java -version
java version "1.8.0_141"
Java(TM) SE Runtime Environment (build 1.8.0_141-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.141-b15, mixed mode)

And on Java 9:

$ java -version
java version "9"
Java(TM) SE Runtime Environment (build 9+181)
Java HotSpot(TM) 64-Bit Server VM (build 9+181, mixed mode)

As Tagir Valeev was kind enough to remind me that this issue was supposed to be fixed in Java 9:

The results are:

Java 8

testStringReplaceLongStringNoMatch               thrpt   21    4809343.940 ▒  66443.628  ops/s
testStringUtilsReplaceLongStringNoMatch          thrpt   21   25063493.793 ▒ 660657.256  ops/s

testStringReplaceLongStringOneMatch              thrpt   21    1406989.855 ▒  43051.008  ops/s
testStringUtilsReplaceLongStringOneMatch         thrpt   21    6961669.111 ▒ 141504.827  ops/s

testStringReplaceLongStringSeveralMatches        thrpt   21    1103323.491 ▒  17047.449  ops/s
testStringUtilsReplaceLongStringSeveralMatches   thrpt   21    3899108.777 ▒  41854.636  ops/s

testStringReplaceShortStringNoMatch              thrpt   21    5936992.874 ▒  68115.030  ops/s
testStringUtilsReplaceShortStringNoMatch         thrpt   21  171660973.829 ▒ 377711.864  ops/s

testStringReplaceShortStringOneMatch             thrpt   21    3267435.957 ▒ 240198.763  ops/s
testStringUtilsReplaceShortStringOneMatch        thrpt   21    9943846.428 ▒ 270821.641  ops/s

testStringReplaceShortStringSeveralMatches       thrpt   21    2313713.015 ▒  28806.738  ops/s
testStringUtilsReplaceShortStringSeveralMatches  thrpt   21    5447065.933 ▒ 139525.472  ops/s

As can be seen, the difference is “catastrophic”. Apache Commons Lang’s StringUtils drastically outpeforms the JDK’s String.replace() in every discipline, especially when no match is found in a short string! That’s because the library optimises for this particular case:

...
int end = searchText.indexOf(searchString, start);
if (end == INDEX_NOT_FOUND) {
    return text;
}

Java 9

Things look a bit differently for Java 9:

testStringReplaceLongStringNoMatch               thrpt   21   55528132.674 ▒  479721.812  ops/s
testStringUtilsReplaceLongStringNoMatch          thrpt   21   55767541.806 ▒  754862.755  ops/s

testStringReplaceLongStringOneMatch              thrpt   21    4806322.839 ▒  217538.714  ops/s
testStringUtilsReplaceLongStringOneMatch         thrpt   21    8366539.616 ▒  142757.888  ops/s

testStringReplaceLongStringSeveralMatches        thrpt   21    2685134.029 ▒   78108.171  ops/s
testStringUtilsReplaceLongStringSeveralMatches   thrpt   21    3923819.576 ▒  351103.020  ops/s

testStringReplaceShortStringNoMatch              thrpt   21  122398496.629 ▒ 1350086.256  ops/s
testStringUtilsReplaceShortStringNoMatch         thrpt   21  121139633.453 ▒ 2756892.669  ops/s

testStringReplaceShortStringOneMatch             thrpt   21   18070522.151 ▒  498663.835  ops/s
testStringUtilsReplaceShortStringOneMatch        thrpt   21   11367395.622 ▒  153377.552  ops/s

testStringReplaceShortStringSeveralMatches       thrpt   21    7548407.681 ▒  168950.209  ops/s
testStringUtilsReplaceShortStringSeveralMatches  thrpt   21    5045065.948 ▒  175251.545  ops/s

Java 9’s implementation is now similar to that of Apache Commons, with the same optimisation for non-matches:

public String replace(CharSequence target, CharSequence replacement) {
    String tgtStr = target.toString();
    String replStr = replacement.toString();
    int j = indexOf(tgtStr);
    if (j < 0) {
        return this;
    }
    ...

It is still quite slower for matches in long strings, but faster for matches in short strings. The tradeoff for jOOQ will be to still prefer Apache Commons because:

  • Most people are still on Java 8 or less, currently
  • Most replacements won’t match and both implementations fare equally well for that in Java 9, but Apache Commons is much faster for this category in Java 8
  • If there’s a match and thus a replacement, the speed depends on the string length, where the faster implementation is currently undecided

Conclusion

This micro optimisation stuff matters in jOOQ because jOOQ is a library that does a lot of SQL string manipulation. Every allocation and every CPU cycle that is wasted when manipulating SQL strings slows down the library, and thus impacts all of its users. In a situation like this, it is definitely worth considering not using these useful JDK String methods, and opting for the much faster Apache Commons implementations instead.

Things have improved a lot in Java 9, in case of which this can mostly be ignored. But if you still need to support Java 8 (we still support Java 6 in our commercial distributions!), then this has to be considered.

Are Java 8 Streams Truly Lazy? Not Completely!

In a recent article, I’ve shown that programmers should always apply a filter first, map later strategy with streams. The example I made there was this one:

hugeCollection
    .stream()
    .limit(2)
    .map(e -> superExpensiveMapping(e))
    .collect(Collectors.toList());

In this case, the limit() operation implements the filtering, which should take place before the mapping.

Several readers correctly mentioned that in this case, it doesn’t matter what order we’re putting the limit() and map() operations, because most operations are evaluated lazily in the Java 8 Stream API.

Or rather: The collect() terminal operation pulls values from the stream lazily, and as the limit(5) operation reaches the end, it will no longer produce new values, regardless whether map() came before or after. This can be proven easily as follows:

import java.util.stream.Stream;

public class LazyStream {
    public static void main(String[] args) {
        Stream.iterate(0, i -> i + 1)
              .map(i -> i + 1)
              .peek(i -> System.out.println("Map: " + i))
              .limit(5)
              .forEach(i -> {});

        System.out.println();
        System.out.println();

        Stream.iterate(0, i -> i + 1)
              .limit(5)
              .map(i -> i + 1)
              .peek(i -> System.out.println("Map: " + i))
              .forEach(i -> {});
    }
}

The output of the above is:

Map: 1
Map: 2
Map: 3
Map: 4
Map: 5


Map: 1
Map: 2
Map: 3
Map: 4
Map: 5

But this isn’t always the case!

This optimisation is an implementation detail, and in general, it is not unwise to really apply the filter first, map later rule thoroughly, not relying on such an optimisation. In particular, the Java 8 implementation of flatMap() is not lazy. Consider the following logic, where we put a flatMap() operation in the middle of the stream:

import java.util.stream.Stream;

public class LazyStream {
    public static void main(String[] args) {
        Stream.iterate(0, i -> i + 1)
              .flatMap(i -> Stream.of(i, i, i, i))
              .map(i -> i + 1)
              .peek(i -> System.out.println("Map: " + i))
              .limit(5)
              .forEach(i -> {});

        System.out.println();
        System.out.println();

        Stream.iterate(0, i -> i + 1)
              .flatMap(i -> Stream.of(i, i, i, i))
              .limit(5)
              .map(i -> i + 1)
              .peek(i -> System.out.println("Map: " + i))
              .forEach(i -> {});
    }
}

The result is now:

Map: 1
Map: 1
Map: 1
Map: 1
Map: 2
Map: 2
Map: 2
Map: 2


Map: 1
Map: 1
Map: 1
Map: 1
Map: 2

So, the first Stream pipeline will map all the 8 flatmapped values prior to applying the limit, whereas the second Stream pipeline really limits the stream to 5 elements first, and then maps only those.

The reason for this is in the flatMap() implementation:

// In ReferencePipeline.flatMap()
try (Stream<? extends R> result = mapper.apply(u)) {
    if (result != null)
        result.sequential().forEach(downstream);
}

As you can see, the result of the flatMap() operation is consumed eagerly with a terminal forEach() operation, which will always produce all the four values in our case and send them to the next operation. So, flatMap() isn’t lazy, and thus the next operation after it will get all of its results. This is true for Java 8. Future Java versions might improve this, of course.

We better filter them first. And map later.

Update: flatMap() gets fixed in JDK 10

Thanks, Tagir Valeev, for pointing out that there’s a fix coming up:

Relevant links:

https://bugs.openjdk.java.net/browse/JDK-8075939
http://hg.openjdk.java.net/jdk/jdk10/rev/fca88bbbafb9