How to Compile a Class at Runtime with Java 8 and 9

In some cases, it’s really useful to be able to compile a class at runtime using the java.compiler module. You can e.g. load a Java source file from the database, compile it on the fly, and execute its code as if it were part of your application.

In the upcoming jOOR 0.9.8, this will be made possible through https://github.com/jOOQ/jOOR/issues/51. As always with jOOR (and our other projects), we’re wrapping existing JDK API, simplifying the little details that you often don’t want to worry about. Using jOOR API, you can now write:

// Run this code from within the com.example package

Supplier<String> supplier = Reflect.compile(
    "com.example.CompileTest",
    "package com.example;\n" +
    "class CompileTest\n" +
    "implements java.util.function.Supplier<String> {\n" +
    "  public String get() {\n" +
    "    return \"Hello World!\";\n" +
    "  }\n" +
    "}\n"
).create().get();

System.out.println(supplier.get());

And the result is, of course:

Hello World!

If we already had JEP 326 (raw string literals), this would be even cooler!

Supplier<String> supplier = Reflect.compile(
    `org.joor.test.CompileTest`,
    `package org.joor.test;
     class CompileTest
     implements java.util.function.Supplier<String> {
       public String get() {
         return "Hello World!";
       }
     }`
).create().get();

System.out.println(supplier.get());

What happens behind the scenes?

Again, as in our previous blog post, we need to ship two different versions of our code. One that works in Java 8 (where reflecting and accessing JDK internal API was possible), and one that works in Java 9+ (where this is forbidden). The full annotated API is here:

package org.joor;

import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodHandles.Lookup;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

import javax.tools.*;

import static java.lang.StackWalker.Option.RETAIN_CLASS_REFERENCE;

class Compile {

    static Class<?> compile(String className, String content) 
    throws Exception {
        Lookup lookup = MethodHandles.lookup();

        // If we have already compiled our class, simply load it
        try {
            return lookup.lookupClass()
                         .getClassLoader()
                         .loadClass(className);
        }

        // Otherwise, let's try to compile it
        catch (ClassNotFoundException ignore) {
            return compile0(className, content, lookup);
        }
    }

    static Class<?> compile0(
        String className, String content, Lookup lookup)
    throws Exception {
        JavaCompiler compiler = 
            ToolProvider.getSystemJavaCompiler();

        ClassFileManager manager = new ClassFileManager(
            compiler.getStandardFileManager(null, null, null));

        List<CharSequenceJavaFileObject> files = new ArrayList<>();
        files.add(new CharSequenceJavaFileObject(
            className, content));

        compiler.getTask(null, manager, null, null, null, files)
                .call();
        Class<?> result = null;

        // Implement a check whether we're on JDK 8. If so, use
        // protected ClassLoader API, reflectively
        if (onJava8()) {
            ClassLoader cl = lookup.lookupClass().getClassLoader();
            byte[] b = manager.o.getBytes();
            result = Reflect.on(cl).call("defineClass", 
                className, b, 0, b.length).get();
        }

        // Lookup.defineClass() has only been introduced in Java 9.
        // It is required to get private-access to interfaces in
        // the class hierarchy
        else {

            // This method is called by client code from two levels
            // up the current stack frame. We need a private-access
            // lookup from the class in that stack frame in order
            // to get private-access to any local interfaces at
            // that location.
            Class<?> caller = StackWalker
                .getInstance(RETAIN_CLASS_REFERENCE)
                .walk(s -> s
                    .skip(2)
                    .findFirst()
                    .get()
                    .getDeclaringClass());

            // If the compiled class is in the same package as the
            // caller class, then we can use the private-access 
            // Lookup of the caller class
            if (className.startsWith(caller.getPackageName())) {
                result = MethodHandles
                    .privateLookupIn(caller, lookup)
                    .defineClass(manager.o.getBytes());
            }

            // Otherwise, use an arbitrary class loader. This
            // approach doesn't allow for loading private-access 
            // interfaces in the compiled class's type hierarchy
            else {
                result = new ClassLoader() {
                    @Override
                    protected Class<?> findClass(String name) 
                    throws ClassNotFoundException {
                        byte[] b = manager.o.getBytes();
                        int len = b.length;
                        return defineClass(className, b, 0, len);
                    }
                }.loadClass(className);
            }
        }

        return result;
    }

    // These are some utility classes needed for the JavaCompiler
    // ----------------------------------------------------------

    static final class JavaFileObject 
    extends SimpleJavaFileObject {
        final ByteArrayOutputStream os = 
            new ByteArrayOutputStream();

        JavaFileObject(String name, JavaFileObject.Kind kind) {
            super(URI.create(
                "string:///" 
              + name.replace('.', '/') 
              + kind.extension), 
                kind);
        }

        byte[] getBytes() {
            return os.toByteArray();
        }

        @Override
        public OutputStream openOutputStream() {
            return os;
        }
    }

    static final class ClassFileManager 
    extends ForwardingJavaFileManager<StandardJavaFileManager> {
        JavaFileObject o;

        ClassFileManager(StandardJavaFileManager m) {
            super(m);
        }

        @Override
        public JavaFileObject getJavaFileForOutput(
            JavaFileManager.Location location,
            String className,
            JavaFileObject.Kind kind,
            FileObject sibling
        ) {
            return o = new JavaFileObject(className, kind);
        }
    }

    static final class CharSequenceJavaFileObject 
    extends SimpleJavaFileObject {
        final CharSequence content;

        public CharSequenceJavaFileObject(
            String className, 
            CharSequence content
        ) {
            super(URI.create(
                "string:///" 
              + className.replace('.', '/') 
              + JavaFileObject.Kind.SOURCE.extension), 
                JavaFileObject.Kind.SOURCE);
            this.content = content;
        }

        @Override
        public CharSequence getCharContent(
            boolean ignoreEncodingErrors
        ) {
            return content;
        }
    }
}

Notice how the JDK 9 version is a bit more complicated, as we have to:

  • Find the caller class of our method
  • Get a private method handle lookup for that class if the class being compiled is in the same package as the class calling the compilation
  • Otherwise, use an arbitrary class loader to define the class

Reflection definitely hasn’t become simpler with Java 9!
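
The listing above calls onJava8() without showing it. Here is a minimal sketch of such a check, assuming the java.specification.version system property is a good enough signal (it reads "1.8" on Java 8 and "9", "10", and so on afterwards). The class name JavaVersionSketch is made up for this example, and this is not necessarily how jOOR implements the check:

```java
final class JavaVersionSketch {

    // Hypothetical helper: true if the given specification
    // version string denotes Java 8
    static boolean onJava8(String specificationVersion) {
        return "1.8".equals(specificationVersion);
    }

    // Applies the same check to the JVM we're currently running on
    static boolean onJava8() {
        return onJava8(
            System.getProperty("java.specification.version"));
    }
}
```

Keeping the string comparison in its own overload makes the check easy to try out without having several JDKs installed.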

Correct Reflective Access to Interface Default Methods in Java 8, 9, 10

When performing reflective access to default methods in Java, Google seems to fail us. The solutions presented on Stack Overflow, for instance, seem to work only in a certain set of cases, and not on all Java versions.

This article will illustrate different approaches to calling interface default methods through reflection, as may be required by a proxy, for instance.

TL;DR If you’re impatient, all the access methods exposed in this blog are available in this gist, and the problem is also fixed in our library jOOR.

Proxying interfaces with default methods

The useful java.lang.reflect.Proxy API has been around for a while. We can do cool things like:

import java.lang.reflect.Proxy;

public class ProxyDemo {
    interface Duck {
        void quack();
    }

    public static void main(String[] a) {
        Duck duck = (Duck) Proxy.newProxyInstance(
            Thread.currentThread().getContextClassLoader(),
            new Class[] { Duck.class },
            (proxy, method, args) -> {
                System.out.println("Quack");
                return null;
            }
        );

        duck.quack();
    }
}

This just yields:

Quack

In this example, we create a proxy instance that implements the Duck API through an InvocationHandler, which is essentially just a lambda that gets called for each method call on Duck.

The interesting bit is when we want to have a default method on Duck and delegate the call to that default method:

interface Duck {
    default void quack() {
        System.out.println("Quack");
    }
}

We might be inclined to write this:

import java.lang.reflect.Proxy;

public class ProxyDemo {
    interface Duck {
        default void quack() {
            System.out.println("Quack");
        }
    }

    public static void main(String[] a) {
        Duck duck = (Duck) Proxy.newProxyInstance(
            Thread.currentThread().getContextClassLoader(),
            new Class[] { Duck.class },
            (proxy, method, args) -> {
                method.invoke(proxy);
                return null;
            }
        );

        duck.quack();
    }
}

But this will just generate a very long stack trace of nested exceptions. This isn’t specific to the method being a default method; you simply cannot do this, because invoking the method on the proxy calls the handler again:

Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
	at $Proxy0.quack(Unknown Source)
	at ProxyDemo.main(ProxyDemo.java:20)
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at ProxyDemo.lambda$0(ProxyDemo.java:15)
	... 2 more
Caused by: java.lang.reflect.UndeclaredThrowableException
	at $Proxy0.quack(Unknown Source)
	... 7 more
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at ProxyDemo.lambda$0(ProxyDemo.java:15)
	... 8 more
Caused by: java.lang.reflect.UndeclaredThrowableException
	at $Proxy0.quack(Unknown Source)
	... 13 more
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at ProxyDemo.lambda$0(ProxyDemo.java:15)
	... 14 more
Caused by: java.lang.reflect.UndeclaredThrowableException
	at $Proxy0.quack(Unknown Source)
	... 19 more
...
...
... goes on forever

Not very helpful.
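
To see where the nesting comes from, it helps to cap the recursion. The following sketch (the class ProxyRecursionSketch and the method reentries() are made up for this example) counts how often the handler is entered. Every method.invoke(proxy) goes straight back through the proxy into the same handler, so the default method body itself is never reached:

```java
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicInteger;

final class ProxyRecursionSketch {
    interface Duck {
        default void quack() {
            System.out.println("Quack");
        }
    }

    // Returns how many times the handler was entered for a
    // single call to quack(), stopping at the given limit
    static int reentries(int limit) {
        AtomicInteger depth = new AtomicInteger();

        Duck duck = (Duck) Proxy.newProxyInstance(
            Duck.class.getClassLoader(),
            new Class<?>[] { Duck.class },
            (proxy, method, args) -> {

                // Without this guard, the call below would
                // recurse until the stack overflows
                if (depth.incrementAndGet() < limit)
                    method.invoke(proxy);

                return null;
            }
        );

        duck.quack();
        return depth.get();
    }
}
```

Calling reentries(5) enters the handler five times and never prints "Quack", which is the same loop that produces the nested exceptions above, just cut short.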

Using method handles

So, the original Google search turned up results that indicate we need to use the MethodHandles API. Let’s try that, then!

import java.lang.invoke.MethodHandles;
import java.lang.reflect.Proxy;

public class ProxyDemo {
    interface Duck {
        default void quack() {
            System.out.println("Quack");
        }
    }

    public static void main(String[] a) {
        Duck duck = (Duck) Proxy.newProxyInstance(
            Thread.currentThread().getContextClassLoader(),
            new Class[] { Duck.class },
            (proxy, method, args) -> {
                MethodHandles
                    .lookup()
                    .in(Duck.class)
                    .unreflectSpecial(method, Duck.class)
                    .bindTo(proxy)
                    .invokeWithArguments();
                return null;
            }
        );

        duck.quack();
    }
}

That seems to work, cool!

Quack

… until it doesn’t.

Calling a default method on a non-private-accessible interface

The interface in the above example was carefully chosen to be “private-accessible” by the caller, i.e. the interface is nested in the caller’s class. What if we had a top-level interface?

import java.lang.invoke.MethodHandles;
import java.lang.reflect.Proxy;

interface Duck {
    default void quack() {
        System.out.println("Quack");
    }
}

public class ProxyDemo {
    public static void main(String[] a) {
        Duck duck = (Duck) Proxy.newProxyInstance(
            Thread.currentThread().getContextClassLoader(),
            new Class[] { Duck.class },
            (proxy, method, args) -> {
                MethodHandles
                    .lookup()
                    .in(Duck.class)
                    .unreflectSpecial(method, Duck.class)
                    .bindTo(proxy)
                    .invokeWithArguments();
                return null;
            }
        );

        duck.quack();
    }
}

Almost the same code snippet no longer works. We get the following IllegalAccessException:

Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
	at $Proxy0.quack(Unknown Source)
	at ProxyDemo.main(ProxyDemo.java:26)
Caused by: java.lang.IllegalAccessException: no private access for invokespecial: interface Duck, from Duck/package
	at java.lang.invoke.MemberName.makeAccessException(MemberName.java:850)
	at java.lang.invoke.MethodHandles$Lookup.checkSpecialCaller(MethodHandles.java:1572)
	at java.lang.invoke.MethodHandles$Lookup.unreflectSpecial(MethodHandles.java:1231)
	at ProxyDemo.lambda$0(ProxyDemo.java:19)
	... 2 more

Bummer. When googling further, we might find the following solution, which accesses MethodHandles.Lookup’s internals through reflection:

import java.lang.invoke.MethodHandles.Lookup;
import java.lang.reflect.Constructor;
import java.lang.reflect.Proxy;

interface Duck {
    default void quack() {
        System.out.println("Quack");
    }
}

public class ProxyDemo {
    public static void main(String[] a) {
        Duck duck = (Duck) Proxy.newProxyInstance(
            Thread.currentThread().getContextClassLoader(),
            new Class[] { Duck.class },
            (proxy, method, args) -> {
                Constructor<Lookup> constructor = Lookup.class
                    .getDeclaredConstructor(Class.class);
                constructor.setAccessible(true);
                constructor.newInstance(Duck.class)
                    .in(Duck.class)
                    .unreflectSpecial(method, Duck.class)
                    .bindTo(proxy)
                    .invokeWithArguments();
                return null;
            }
        );

        duck.quack();
    }
}

And yay, we get:

Quack

We get that on JDK 8. What about JDK 9 or 10?

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by ProxyDemo (file:/C:/Users/lukas/workspace/playground/target/classes/) to constructor java.lang.invoke.MethodHandles$Lookup(java.lang.Class)
WARNING: Please consider reporting this to the maintainers of ProxyDemo
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Quack

Oops. That’s what happens by default. If we run the program with the --illegal-access=deny flag:

java --illegal-access=deny ProxyDemo

Then, we’re getting (and rightfully so):

Exception in thread "main" java.lang.reflect.InaccessibleObjectException: Unable to make java.lang.invoke.MethodHandles$Lookup(java.lang.Class) accessible: module java.base does not "opens java.lang.invoke" to unnamed module @357246de
        at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:337)
        at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:281)
        at java.base/java.lang.reflect.Constructor.checkCanSetAccessible(Constructor.java:192)
        at java.base/java.lang.reflect.Constructor.setAccessible(Constructor.java:185)
        at ProxyDemo.lambda$0(ProxyDemo.java:18)
        at $Proxy0.quack(Unknown Source)
        at ProxyDemo.main(ProxyDemo.java:28)

One of the Jigsaw project’s goals is to precisely not allow such hacks to persist. So, what’s a better solution? This?

import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Proxy;

interface Duck {
    default void quack() {
        System.out.println("Quack");
    }
}

public class ProxyDemo {
    public static void main(String[] a) {
        Duck duck = (Duck) Proxy.newProxyInstance(
            Thread.currentThread().getContextClassLoader(),
            new Class[] { Duck.class },
            (proxy, method, args) -> {
                MethodHandles.lookup()
                    .findSpecial( 
                         Duck.class, 
                         "quack",  
                         MethodType.methodType( 
                             void.class, 
                             new Class[0]),  
                         Duck.class)
                    .bindTo(proxy)
                    .invokeWithArguments();
                return null;
            }
        );

        duck.quack();
    }
}

Quack

Great, it works in Java 9 and 10, what about Java 8?

Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
	at $Proxy0.quack(Unknown Source)
	at ProxyDemo.main(ProxyDemo.java:25)
Caused by: java.lang.IllegalAccessException: no private access for invokespecial: interface Duck, from ProxyDemo
	at java.lang.invoke.MemberName.makeAccessException(MemberName.java:850)
	at java.lang.invoke.MethodHandles$Lookup.checkSpecialCaller(MethodHandles.java:1572)
	at java.lang.invoke.MethodHandles$Lookup.findSpecial(MethodHandles.java:1002)
	at ProxyDemo.lambda$0(ProxyDemo.java:18)
	... 2 more

You’re kidding, right?

So, there’s a solution (hack) that works on Java 8 but not on 9 or 10, and there’s a solution that works on Java 9 and 10, but not on Java 8.

A more thorough examination

So far, I’ve just been trying to run different things on different JDKs. The following class tries all combinations. It’s also available in this gist here.

Compile it with JDK 9 or 10, because it also tries the JDK 9+ API MethodHandles.privateLookupIn(). Use the following command, so that you can also run the class on JDK 8:

javac -source 1.8 -target 1.8 CallDefaultMethodThroughReflection.java

import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodHandles.Lookup;
import java.lang.invoke.MethodType;
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;


interface PrivateInaccessible {
    default void quack() {
        System.out.println(" -> PrivateInaccessible.quack()");
    }
}

public class CallDefaultMethodThroughReflection {
    interface PrivateAccessible {
        default void quack() {
            System.out.println(" -> PrivateAccessible.quack()");
        }
    }

    public static void main(String[] args) {
        System.out.println("PrivateAccessible");
        System.out.println("-----------------");
        System.out.println();
        proxy(PrivateAccessible.class).quack();

        System.out.println();
        System.out.println("PrivateInaccessible");
        System.out.println("-------------------");
        System.out.println();
        proxy(PrivateInaccessible.class).quack();
    }

    private static void quack(Lookup lookup, Class<?> type, Object proxy) {
        System.out.println("Lookup.in(type).unreflectSpecial(...)");

        try {
            lookup.in(type)
                  .unreflectSpecial(type.getMethod("quack"), type)
                  .bindTo(proxy)
                  .invokeWithArguments();
        }
        catch (Throwable e) {
            System.out.println(" -> " + e.getClass() + ": " + e.getMessage());
        }

        System.out.println("Lookup.findSpecial(...)");
        try {
            lookup.findSpecial(type, "quack", MethodType.methodType(void.class, new Class[0]), type)
                  .bindTo(proxy)
                  .invokeWithArguments();
        }
        catch (Throwable e) {
            System.out.println(" -> " + e.getClass() + ": " + e.getMessage());
        }
    }

    @SuppressWarnings("unchecked")
    private static <T> T proxy(Class<T> type) {
        return (T) Proxy.newProxyInstance(
            Thread.currentThread().getContextClassLoader(),
            new Class[] { type },
            (Object proxy, Method method, Object[] arguments) -> {
                System.out.println("MethodHandles.lookup()");
                quack(MethodHandles.lookup(), type, proxy);

                try {
                    System.out.println();
                    System.out.println("Lookup(Class)");
                    Constructor<Lookup> constructor = Lookup.class.getDeclaredConstructor(Class.class);
                    constructor.setAccessible(true);
                    constructor.newInstance(type);
                    quack(constructor.newInstance(type), type, proxy);
                }
                catch (Exception e) {
                    System.out.println(" -> " + e.getClass() + ": " + e.getMessage());
                }

                try {
                    System.out.println();
                    System.out.println("MethodHandles.privateLookupIn()");
                    quack(MethodHandles.privateLookupIn(type, MethodHandles.lookup()), type, proxy);
                }
                catch (Error e) {
                    System.out.println(" -> " + e.getClass() + ": " + e.getMessage());
                }

                return null;
            }
        );
    }
}

The output of the above program is:

Java 8

$ java -version
java version "1.8.0_141"
Java(TM) SE Runtime Environment (build 1.8.0_141-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.141-b15, mixed mode)

$ java CallDefaultMethodThroughReflection
PrivateAccessible
-----------------

MethodHandles.lookup()
Lookup.in(type).unreflectSpecial(...)
 -> PrivateAccessible.quack()
Lookup.findSpecial(...)
 -> class java.lang.IllegalAccessException: no private access for invokespecial: interface CallDefaultMethodThroughReflection$PrivateAccessible, from CallDefaultMethodThroughReflection

Lookup(Class)
Lookup.in(type).unreflectSpecial(...)
 -> PrivateAccessible.quack()
Lookup.findSpecial(...)
 -> PrivateAccessible.quack()

MethodHandles.privateLookupIn()
 -> class java.lang.NoSuchMethodError: java.lang.invoke.MethodHandles.privateLookupIn(Ljava/lang/Class;Ljava/lang/invoke/MethodHandles$Lookup;)Ljava/lang/invoke/MethodHandles$Lookup;

PrivateInaccessible
-------------------

MethodHandles.lookup()
Lookup.in(type).unreflectSpecial(...)
 -> class java.lang.IllegalAccessException: no private access for invokespecial: interface PrivateInaccessible, from PrivateInaccessible/package
Lookup.findSpecial(...)
 -> class java.lang.IllegalAccessException: no private access for invokespecial: interface PrivateInaccessible, from CallDefaultMethodThroughReflection

Lookup(Class)
Lookup.in(type).unreflectSpecial(...)
 -> PrivateInaccessible.quack()
Lookup.findSpecial(...)
 -> PrivateInaccessible.quack()

MethodHandles.privateLookupIn()
 -> class java.lang.NoSuchMethodError: java.lang.invoke.MethodHandles.privateLookupIn(Ljava/lang/Class;Ljava/lang/invoke/MethodHandles$Lookup;)Ljava/lang/invoke/MethodHandles$Lookup;

Java 9

$ java -version
java version "9.0.4"
Java(TM) SE Runtime Environment (build 9.0.4+11)
Java HotSpot(TM) 64-Bit Server VM (build 9.0.4+11, mixed mode)

$ java --illegal-access=deny CallDefaultMethodThroughReflection
PrivateAccessible
-----------------

MethodHandles.lookup()
Lookup.in(type).unreflectSpecial(...)
 -> PrivateAccessible.quack()
Lookup.findSpecial(...)
 -> PrivateAccessible.quack()

Lookup(Class)
 -> class java.lang.reflect.InaccessibleObjectException: Unable to make java.lang.invoke.MethodHandles$Lookup(java.lang.Class) accessible: module java.base does not "opens java.lang.invoke" to unnamed module @30c7da1e

MethodHandles.privateLookupIn()
Lookup.in(type).unreflectSpecial(...)
 -> PrivateAccessible.quack()
Lookup.findSpecial(...)
 -> PrivateAccessible.quack()

PrivateInaccessible
-------------------

MethodHandles.lookup()
Lookup.in(type).unreflectSpecial(...)
 -> class java.lang.IllegalAccessException: no private access for invokespecial: interface PrivateInaccessible, from PrivateInaccessible/package (unnamed module @30c7da1e)
Lookup.findSpecial(...)
 -> PrivateInaccessible.quack()

Lookup(Class)
 -> class java.lang.reflect.InaccessibleObjectException: Unable to make java.lang.invoke.MethodHandles$Lookup(java.lang.Class) accessible: module java.base does not "opens java.lang.invoke" to unnamed module @30c7da1e

MethodHandles.privateLookupIn()
Lookup.in(type).unreflectSpecial(...)
 -> PrivateInaccessible.quack()
Lookup.findSpecial(...)
 -> PrivateInaccessible.quack()

Java 10

$ java -version
java version "10" 2018-03-20
Java(TM) SE Runtime Environment 18.3 (build 10+46)
Java HotSpot(TM) 64-Bit Server VM 18.3 (build 10+46, mixed mode)

$ java --illegal-access=deny CallDefaultMethodThroughReflection
... same result as in Java 9

Conclusion

Getting this right is a bit tricky.

  • In Java 8, the best working approach is the hack that opens up the JDK’s internals by accessing a package-private Lookup constructor. This is the only way to consistently call default methods on both private-accessible and private-inaccessible interfaces from any location.
  • In Java 9 and 10, the best working approaches are Lookup.findSpecial() (didn’t work in Java 8) or the new MethodHandles.privateLookupIn() (didn’t exist in Java 8). The latter is required in case the interface is located in another module. That module will still need to open the interface’s package to the caller.

It’s fair to say that this is a bit of a mess.

According to Rafael Winterhalter (author of ByteBuddy), the “real” fix should go into a revised Proxy API.

I’m not sure if that would solve all the problems, but it should definitely be the case that an implementor shouldn’t worry about all of the above.

Also, clearly, this article doesn’t do the complete work, e.g. of testing whether the approaches still work if Duck is imported from another module. That will be the topic of another blog post.

Using jOOR

If you’re using jOOR (our reflection library, check it out here), the upcoming version 0.9.8 will include a fix for this:
https://github.com/jOOQ/jOOR/issues/49

The fix simply uses the unsafe reflection approach in Java 8, or the MethodHandles.privateLookupIn() approach in Java 9+. You can then write:

Reflect.on(new Object()).as(PrivateAccessible.class).quack();
Reflect.on(new Object()).as(PrivateInaccessible.class).quack();
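
The version switch described above could look roughly like this. This is a sketch, not jOOR’s actual code: the names TopLevelDuck, DefaultMethodInvokerSketch, privateLookup() and proxy() are made up for this example, and privateLookupIn() is called reflectively only so that the class still compiles and loads on Java 8:

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodHandles.Lookup;
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// A top-level (i.e. "private-inaccessible") interface
interface TopLevelDuck {
    default String quack() {
        return "Quack";
    }
}

final class DefaultMethodInvokerSketch {

    // Get a Lookup with private access to the interface
    static Lookup privateLookup(Class<?> type) throws Exception {
        try {
            // Java 9+: MethodHandles.privateLookupIn(Class, Lookup)
            Method m = MethodHandles.class.getMethod(
                "privateLookupIn", Class.class, Lookup.class);
            return (Lookup) m.invoke(
                null, type, MethodHandles.lookup());
        }
        catch (NoSuchMethodException java8) {
            // Java 8: the package-private constructor hack
            Constructor<Lookup> c =
                Lookup.class.getDeclaredConstructor(Class.class);
            c.setAccessible(true);
            return c.newInstance(type);
        }
    }

    // A proxy that delegates default methods to the interface
    @SuppressWarnings("unchecked")
    static <T> T proxy(Class<T> type) {
        return (T) Proxy.newProxyInstance(
            type.getClassLoader(),
            new Class<?>[] { type },
            (proxy, method, args) -> {
                if (!method.isDefault())
                    throw new UnsupportedOperationException(
                        method.getName());

                return privateLookup(type)
                    .unreflectSpecial(method, type)
                    .bindTo(proxy)
                    .invokeWithArguments(
                        args == null ? new Object[0] : args);
            }
        );
    }
}
```

With this in place, DefaultMethodInvokerSketch.proxy(TopLevelDuck.class).quack() returns the string from the default method. As noted in the conclusion, an interface from another module would additionally need its package opened to the caller.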

Map Reducing a Set of Values Into a Dynamic SQL UNION Query

Sounds fancy, right? But it’s a really nice and reasonable approach to doing dynamic SQL with jOOQ.

This blog post is inspired by a Stack Overflow question, where a user wanted to turn a set of values into a dynamic UNION query like this:

SELECT T.COL1
FROM T
WHERE T.COL2 = 'V1'
UNION
SELECT T.COL1
FROM T
WHERE T.COL2 = 'V2'
...
UNION
SELECT T.COL1
FROM T
WHERE T.COL2 = 'VN'

Note that both the Stack Overflow user and I are well aware of the possibility of using IN predicates :-) Let’s just assume, for the sake of argument, that the UNION query indeed outperforms the IN predicate in the user’s particular MySQL version and database. If you cannot accept this, just imagine a more complex use case.

The solution in Java is really very simple:

import static org.jooq.impl.DSL.*;
import java.util.*;
import org.jooq.*;

public class Unions {
    public static void main(String[] args) {
        List<String> list = Arrays.asList("V1", "V2", "V3", "V4");

        System.out.println(
            list.stream()
                .map(Unions::query)
                .reduce(Select::union));
    }

    // Dynamically construct a query from an input string
    private static Select<Record1<String>> query(String s) {
        return select(T.COL1).from(T).where(T.COL2.eq(s));
    }
}

The output is:

Optional[(
  select T.COL1
  from T
  where T.COL2 = 'V1'
)
union (
  select T.COL1
  from T
  where T.COL2 = 'V2'
)
union (
  select T.COL1
  from T
  where T.COL2 = 'V3'
)
union (
  select T.COL1
  from T
  where T.COL2 = 'V4'
)]

If you’re using JDK 9+ (which has Optional.stream()), you can further proceed to running the query fluently as follows:

List<String> list = Arrays.asList("V1", "V2", "V3", "V4");

try (Stream<Record1<String>> stream = list.stream()
    .map(Unions::query)
    .reduce(Select::union)
    .stream() // Optional.stream()!
    .flatMap(Select::fetchStream)) {
    ...
}

This way, if the list is empty, reduce will return an empty optional. Streaming that empty optional will result in not fetching any results from the database.
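
The behaviour of reduce() can be tried out without a database. The following sketch uses plain strings as a stand-in for jOOQ’s Select objects; the class UnionReduceSketch and its methods are made up for this example, and concatenating values into SQL strings like this is for illustration only, not something to do in real code:

```java
import java.util.List;
import java.util.Optional;

final class UnionReduceSketch {

    // Stand-in for the query(String) method above
    static String query(String s) {
        return "SELECT T.COL1 FROM T WHERE T.COL2 = '" + s + "'";
    }

    // Map each value to a query, then reduce pairwise with UNION.
    // Without an identity value, reduce() returns an Optional
    static Optional<String> union(List<String> values) {
        return values.stream()
                     .map(UnionReduceSketch::query)
                     .reduce((a, b) -> a + " UNION " + b);
    }
}
```

A single-element list yields that one query with no UNION at all, and an empty list yields an empty Optional, so there is simply nothing to execute.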

Benchmarking JDK String.replace() vs Apache Commons StringUtils.replace()

What’s better? Using the JDK’s String.replace() or something like Apache Commons Lang’s StringUtils.replace()?

In this article, I’ll compare the two, first in a profiling session using Java Mission Control (JMC), then in a benchmark using JMH, and we’ll see that Java 9 heavily improved things in this area.

Profiling using JMC

In a recent profiling session where I checked for any “obvious” bottlenecks in jOOQ, I discovered this nasty regular expression pattern instantiation:

Tons of int[] instances were allocated by a regular expression pattern. That’s weird, because in general, inside of jOOQ’s internals, special care is always taken to pre-compile any regular expressions that are needed in static members, e.g.:

private static final Pattern TYPE_NAME_PATTERN = 
  Pattern.compile("\\([^\\)]*\\)");

This allows for using the Pattern in a far more optimal way, than e.g. by using String.replaceAll():

// Much better, pattern is pre-compiled
TYPE_NAME_PATTERN.matcher(castTypeName).replaceAll("")

// Much worse, pattern is compiled *every time*
castTypeName.replaceAll("\\([^\\)]*\\)", "")

That should be clear to everyone. The price to pay for this is the fact that the pattern is stored “far away” in some static member, rather than being visible right where it is used, which is a bit less readable. At least in my opinion.
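
Put together as a small, self-contained sketch (the class PrecompiledPatternSketch and its method names are made up for this example), the two variants produce the same result and differ only in when the pattern is compiled:

```java
import java.util.regex.Pattern;

final class PrecompiledPatternSketch {

    // Compiled once, when the class is initialised
    private static final Pattern TYPE_NAME_PATTERN =
        Pattern.compile("\\([^\\)]*\\)");

    // Reuses the pre-compiled pattern on every call
    static String precompiled(String castTypeName) {
        return TYPE_NAME_PATTERN.matcher(castTypeName)
                                .replaceAll("");
    }

    // Compiles the same pattern again on every call
    static String adHoc(String castTypeName) {
        return castTypeName.replaceAll("\\([^\\)]*\\)", "");
    }
}
```

Both methods turn an input like varchar(10) into varchar.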

SIDENOTE: People tend to get all angry about premature optimisation and such. Yes, these optimisations are micro optimisations and aren’t always worth the trouble. But this article is about jOOQ, a library that does a lot of expression tree transformations, and it is important for jOOQ to eliminate even 1% “bottlenecks”, as they make a difference. So, please read this article in this context.

Consider also our previous post about this subject: Top 10 Easy Performance Optimisations in Java

What was the problem in jOOQ?

Now, what appears to be obvious when using regular expressions seems less obvious when using ordinary, constant string replacements, such as when calling String.replace(CharSequence, CharSequence), as was done in the linked jOOQ issue #6672. The relevant piece of code was escaping all inline strings that are sent to the SQL database, to prevent syntax errors and, of course, SQL injection:

static final String escape(Object val, Context<?> context) {
    String result = val.toString();

    if (needsBackslashEscaping(context.configuration()))
        result = result.replace("\\", "\\\\");

    return result.replace("'", "''");
}

We’re always escaping apostrophes by doubling them, and in some databases (e.g. MySQL), we often have to escape backslashes as well (unfortunately, not all ORMs seem to do this or even be aware of this MySQL “feature”).

Unfortunately as well, despite heavy use of Apache Commons Lang’s StringUtils.replace() in jOOQ’s internals, every now and then a String.replace(CharSequence) sneaks in, because it’s just so convenient to write.

Meh, does it matter?

Usually, in ordinary business logic, it shouldn’t (again – don’t optimise prematurely), but in jOOQ, which is essentially a SQL string manipulation library, it can get quite costly if a single replace call is done excessively (for good reasons, of course), and it is slower than it should be. And it is, prior to Java 9, when this method was optimised. I’ve done the profiling with Java 8, where internally, String.replace() uses a literal regex pattern (i.e. a pattern with a “literal” flag that is faster, but it is a pattern, nonetheless).
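To make that concrete, here is a minimal sketch of what the Java 8 implementation of String.replace() boils down to, as far as I remember the JDK 8 sources. The helper class Java8StyleReplace is hypothetical and only exists for illustration. The Pattern and Matcher calls are the regular java.util.regex API:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper mirroring the Java 8 behaviour of String.replace()
class Java8StyleReplace {

    static String replace(String input, CharSequence target, CharSequence replacement) {

        // A new Pattern (and Matcher) is created on every single call,
        // even though the LITERAL flag disables regex meta characters
        return Pattern.compile(target.toString(), Pattern.LITERAL)
                      .matcher(input)
                      .replaceAll(Matcher.quoteReplacement(replacement.toString()));
    }

    public static void main(String[] args) {
        System.out.println(replace("it's", "'", "''"));
    }
}
```

Every call allocates a fresh Pattern and Matcher, which is where the int[] allocations seen in the profiler come from.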

Not only does the method appear as a major offender in the GC allocation view, it also triggers quite some action in the “hot methods” view of JMC:

Those are quite a few Pattern methods. The percentages have to be understood in the context of a benchmark, running millions of queries against an H2 in-memory database, so the overhead is significant!

Using Apache Commons Lang’s StringUtils

A simple fix is to use Apache Commons Lang’s StringUtils instead:

static final String escape(Object val, Context<?> context) {
    String result = val.toString();

    if (needsBackslashEscaping(context.configuration()))
        result = StringUtils.replace(result, "\\", "\\\\");

    return StringUtils.replace(result, "'", "''");
}

Now, the pressure has changed significantly. The int[] allocation is barely noticeable in comparison:

And much fewer Pattern calls are made, overall.

Benchmarking using JMH

Profiling can be very useful to spot bottlenecks, but it needs to be read with care. It introduces some artefacts and slight overheads and it is not 100% accurate when sampling call stacks, which might lead to the wrong conclusions at times. This is why it is sometimes important to back claims by running an actual benchmark. And when benchmarking, please, don’t just loop 1 million times in a main() method. That will be very inaccurate, except for very obvious, order-of-magnitude scale differences.

I’m using JMH here, running the following simple benchmark:

package org.jooq.test.benchmark;

import org.apache.commons.lang3.StringUtils;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.infra.Blackhole;

@Fork(value = 3, jvmArgsAppend = "-Djmh.stack.lines=3")
@Warmup(iterations = 5)
@Measurement(iterations = 7)
public class StringReplaceBenchmark {

    private static final String SHORT_STRING_NO_MATCH = "abc";
    private static final String SHORT_STRING_ONE_MATCH = "a'bc";
    private static final String SHORT_STRING_SEVERAL_MATCHES = "'a'b'c'";
    private static final String LONG_STRING_NO_MATCH = 
      "abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc";
    private static final String LONG_STRING_ONE_MATCH = 
      "abcabcabcabcabcabcabcabcabcabcabca'bcabcabcabcabcabcabcabcabcabcabcabcabc";
    private static final String LONG_STRING_SEVERAL_MATCHES = 
      "abcabca'bcabcabcabcabcabc'abcabcabca'bcabcabcabcabcabca'bcabcabcabcabcabcabc";

    @Benchmark
    public void testStringReplaceShortStringNoMatch(Blackhole blackhole) {
        blackhole.consume(SHORT_STRING_NO_MATCH.replace("'", "''"));
    }

    @Benchmark
    public void testStringReplaceLongStringNoMatch(Blackhole blackhole) {
        blackhole.consume(LONG_STRING_NO_MATCH.replace("'", "''"));
    }

    @Benchmark
    public void testStringReplaceShortStringOneMatch(Blackhole blackhole) {
        blackhole.consume(SHORT_STRING_ONE_MATCH.replace("'", "''"));
    }

    @Benchmark
    public void testStringReplaceLongStringOneMatch(Blackhole blackhole) {
        blackhole.consume(LONG_STRING_ONE_MATCH.replace("'", "''"));
    }

    @Benchmark
    public void testStringReplaceShortStringSeveralMatches(Blackhole blackhole) {
        blackhole.consume(SHORT_STRING_SEVERAL_MATCHES.replace("'", "''"));
    }

    @Benchmark
    public void testStringReplaceLongStringSeveralMatches(Blackhole blackhole) {
        blackhole.consume(LONG_STRING_SEVERAL_MATCHES.replace("'", "''"));
    }

    @Benchmark
    public void testStringUtilsReplaceShortStringNoMatch(Blackhole blackhole) {
        blackhole.consume(StringUtils.replace(SHORT_STRING_NO_MATCH, "'", "''"));
    }

    @Benchmark
    public void testStringUtilsReplaceLongStringNoMatch(Blackhole blackhole) {
        blackhole.consume(StringUtils.replace(LONG_STRING_NO_MATCH, "'", "''"));
    }

    @Benchmark
    public void testStringUtilsReplaceShortStringOneMatch(Blackhole blackhole) {
        blackhole.consume(StringUtils.replace(SHORT_STRING_ONE_MATCH, "'", "''"));
    }

    @Benchmark
    public void testStringUtilsReplaceLongStringOneMatch(Blackhole blackhole) {
        blackhole.consume(StringUtils.replace(LONG_STRING_ONE_MATCH, "'", "''"));
    }

    @Benchmark
    public void testStringUtilsReplaceShortStringSeveralMatches(Blackhole blackhole) {
        blackhole.consume(StringUtils.replace(SHORT_STRING_SEVERAL_MATCHES, "'", "''"));
    }

    @Benchmark
    public void testStringUtilsReplaceLongStringSeveralMatches(Blackhole blackhole) {
        blackhole.consume(StringUtils.replace(LONG_STRING_SEVERAL_MATCHES, "'", "''"));
    }
}

Notice that I tried to run 2 x 3 different string replacement scenarios:

  • The string is “short”
  • The string is “long”

Cross joining (there, finally some SQL in this post!) the above with:

  • No match is found
  • One match is found
  • Several matches are found

That’s important because different optimisations can be implemented for those different cases, and in jOOQ’s usage, most strings probably contain no match at all.

I ran this benchmark once on Java 8:

$ java -version
java version "1.8.0_141"
Java(TM) SE Runtime Environment (build 1.8.0_141-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.141-b15, mixed mode)

And on Java 9:

$ java -version
java version "9"
Java(TM) SE Runtime Environment (build 9+181)
Java HotSpot(TM) 64-Bit Server VM (build 9+181, mixed mode)

Tagir Valeev was kind enough to remind me that this issue was supposed to be fixed in Java 9.

The results are:

Java 8

testStringReplaceLongStringNoMatch               thrpt   21    4809343.940 ±  66443.628  ops/s
testStringUtilsReplaceLongStringNoMatch          thrpt   21   25063493.793 ± 660657.256  ops/s

testStringReplaceLongStringOneMatch              thrpt   21    1406989.855 ±  43051.008  ops/s
testStringUtilsReplaceLongStringOneMatch         thrpt   21    6961669.111 ± 141504.827  ops/s

testStringReplaceLongStringSeveralMatches        thrpt   21    1103323.491 ±  17047.449  ops/s
testStringUtilsReplaceLongStringSeveralMatches   thrpt   21    3899108.777 ±  41854.636  ops/s

testStringReplaceShortStringNoMatch              thrpt   21    5936992.874 ±  68115.030  ops/s
testStringUtilsReplaceShortStringNoMatch         thrpt   21  171660973.829 ± 377711.864  ops/s

testStringReplaceShortStringOneMatch             thrpt   21    3267435.957 ± 240198.763  ops/s
testStringUtilsReplaceShortStringOneMatch        thrpt   21    9943846.428 ± 270821.641  ops/s

testStringReplaceShortStringSeveralMatches       thrpt   21    2313713.015 ±  28806.738  ops/s
testStringUtilsReplaceShortStringSeveralMatches  thrpt   21    5447065.933 ± 139525.472  ops/s

As can be seen, the difference is “catastrophic”. Apache Commons Lang’s StringUtils drastically outperforms the JDK’s String.replace() in every discipline, especially when no match is found in a short string! That’s because the library optimises for this particular case:

...
int end = searchText.indexOf(searchString, start);
if (end == INDEX_NOT_FOUND) {
    return text;
}
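To illustrate the idea, here is a minimal sketch of an indexOf()-based replacement with such a fast path. This is not the actual Apache Commons Lang source, just a simplified version of the same technique. The class name IndexOfReplace is made up for this example:

```java
// Hypothetical, simplified indexOf()-based replacement (not the real StringUtils code)
class IndexOfReplace {

    static String replace(String text, String search, String replacement) {
        if (text == null || text.isEmpty()
                || search == null || search.isEmpty()
                || replacement == null)
            return text;

        int start = 0;
        int end = text.indexOf(search, start);

        // Fast path: no match, no allocation, the input is returned as is
        if (end == -1)
            return text;

        StringBuilder sb = new StringBuilder(text.length() + 16);
        while (end != -1) {

            // Copy everything up to the match, then the replacement
            sb.append(text, start, end).append(replacement);
            start = end + search.length();
            end = text.indexOf(search, start);
        }

        // Copy the remainder after the last match
        sb.append(text, start, text.length());
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(replace("a'bc", "'", "''"));
    }
}
```

In the no-match case, nothing is allocated at all, which explains the huge difference for short strings without matches.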

Java 9

Things look a bit different for Java 9:

testStringReplaceLongStringNoMatch               thrpt   21   55528132.674 ±  479721.812  ops/s
testStringUtilsReplaceLongStringNoMatch          thrpt   21   55767541.806 ±  754862.755  ops/s

testStringReplaceLongStringOneMatch              thrpt   21    4806322.839 ±  217538.714  ops/s
testStringUtilsReplaceLongStringOneMatch         thrpt   21    8366539.616 ±  142757.888  ops/s

testStringReplaceLongStringSeveralMatches        thrpt   21    2685134.029 ±   78108.171  ops/s
testStringUtilsReplaceLongStringSeveralMatches   thrpt   21    3923819.576 ±  351103.020  ops/s

testStringReplaceShortStringNoMatch              thrpt   21  122398496.629 ± 1350086.256  ops/s
testStringUtilsReplaceShortStringNoMatch         thrpt   21  121139633.453 ± 2756892.669  ops/s

testStringReplaceShortStringOneMatch             thrpt   21   18070522.151 ±  498663.835  ops/s
testStringUtilsReplaceShortStringOneMatch        thrpt   21   11367395.622 ±  153377.552  ops/s

testStringReplaceShortStringSeveralMatches       thrpt   21    7548407.681 ±  168950.209  ops/s
testStringUtilsReplaceShortStringSeveralMatches  thrpt   21    5045065.948 ±  175251.545  ops/s

Java 9’s implementation is now similar to that of Apache Commons, with the same optimisation for non-matches:

public String replace(CharSequence target, CharSequence replacement) {
    String tgtStr = target.toString();
    String replStr = replacement.toString();
    int j = indexOf(tgtStr);
    if (j < 0) {
        return this;
    }
    ...

It is still quite a bit slower for matches in long strings, but faster for matches in short strings. The tradeoff for jOOQ will be to still prefer Apache Commons because:

  • Most people are still on Java 8 or less, currently
  • Most replacements won’t match and both implementations fare equally well for that in Java 9, but Apache Commons is much faster for this category in Java 8
  • If there’s a match and thus a replacement, the speed depends on the string length, where the faster implementation is currently undecided

Conclusion

This micro optimisation stuff matters in jOOQ because jOOQ is a library that does a lot of SQL string manipulation. Every allocation and every CPU cycle that is wasted when manipulating SQL strings slows down the library, and thus impacts all of its users. In a situation like this, it is definitely worth considering not using these useful JDK String methods, and opting for the much faster Apache Commons implementations instead.

Things have improved a lot in Java 9, in which case this can mostly be ignored. But if you still need to support Java 8 (we still support Java 6 in our commercial distributions!), then this has to be considered.

Are Java 8 Streams Truly Lazy? Not Completely!

In a recent article, I’ve shown that programmers should always apply a filter first, map later strategy with streams. The example I made there was this one:

hugeCollection
    .stream()
    .limit(2)
    .map(e -> superExpensiveMapping(e))
    .collect(Collectors.toList());

In this case, the limit() operation implements the filtering, which should take place before the mapping.

Several readers correctly mentioned that in this case, it doesn’t matter in which order we put the limit() and map() operations, because most operations are evaluated lazily in the Java 8 Stream API.

Or rather: The collect() terminal operation pulls values from the stream lazily, and as the limit(5) operation reaches the end, it will no longer produce new values, regardless of whether map() came before or after. This can be proven easily as follows:

import java.util.stream.Stream;

public class LazyStream {
    public static void main(String[] args) {
        Stream.iterate(0, i -> i + 1)
              .map(i -> i + 1)
              .peek(i -> System.out.println("Map: " + i))
              .limit(5)
              .forEach(i -> {});

        System.out.println();
        System.out.println();

        Stream.iterate(0, i -> i + 1)
              .limit(5)
              .map(i -> i + 1)
              .peek(i -> System.out.println("Map: " + i))
              .forEach(i -> {});
    }
}

The output of the above is:

Map: 1
Map: 2
Map: 3
Map: 4
Map: 5


Map: 1
Map: 2
Map: 3
Map: 4
Map: 5

But this isn’t always the case!

This optimisation is an implementation detail, and in general, it is not unwise to really apply the filter first, map later rule thoroughly, not relying on such an optimisation. In particular, the Java 8 implementation of flatMap() is not lazy. Consider the following logic, where we put a flatMap() operation in the middle of the stream:

import java.util.stream.Stream;

public class LazyStream {
    public static void main(String[] args) {
        Stream.iterate(0, i -> i + 1)
              .flatMap(i -> Stream.of(i, i, i, i))
              .map(i -> i + 1)
              .peek(i -> System.out.println("Map: " + i))
              .limit(5)
              .forEach(i -> {});

        System.out.println();
        System.out.println();

        Stream.iterate(0, i -> i + 1)
              .flatMap(i -> Stream.of(i, i, i, i))
              .limit(5)
              .map(i -> i + 1)
              .peek(i -> System.out.println("Map: " + i))
              .forEach(i -> {});
    }
}

The result is now:

Map: 1
Map: 1
Map: 1
Map: 1
Map: 2
Map: 2
Map: 2
Map: 2


Map: 1
Map: 1
Map: 1
Map: 1
Map: 2

So, the first Stream pipeline will map all the 8 flatmapped values prior to applying the limit, whereas the second Stream pipeline really limits the stream to 5 elements first, and then maps only those.

The reason for this is in the flatMap() implementation:

// In ReferencePipeline.flatMap()
try (Stream<? extends R> result = mapper.apply(u)) {
    if (result != null)
        result.sequential().forEach(downstream);
}

As you can see, the result of the flatMap() operation is consumed eagerly with a terminal forEach() operation, which will always produce all the four values in our case and send them to the next operation. So, flatMap() isn’t lazy, and thus the next operation after it will get all of its results. This is true for Java 8. Future Java versions might improve this, of course.
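The effect can also be measured rather than printed. The following sketch counts how many elements pass through map() before limit(5) cuts the stream short. The class name FlatMapLaziness is made up for this example. On Java 8, I would expect the count to be 8, matching the output above. On JDK 10 or later, where flatMap() was made lazier, I would expect 5, but please take that as an assumption to verify on your own JDK:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Hypothetical demo class: counts the elements mapped before limit() applies
class FlatMapLaziness {

    static int mappedBeforeLimit() {
        AtomicInteger mapped = new AtomicInteger();

        Stream.iterate(0, i -> i + 1)
              .flatMap(i -> Stream.of(i, i, i, i))
              .map(i -> {
                  // Count every element that reaches the mapping step
                  mapped.incrementAndGet();
                  return i + 1;
              })
              .limit(5)
              .forEach(i -> {});

        return mapped.get();
    }

    public static void main(String[] args) {
        System.out.println("Mapped elements: " + mappedBeforeLimit());
    }
}
```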

We’d better filter them first. And map later.

Update: flatMap() gets fixed in JDK 10

Thanks, Tagir Valeev, for pointing out that there’s a fix coming up:

Relevant links:

https://bugs.openjdk.java.net/browse/JDK-8075939
http://hg.openjdk.java.net/jdk/jdk10/rev/fca88bbbafb9

A Nice API Design Gem: Strategy Pattern With Lambdas

With Java 8 lambdas being available to us as a programming tool, there is a “new” and elegant way of constructing objects. I put “new” in quotes, because it’s not new. It used to be called the strategy pattern, but as I’ve written on this blog before, many GoF patterns will no longer be implemented in their classic OO way, now that we have lambdas.

A simple example from jOOQ

jOOQ knows a simple type called Converter. It’s a simple SPI, which allows users to implement custom data types and inject data type conversion into jOOQ’s type system. The interface looks like this:

public interface Converter<T, U> {
    U from(T databaseObject);
    T to(U userObject);
    Class<T> fromType();
    Class<U> toType();
}

Users will have to implement 4 methods:

  • Conversion from a database (JDBC) type T to the user type U
  • Conversion from the user type U to the database (JDBC) type T
  • Two methods providing a Class reference, to work around generic type erasure

Now, an implementation that converts hex strings (database) to integers (user type):

public class HexConverter implements Converter<String, Integer> {

    @Override
    public Integer from(String hexString) {
        return hexString == null 
            ? null 
            : Integer.parseInt(hexString, 16);
    }

    @Override
    public String to(Integer number) {
        return number == null 
            ? null 
            : Integer.toHexString(number);
    }

    @Override
    public Class<String> fromType() {
        return String.class;
    }

    @Override
    public Class<Integer> toType() {
        return Integer.class;
    }
}

That wasn’t difficult to write, but it’s quite boring to write this much boilerplate:

  • Why do we need to give this class a name?
  • Why do we need to override methods?
  • Why do we need to handle nulls ourselves?

Now, we could write some object oriented libraries, e.g. abstract base classes that take care at least of the fromType() and toType() methods, but much better: The API designer can provide a “constructor API”, which allows users to provide “strategies”, which is just a fancy name for “function”. One argument for each of the four methods: two Class references and two functions (i.e. lambdas). For example:

public interface Converter<T, U> {
    ...

    static <T, U> Converter<T, U> of(
        Class<T> fromType,
        Class<U> toType,
        Function<? super T, ? extends U> from,
        Function<? super U, ? extends T> to
    ) {
        return new Converter<T, U>() { ... boring code here ... };
    }

    static <T, U> Converter<T, U> ofNullable(
        Class<T> fromType,
        Class<U> toType,
        Function<? super T, ? extends U> from,
        Function<? super U, ? extends T> to
    ) {
        return of(
            fromType,
            toType,

            // Boring null handling code here
            t -> t == null ? null : from.apply(t),
            u -> u == null ? null : to.apply(u)
        );
    }
}

From now on, we can easily write converters in a functional way. For example, our HexConverter would become:

Converter<String, Integer> converter =
Converter.ofNullable(
    String.class,
    Integer.class,
    s -> Integer.parseInt(s, 16),
    Integer::toHexString
);

Wow! This is really nice, isn’t it? This is the pure essence of what it means to write a Converter. No more overriding, null handling, type juggling, just the bidirectional conversion logic.
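For completeness, the elided “boring code” could be filled in along these lines. This is a self-contained sketch using a stand-in interface called SimpleConverter (a made-up name, to avoid confusion with jOOQ’s actual Converter type, whose real implementation may differ):

```java
import java.util.function.Function;

// Stand-in for the Converter interface shown above (hypothetical name)
interface SimpleConverter<T, U> {
    U from(T databaseObject);
    T to(U userObject);
    Class<T> fromType();
    Class<U> toType();

    static <T, U> SimpleConverter<T, U> of(
        Class<T> fromType,
        Class<U> toType,
        Function<? super T, ? extends U> from,
        Function<? super U, ? extends T> to
    ) {
        // The "boring code": each method delegates to one of the arguments
        return new SimpleConverter<T, U>() {
            @Override public U from(T t) { return from.apply(t); }
            @Override public T to(U u) { return to.apply(u); }
            @Override public Class<T> fromType() { return fromType; }
            @Override public Class<U> toType() { return toType; }
        };
    }

    static <T, U> SimpleConverter<T, U> ofNullable(
        Class<T> fromType,
        Class<U> toType,
        Function<? super T, ? extends U> from,
        Function<? super U, ? extends T> to
    ) {
        // Null handling is wrapped around the user's functions
        return of(
            fromType,
            toType,
            t -> t == null ? null : from.apply(t),
            u -> u == null ? null : to.apply(u)
        );
    }
}

class SimpleConverterDemo {
    public static void main(String[] args) {
        SimpleConverter<String, Integer> hex = SimpleConverter.ofNullable(
            String.class,
            Integer.class,
            s -> Integer.parseInt(s, 16),
            Integer::toHexString
        );

        System.out.println(hex.from("ff"));
        System.out.println(hex.to(255));
        System.out.println(hex.from(null));
    }
}
```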

Other examples

A more famous example is the JDK 8 Collector.of() constructor, without which it would be much more tedious to implement a collector. For example, if we want to find the second largest element in a stream… easy!

for (int i : Stream.of(1, 8, 3, 5, 6, 2, 4, 7)
                   .collect(Collector.of(
    () -> new int[] { Integer.MIN_VALUE, Integer.MIN_VALUE },
    (a, t) -> {
        if (a[0] < t) {
            a[1] = a[0];
            a[0] = t;
        }
        else if (a[1] < t)
            a[1] = t;
    },
    (a1, a2) -> {
        throw new UnsupportedOperationException(
            "Say no to parallel streams");
    }
)))
    System.out.println(i);

Run this, and you get:

8
7

Bonus exercise: Make the collector parallel capable by implementing the combiner correctly. In a sequential-only scenario, we don’t need it (until we do, of course…).
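One possible solution to the exercise, as a sketch: the combiner simply feeds both values of the second array into the first one, reusing the accumulation logic. The class name TopTwo and its method names are made up for this example:

```java
import java.util.stream.Collector;
import java.util.stream.Stream;

// Hypothetical helper class: a parallel capable "two largest values" collector
class TopTwo {

    // Same logic as the accumulator above:
    // a[0] is the largest value seen so far, a[1] the second largest
    static void accumulate(int[] a, int t) {
        if (a[0] < t) {
            a[1] = a[0];
            a[0] = t;
        }
        else if (a[1] < t)
            a[1] = t;
    }

    static Collector<Integer, int[], int[]> collector() {
        return Collector.<Integer, int[]>of(
            () -> new int[] { Integer.MIN_VALUE, Integer.MIN_VALUE },
            (a, t) -> accumulate(a, t),

            // Combiner: merge the partial result a2 into a1.
            // Unused Integer.MIN_VALUE slots never replace anything
            (a1, a2) -> {
                accumulate(a1, a2[0]);
                accumulate(a1, a2[1]);
                return a1;
            }
        );
    }

    public static void main(String[] args) {
        for (int i : Stream.of(1, 8, 3, 5, 6, 2, 4, 7)
                           .parallel()
                           .collect(collector()))
            System.out.println(i);
    }
}
```

This again prints 8 and 7, regardless of how the parallel stream splits the input.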

Conclusion

The concrete examples are nice examples of API usage, but the key message is this:

If you have an interface of the form:

interface MyInterface {
    void myMethod1();
    String myMethod2();
    void myMethod3(String value);
    String myMethod4(String value);
}

Then, just add a convenience constructor to the interface, accepting Java 8 functional interfaces like this:

// You write this boring stuff
interface MyInterface {
    static MyInterface of(
        Runnable function1,
        Supplier<String> function2,
        Consumer<String> function3,
        Function<String, String> function4
    ) {
        return new MyInterface() {
            @Override
            public void myMethod1() {
                function1.run();
            }

            @Override
            public String myMethod2() {
                return function2.get();
            }

            @Override
            public void myMethod3(String value) {
                function3.accept(value);
            }

            @Override
            public String myMethod4(String value) {
                return function4.apply(value);
            }
        };
    }
}

As an API designer, you write this boilerplate only once. And your users can then easily write things like these:

// Your users write this awesome stuff
MyInterface.of(
    () -> { ... },
    () -> "hello",
    v -> { ... },
    v -> "world"
);

Easy! And your users will love you forever for this.

Should I Implement the Arcane Iterator.remove() Method? Yes You (Probably) Should

An interesting question was asked on reddit’s /r/java recently:

Should Iterators be used to modify a custom Collection?

Paraphrasing the question: The author wondered whether a custom java.util.Iterator that is returned from a mutable Collection.iterator() method should implement the weird Iterator.remove() method.

A totally understandable question.

What does Iterator.remove() do?

Few people ever use this method at all. For instance, if you want to implement a generic way to remove null values from an arbitrary Collection, this would be the most generic approach:

Collection<Integer> collection =
Stream.of(1, 2, null, 3, 4, null, 5, 6)
      .collect(Collectors.toCollection(ArrayList::new));

System.out.println(collection);

Iterator<Integer> it = collection.iterator();
while (it.hasNext())
    if (it.next() == null)
        it.remove();

System.out.println(collection);

The above program will print:

[1, 2, null, 3, 4, null, 5, 6]
[1, 2, 3, 4, 5, 6]

Somehow, this API usage does feel dirty. An Iterator seems to be useful to … well … iterate its backing collection. It’s really weird that it also allows for modifying it. It’s even weirder that it only offers removal. E.g. we cannot add a new element before or after the current element of iteration, or replace it.

Luckily, Java 8 provides us with a much better method on the Collection API directly, namely Collection.removeIf(Predicate).

The above iteration code can be re-written as such:

collection.removeIf(Objects::isNull);

OK, now should I implement remove() on my own iterators?

Yes, you should – if your custom collection is mutable. For a very simple reason. Check out the default implementation of Collection.removeIf():

default boolean removeIf(Predicate<? super E> filter) {
    Objects.requireNonNull(filter);
    boolean removed = false;
    final Iterator<E> each = iterator();
    while (each.hasNext()) {
        if (filter.test(each.next())) {
            each.remove();
            removed = true;
        }
    }
    return removed;
}

As I said. The most generic way to remove specific elements from a Collection is precisely to go by its Iterator.remove() method and that’s precisely what the JDK does. Subtypes like ArrayList may of course override this implementation because there’s a more performant alternative, but in general, if you write your own custom, modifiable collection, you should implement this method.
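As a minimal sketch of what this means in practice, here is a tiny array-backed collection (TinyBag is a made-up name) that extends AbstractCollection and implements Iterator.remove(). With that single method in place, the inherited removeIf() works out of the box:

```java
import java.util.AbstractCollection;
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Objects;

// Hypothetical custom, modifiable collection
class TinyBag<E> extends AbstractCollection<E> {
    private Object[] elements = new Object[8];
    private int size;

    @Override
    public boolean add(E e) {
        if (size == elements.length)
            elements = Arrays.copyOf(elements, size * 2);

        elements[size++] = e;
        return true;
    }

    @Override
    public int size() {
        return size;
    }

    @Override
    public Iterator<E> iterator() {
        return new Iterator<E>() {
            int cursor;        // Index of the element that next() will return
            boolean canRemove; // True only directly after a call to next()

            @Override
            public boolean hasNext() {
                return cursor < size;
            }

            @Override
            @SuppressWarnings("unchecked")
            public E next() {
                if (!hasNext())
                    throw new NoSuchElementException();

                canRemove = true;
                return (E) elements[cursor++];
            }

            @Override
            public void remove() {
                if (!canRemove)
                    throw new IllegalStateException();

                // Shift the tail left, overwriting the element returned last
                System.arraycopy(elements, cursor, elements, cursor - 1, size - cursor);
                elements[--size] = null;
                cursor--;
                canRemove = false;
            }
        };
    }

    public static void main(String[] args) {
        TinyBag<Integer> bag = new TinyBag<>();
        for (Integer i : new Integer[] { 1, 2, null, 3, 4, null, 5, 6 })
            bag.add(i);

        // Inherited from Collection, implemented via Iterator.remove()
        bag.removeIf(Objects::isNull);
        System.out.println(bag);
    }
}
```

Without the remove() override, the iterator would inherit Iterator’s default implementation, which throws an UnsupportedOperationException, and removeIf() would fail on the first matching element.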

And enjoy the ride into Java’s peculiar, historic caveats for which we all love the language.