Benchmarking JDK String.replace() vs Apache Commons StringUtils.replace()

Which is better: the JDK’s String.replace() or something like Apache Commons Lang’s StringUtils.replace()?

In this article, I’ll compare the two, first in a profiling session using Java Mission Control (JMC), then in a benchmark using JMH, and we’ll see that Java 9 heavily improved things in this area.

Profiling using JMC

In a recent profiling session where I checked for any “obvious” bottlenecks in jOOQ, I discovered this nasty regular expression pattern instantiation:

Tons of int[] instances were allocated by a regular expression pattern. That’s weird, because in general, inside of jOOQ’s internals, special care is always taken to pre-compile any regular expressions that are needed in static members, e.g.:

private static final Pattern TYPE_NAME_PATTERN = 
  Pattern.compile("\\([^\\)]*\\)");

This lets us use the Pattern far more efficiently than, for example, calling String.replaceAll():

// Much better, pattern is pre-compiled
TYPE_NAME_PATTERN.matcher(castTypeName).replaceAll("")

// Much worse, pattern is compiled *every time*
castTypeName.replaceAll("\\([^\\)]*\\)", "")

That should be clear to everyone. The price to pay for this is the fact that the pattern is stored “far away” in some static member, rather than being visible right where it is used, which is a bit less readable. At least in my opinion.
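To make the comparison concrete, here is a minimal, self-contained sketch of the two variants side by side. The class name TypeNameCleaner and its method names are made up for this illustration and are not part of jOOQ; only the regular expression is taken from the snippet above. Both methods return the same result, the difference lies only in how often the pattern is compiled.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only (not jOOQ code)
final class TypeNameCleaner {

    // Compiled once when the class is initialised, then reused by every call
    private static final Pattern TYPE_NAME_PATTERN =
        Pattern.compile("\\([^\\)]*\\)");

    // Reuses the pre-compiled pattern; only a Matcher is created per call
    static String stripPrecompiled(String castTypeName) {
        return TYPE_NAME_PATTERN.matcher(castTypeName).replaceAll("");
    }

    // Compiles the same regular expression again on every single call
    static String stripAdHoc(String castTypeName) {
        return castTypeName.replaceAll("\\([^\\)]*\\)", "");
    }

    public static void main(String[] args) {
        System.out.println(stripPrecompiled("VARCHAR(10)")); // VARCHAR
        System.out.println(stripAdHoc("DECIMAL(10, 2)"));    // DECIMAL
    }
}
```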

SIDENOTE: People tend to get all angry about premature optimisation and such. Yes, these optimisations are micro optimisations and aren’t always worth the trouble. But this article is about jOOQ, a library that does a lot of expression tree transformations, and it is important for jOOQ to eliminate even 1% “bottlenecks”, as they make a difference. So, please read this article in this context.

Consider also our previous post about this subject: Top 10 Easy Performance Optimisations in Java

What was the problem in jOOQ?

Now, what appears to be obvious when using regular expressions seems less obvious when using ordinary, constant string replacements, such as when calling String.replace(CharSequence, CharSequence). That is what happened in jOOQ issue #6672. The relevant piece of code escapes all inline strings that are sent to the SQL database, to prevent syntax errors and, of course, SQL injection:

static final String escape(Object val, Context<?> context) {
    String result = val.toString();

    if (needsBackslashEscaping(context.configuration()))
        result = result.replace("\\", "\\\\");

    return result.replace("'", "''");
}

We’re always escaping apostrophes by doubling them, and in some databases (e.g. MySQL), we often have to escape backslashes as well (unfortunately, not all ORMs seem to do this or even be aware of this MySQL “feature”).
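As a rough illustration of what this escaping produces, here is a stripped-down sketch of the same logic. The class EscapeSketch and the boolean parameter backslashEscaping are assumptions made for this example; the parameter stands in for the needsBackslashEscaping(context.configuration()) check, whose implementation isn’t shown here.

```java
// Hypothetical, simplified variant of the escape() method shown above
final class EscapeSketch {

    // backslashEscaping is an assumed stand-in for
    // needsBackslashEscaping(context.configuration())
    static String escape(Object val, boolean backslashEscaping) {
        String result = val.toString();

        // Backslashes must be doubled first, before apostrophes are touched
        if (backslashEscaping)
            result = result.replace("\\", "\\\\");

        // Apostrophes are always doubled
        return result.replace("'", "''");
    }

    public static void main(String[] args) {
        System.out.println(escape("O'Brien", false)); // O''Brien
        System.out.println(escape("a\\'b", true));    // a\\''b
    }
}
```

In the second call, the input consists of the four characters a, backslash, apostrophe, b. With backslash escaping turned on, both special characters end up doubled.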

Despite heavy use of Apache Commons Lang’s StringUtils.replace() in jOOQ’s internals, every now and then a String.replace(CharSequence, CharSequence) call sneaks in, because it’s just so convenient to write.

Meh, does it matter?

Usually, in ordinary business logic, it shouldn’t (again: don’t optimise prematurely). But jOOQ is essentially a SQL string manipulation library, so a single replace call that is executed very often (for good reasons, of course) can get quite costly if it is slower than it should be. And prior to Java 9, where this method was optimised, it is. I’ve done the profiling with Java 8, where String.replace() internally uses a literal regex pattern (i.e. a pattern with a “literal” flag, which is faster than an ordinary pattern, but it is a pattern nonetheless).
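For readers who don’t have the Java 8 sources at hand, the following sketch shows roughly what that implementation does, to the best of my knowledge of the JDK 8 source. The class name Java8StyleReplace is made up, and the method is written as a static helper rather than as a member of String, but the calls to Pattern and Matcher are the relevant part: a new Pattern and a new Matcher are created on every call, even when there is nothing to replace.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper mimicking the Java 8 behaviour of String.replace()
final class Java8StyleReplace {

    static String replace(String text, CharSequence target, CharSequence replacement) {
        // Pattern.LITERAL treats the target as plain text, not as a regex,
        // but a Pattern and a Matcher are still allocated on every call
        return Pattern.compile(target.toString(), Pattern.LITERAL)
                      .matcher(text)
                      .replaceAll(Matcher.quoteReplacement(replacement.toString()));
    }

    public static void main(String[] args) {
        System.out.println(replace("O'Brien", "'", "''")); // O''Brien
    }
}
```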

Not only does the method appear as a major offender in the GC allocation view, it also triggers quite some action in the “hot methods” view of JMC:

Those are quite a few Pattern methods. The percentages have to be understood in the context of a benchmark, running millions of queries against an H2 in-memory database, so the overhead is significant!

Using Apache Commons Lang’s StringUtils

A simple fix is to use Apache Commons Lang’s StringUtils instead:

static final String escape(Object val, Context<?> context) {
    String result = val.toString();

    if (needsBackslashEscaping(context.configuration()))
        result = StringUtils.replace(result, "\\", "\\\\");

    return StringUtils.replace(result, "'", "''");
}

Now, the pressure has changed significantly. The int[] allocation is barely noticeable in comparison:

And far fewer Pattern calls are made overall.

Benchmarking using JMH

Profiling can be very useful to spot bottlenecks, but it needs to be read with care. It introduces some artefacts and slight overheads, and it is not 100% accurate when sampling call stacks, which might lead to the wrong conclusions at times. This is why it is sometimes important to back claims by running an actual benchmark. And when benchmarking, please, don’t just loop 1 million times in a main() method. That will be very inaccurate, except for very obvious, order-of-magnitude scale differences.

I’m using JMH here, running the following simple benchmark:

package org.jooq.test.benchmark;

import org.apache.commons.lang3.StringUtils;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.infra.Blackhole;

@Fork(value = 3, jvmArgsAppend = "-Djmh.stack.lines=3")
@Warmup(iterations = 5)
@Measurement(iterations = 7)
public class StringReplaceBenchmark {

    private static final String SHORT_STRING_NO_MATCH = "abc";
    private static final String SHORT_STRING_ONE_MATCH = "a'bc";
    private static final String SHORT_STRING_SEVERAL_MATCHES = "'a'b'c'";
    private static final String LONG_STRING_NO_MATCH = 
      "abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc";
    private static final String LONG_STRING_ONE_MATCH = 
      "abcabcabcabcabcabcabcabcabcabcabca'bcabcabcabcabcabcabcabcabcabcabcabcabc";
    private static final String LONG_STRING_SEVERAL_MATCHES = 
      "abcabca'bcabcabcabcabcabc'abcabcabca'bcabcabcabcabcabca'bcabcabcabcabcabcabc";

    @Benchmark
    public void testStringReplaceShortStringNoMatch(Blackhole blackhole) {
        blackhole.consume(SHORT_STRING_NO_MATCH.replace("'", "''"));
    }

    @Benchmark
    public void testStringReplaceLongStringNoMatch(Blackhole blackhole) {
        blackhole.consume(LONG_STRING_NO_MATCH.replace("'", "''"));
    }

    @Benchmark
    public void testStringReplaceShortStringOneMatch(Blackhole blackhole) {
        blackhole.consume(SHORT_STRING_ONE_MATCH.replace("'", "''"));
    }

    @Benchmark
    public void testStringReplaceLongStringOneMatch(Blackhole blackhole) {
        blackhole.consume(LONG_STRING_ONE_MATCH.replace("'", "''"));
    }

    @Benchmark
    public void testStringReplaceShortStringSeveralMatches(Blackhole blackhole) {
        blackhole.consume(SHORT_STRING_SEVERAL_MATCHES.replace("'", "''"));
    }

    @Benchmark
    public void testStringReplaceLongStringSeveralMatches(Blackhole blackhole) {
        blackhole.consume(LONG_STRING_SEVERAL_MATCHES.replace("'", "''"));
    }

    @Benchmark
    public void testStringUtilsReplaceShortStringNoMatch(Blackhole blackhole) {
        blackhole.consume(StringUtils.replace(SHORT_STRING_NO_MATCH, "'", "''"));
    }

    @Benchmark
    public void testStringUtilsReplaceLongStringNoMatch(Blackhole blackhole) {
        blackhole.consume(StringUtils.replace(LONG_STRING_NO_MATCH, "'", "''"));
    }

    @Benchmark
    public void testStringUtilsReplaceShortStringOneMatch(Blackhole blackhole) {
        blackhole.consume(StringUtils.replace(SHORT_STRING_ONE_MATCH, "'", "''"));
    }

    @Benchmark
    public void testStringUtilsReplaceLongStringOneMatch(Blackhole blackhole) {
        blackhole.consume(StringUtils.replace(LONG_STRING_ONE_MATCH, "'", "''"));
    }

    @Benchmark
    public void testStringUtilsReplaceShortStringSeveralMatches(Blackhole blackhole) {
        blackhole.consume(StringUtils.replace(SHORT_STRING_SEVERAL_MATCHES, "'", "''"));
    }

    @Benchmark
    public void testStringUtilsReplaceLongStringSeveralMatches(Blackhole blackhole) {
        blackhole.consume(StringUtils.replace(LONG_STRING_SEVERAL_MATCHES, "'", "''"));
    }
}

Notice that I tried to run 2 x 3 different string replacement scenarios:

  • The string is “short”
  • The string is “long”

Cross joining (there, finally some SQL in this post!) the above with:

  • No match is found
  • One match is found
  • Several matches are found

That’s important because different optimisations can be implemented for those different cases, and in jOOQ’s case, most strings probably won’t contain any match at all.

I ran this benchmark once on Java 8:

$ java -version
java version "1.8.0_141"
Java(TM) SE Runtime Environment (build 1.8.0_141-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.141-b15, mixed mode)

And on Java 9:

$ java -version
java version "9"
Java(TM) SE Runtime Environment (build 9+181)
Java HotSpot(TM) 64-Bit Server VM (build 9+181, mixed mode)

Tagir Valeev was kind enough to remind me that this issue was supposed to be fixed in Java 9.

The results are:

Java 8

testStringReplaceLongStringNoMatch               thrpt   21    4809343.940 ±  66443.628  ops/s
testStringUtilsReplaceLongStringNoMatch          thrpt   21   25063493.793 ± 660657.256  ops/s

testStringReplaceLongStringOneMatch              thrpt   21    1406989.855 ±  43051.008  ops/s
testStringUtilsReplaceLongStringOneMatch         thrpt   21    6961669.111 ± 141504.827  ops/s

testStringReplaceLongStringSeveralMatches        thrpt   21    1103323.491 ±  17047.449  ops/s
testStringUtilsReplaceLongStringSeveralMatches   thrpt   21    3899108.777 ±  41854.636  ops/s

testStringReplaceShortStringNoMatch              thrpt   21    5936992.874 ±  68115.030  ops/s
testStringUtilsReplaceShortStringNoMatch         thrpt   21  171660973.829 ± 377711.864  ops/s

testStringReplaceShortStringOneMatch             thrpt   21    3267435.957 ± 240198.763  ops/s
testStringUtilsReplaceShortStringOneMatch        thrpt   21    9943846.428 ± 270821.641  ops/s

testStringReplaceShortStringSeveralMatches       thrpt   21    2313713.015 ±  28806.738  ops/s
testStringUtilsReplaceShortStringSeveralMatches  thrpt   21    5447065.933 ± 139525.472  ops/s

As can be seen, the difference is “catastrophic”. Apache Commons Lang’s StringUtils drastically outperforms the JDK’s String.replace() in every discipline, especially when no match is found in a short string, where its throughput is roughly 29 times higher! That’s because the library optimises for this particular case:

...
int end = searchText.indexOf(searchString, start);
if (end == INDEX_NOT_FOUND) {
    return text;
}
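The idea behind that optimisation can be sketched in a few lines of plain Java. This is not the actual Apache Commons Lang source, just a minimal illustration under my own assumptions (the class name IndexOfReplace is made up): search with indexOf() first, return the input string itself if nothing is found, and only allocate a StringBuilder once a match is known to exist.

```java
// Hypothetical sketch of an indexOf()-based replace with a no-match fast path
final class IndexOfReplace {

    static String replace(String text, String search, String replacement) {
        if (text == null || text.isEmpty()
                || search == null || search.isEmpty()
                || replacement == null)
            return text;

        int start = 0;
        int end = text.indexOf(search, start);

        // Fast path: no match means no allocation, the input is returned as is
        if (end == -1)
            return text;

        StringBuilder buf = new StringBuilder(text.length() + replacement.length());

        while (end != -1) {
            // Copy everything up to the match, then the replacement
            buf.append(text, start, end).append(replacement);
            start = end + search.length();
            end = text.indexOf(search, start);
        }

        // Copy the remainder after the last match
        buf.append(text, start, text.length());
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(replace("'a'b'c'", "'", "''")); // ''a''b''c''
    }
}
```

No regular expression machinery is involved at all, which is why the no-match case is so cheap: it costs a single indexOf() call.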

Java 9

Things look a bit different for Java 9:

testStringReplaceLongStringNoMatch               thrpt   21   55528132.674 ±  479721.812  ops/s
testStringUtilsReplaceLongStringNoMatch          thrpt   21   55767541.806 ±  754862.755  ops/s

testStringReplaceLongStringOneMatch              thrpt   21    4806322.839 ±  217538.714  ops/s
testStringUtilsReplaceLongStringOneMatch         thrpt   21    8366539.616 ±  142757.888  ops/s

testStringReplaceLongStringSeveralMatches        thrpt   21    2685134.029 ±   78108.171  ops/s
testStringUtilsReplaceLongStringSeveralMatches   thrpt   21    3923819.576 ±  351103.020  ops/s

testStringReplaceShortStringNoMatch              thrpt   21  122398496.629 ± 1350086.256  ops/s
testStringUtilsReplaceShortStringNoMatch         thrpt   21  121139633.453 ± 2756892.669  ops/s

testStringReplaceShortStringOneMatch             thrpt   21   18070522.151 ±  498663.835  ops/s
testStringUtilsReplaceShortStringOneMatch        thrpt   21   11367395.622 ±  153377.552  ops/s

testStringReplaceShortStringSeveralMatches       thrpt   21    7548407.681 ±  168950.209  ops/s
testStringUtilsReplaceShortStringSeveralMatches  thrpt   21    5045065.948 ±  175251.545  ops/s

Java 9’s implementation is now similar to that of Apache Commons, with the same optimisation for non-matches:

public String replace(CharSequence target, CharSequence replacement) {
    String tgtStr = target.toString();
    String replStr = replacement.toString();
    int j = indexOf(tgtStr);
    if (j < 0) {
        return this;
    }
    ...

The JDK implementation is still noticeably slower for matches in long strings, but faster for matches in short strings. The tradeoff for jOOQ will be to still prefer Apache Commons because:

  • Most people are still on Java 8 or less, currently
  • Most replacements won’t match and both implementations fare equally well for that in Java 9, but Apache Commons is much faster for this category in Java 8
  • If there’s a match and thus a replacement, the speed depends on the string length, where the faster implementation is currently undecided

Conclusion

This micro optimisation stuff matters in jOOQ because jOOQ is a library that does a lot of SQL string manipulation. Every allocation and every CPU cycle that is wasted when manipulating SQL strings slows down the library, and thus impacts all of its users. In a situation like this, it is definitely worth considering not using these useful JDK String methods, and opting for the much faster Apache Commons implementations instead.

Things have improved a lot in Java 9, where this can mostly be ignored. But if you still need to support Java 8 (we still support Java 6 in our commercial distributions!), then this has to be considered.

What you Didn’t Know About JDBC Batch

In our previous blog post “10 Common Mistakes Java Developers Make When Writing SQL”, we made a point about batching being important when inserting large data sets. In most databases and with most JDBC drivers, you can get a significant performance improvement when running a single prepared statement in batch mode, like this:

PreparedStatement s = connection.prepareStatement(
    "INSERT INTO author(id, first_name, last_name)"
  + "  VALUES (?, ?, ?)");

s.setInt(1, 1);
s.setString(2, "Erich");
s.setString(3, "Gamma");
s.addBatch();

s.setInt(1, 2);
s.setString(2, "Richard");
s.setString(3, "Helm");
s.addBatch();

s.setInt(1, 3);
s.setString(2, "Ralph");
s.setString(3, "Johnson");
s.addBatch();

s.setInt(1, 4);
s.setString(2, "John");
s.setString(3, "Vlissides");
s.addBatch();

int[] result = s.executeBatch();

Or with jOOQ:

create.batch(
        insertInto(AUTHOR, ID, FIRST_NAME, LAST_NAME)
       .values((Integer) null, null, null))
      .bind(1, "Erich", "Gamma")
      .bind(2, "Richard", "Helm")
      .bind(3, "Ralph", "Johnson")
      .bind(4, "John", "Vlissides")
      .execute();

What you probably didn’t know, however, is how dramatic the improvement really is, that JDBC drivers like that of MySQL don’t really support batching by default, and that Derby, H2, and HSQLDB don’t really seem to benefit from batching. James Sutherland has assembled this very interesting benchmark on his Java Persistence Performance blog, which can be summarised as follows:

Database     Performance gain when batched
DB2          503%
Derby        7%
H2           20%
HSQLDB       25%
MySQL        5%
MySQL        332% (with rewriteBatchedStatements=true)
Oracle       503%
PostgreSQL   325%
SQL Server   325%

The above table shows the improvement when comparing each database against itself for INSERT, not databases against each other. Regardless of the actual results, it can be said that batching is never worse than not batching for the data set sizes used in the benchmark.
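The second MySQL row refers to the rewriteBatchedStatements connection property of MySQL’s JDBC driver (Connector/J). As a minimal sketch, it can be appended to the JDBC URL or passed as a connection property. The host, port, schema name, and the class name MySqlBatchConfig below are placeholders I made up for this example; no connection is opened here.

```java
import java.util.Properties;

// Hypothetical configuration helper; host, port and schema are placeholders
final class MySqlBatchConfig {

    static final String BASE_URL = "jdbc:mysql://localhost:3306/test";

    // Variant 1: set the property directly in the JDBC URL
    static String urlWithRewrite() {
        return BASE_URL + "?rewriteBatchedStatements=true";
    }

    // Variant 2: pass it along with the credentials as a connection property,
    // e.g. to DriverManager.getConnection(BASE_URL, properties)
    static Properties propertiesWithRewrite(String user, String password) {
        Properties properties = new Properties();
        properties.setProperty("user", user);
        properties.setProperty("password", password);
        properties.setProperty("rewriteBatchedStatements", "true");
        return properties;
    }

    public static void main(String[] args) {
        System.out.println(urlWithRewrite());
    }
}
```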

See the full article here to see a more detailed interpretation of the above benchmark results, as well as results for UPDATE statements:
http://java-persistence-performance.blogspot.ch/2013/05/batch-writing-and-dynamic-vs.html