What You Didn’t Know About JDBC Batch

In our previous blog post “10 Common Mistakes Java Developers Make When Writing SQL”, we made the point that batching is important when inserting large data sets. In most databases and with most JDBC drivers, you can get a significant performance improvement by running a single prepared statement in batch mode, like this:

// One statement is prepared once and reused for every row
PreparedStatement s = connection.prepareStatement(
    "INSERT INTO author(id, first_name, last_name)"
  + "  VALUES (?, ?, ?)");

// Bind the values of one row, then queue it instead of executing it
s.setInt(1, 1);
s.setString(2, "Erich");
s.setString(3, "Gamma");
s.addBatch();

s.setInt(1, 2);
s.setString(2, "Richard");
s.setString(3, "Helm");
s.addBatch();

s.setInt(1, 3);
s.setString(2, "Ralph");
s.setString(3, "Johnson");
s.addBatch();

s.setInt(1, 4);
s.setString(2, "John");
s.setString(3, "Vlissides");
s.addBatch();

// Send all queued rows to the database; one array entry per queued row
int[] result = s.executeBatch();

Or with jOOQ:

create.batch(
        insertInto(AUTHOR, ID, FIRST_NAME, LAST_NAME)
       .values((Integer) null, null, null))
      .bind(1, "Erich", "Gamma")
      .bind(2, "Richard", "Helm")
      .bind(3, "Ralph", "Johnson")
      .bind(4, "John", "Vlissides")
      .execute();
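
The int[] returned by executeBatch() in the JDBC example is easy to overlook. According to the JDBC API, each entry is either an update count of zero or more, Statement.SUCCESS_NO_INFO (the statement succeeded but the driver does not know the row count), or Statement.EXECUTE_FAILED (which a driver may report through BatchUpdateException.getUpdateCounts() if it keeps going after a failure). Here is a minimal sketch of how such an array could be read. The BatchResults class and its method names are hypothetical helpers made up for this illustration; they are not part of JDBC or jOOQ.

```java
import java.sql.Statement;

/** Hypothetical helper (not part of JDBC or jOOQ) for reading executeBatch() results. */
final class BatchResults {

    private BatchResults() {}

    /**
     * Sums the update counts that the driver actually reported.
     * Entries equal to Statement.SUCCESS_NO_INFO or Statement.EXECUTE_FAILED
     * are negative markers, not row counts, so they are skipped.
     */
    static long knownRowCount(int[] updateCounts) {
        long total = 0;
        for (int count : updateCounts) {
            if (count >= 0) {
                total += count;
            }
        }
        return total;
    }

    /** Counts the entries that the driver flagged as failed. */
    static int failedStatements(int[] updateCounts) {
        int failed = 0;
        for (int count : updateCounts) {
            if (count == Statement.EXECUTE_FAILED) {
                failed++;
            }
        }
        return failed;
    }
}
```

With the four-row batch above, calling BatchResults.knownRowCount(result) would give 4 on a driver that reports one affected row per INSERT. Some drivers report SUCCESS_NO_INFO for batched statements instead, in which case the known row count is lower than the number of rows actually written.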

What you probably didn’t know, however, is how dramatic the improvement really is, and that some JDBC drivers, like MySQL’s, don’t really support batching by default, whereas Derby, H2, and HSQLDB don’t seem to benefit much from it. James Sutherland has assembled a very interesting benchmark on his Java Persistence Performance blog, which can be summarised as follows:

Database      Performance gain when batched
DB2           503%
Derby         7%
H2            20%
HSQLDB        25%
MySQL         5%
MySQL         332% (with rewriteBatchedStatements=true)
Oracle        503%
PostgreSQL    325%
SQL Server    325%

The above table shows the improvement for INSERT when comparing each database against itself, not databases against each other. Whatever the exact figures, batching was never slower than not batching for the data set sizes used in the benchmark.
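
The second MySQL row depends on rewriteBatchedStatements, a MySQL Connector/J connection property that is usually set in the JDBC URL. Here is a minimal sketch of appending it to an existing URL. The MySqlBatchUrl class, its method name, and the host and schema in the usage note are hypothetical and only serve as illustration.

```java
/** Hypothetical helper that enables batch rewriting in a MySQL JDBC URL. */
final class MySqlBatchUrl {

    private static final String PROPERTY = "rewriteBatchedStatements";

    private MySqlBatchUrl() {}

    /**
     * Returns the URL with rewriteBatchedStatements=true appended.
     * If the URL already mentions the property, it is returned unchanged
     * so that an explicit setting is never overridden.
     */
    static String withRewriteBatchedStatements(String jdbcUrl) {
        if (jdbcUrl.contains(PROPERTY + "=")) {
            return jdbcUrl;
        }
        // The first property follows '?', any further one follows '&'
        String separator = jdbcUrl.contains("?") ? "&" : "?";
        return jdbcUrl + separator + PROPERTY + "=true";
    }
}
```

For a hypothetical URL such as jdbc:mysql://localhost:3306/test, this yields jdbc:mysql://localhost:3306/test?rewriteBatchedStatements=true, which can then be passed to DriverManager.getConnection() as usual.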

See the full article for a more detailed interpretation of the above benchmark results, as well as results for UPDATE statements:
http://java-persistence-performance.blogspot.ch/2013/05/batch-writing-and-dynamic-vs.html
