MyBatis’ Wicked Statement Builders

Now here’s one of the most wicked API’s I’ve seen in a while!

MyBatis is well-known as a database abstraction framework on top of JDBC, allowing for externalising SQL into files, loading them at appropriate places in your Java code. For those of you who like this approach, you may be used to statements similar to this one here, taken from the MyBatis documentation:

<select id="selectPerson" parameterType="int" resultType="hashmap">
  SELECT * FROM PERSON WHERE ID = #{id}
</select>

This is MyBatis alright, you think? You haven’t seen everything. MyBatis also has statement builders. Not just any type of statement builders. They make your Java code look like SQL. Check this out:

private String selectPersonSql() {
  BEGIN(); // Clears ThreadLocal variable
  SELECT("P.ID, P.USERNAME, P.PASSWORD, P.FULL_NAME");
  SELECT("P.LAST_NAME, P.CREATED_ON, P.UPDATED_ON");
  FROM("PERSON P");
  FROM("ACCOUNT A");
  INNER_JOIN("DEPARTMENT D on D.ID = P.DEPARTMENT_ID");
  INNER_JOIN("COMPANY C on D.COMPANY_ID = C.ID");
  WHERE("P.ID = A.ID");
  WHERE("P.FIRST_NAME like ?");
  OR();
  WHERE("P.LAST_NAME like ?");
  GROUP_BY("P.ID");
  HAVING("P.LAST_NAME like ?");
  OR();
  HAVING("P.FIRST_NAME like ?");
  ORDER_BY("P.ID");
  ORDER_BY("P.FULL_NAME");
  return SQL();
}

… jeez, that is one of the funniest ideas I’ve ever seen. A ThreadLocal to keep the SQL statement currently being rendered!! And an API just … err … adding Strings to an internal StringBuilder? Wow.

You can decide for yourself whether you like this or not. But you can’t say these guys aren’t creative 😉 Be sure to read the full docs to learn also about how to write INSERT, UPDATE and DELETE:

http://www.mybatis.org/core/statement-builders.html

FOR UPDATE simulation in SQL Server and CUBRID

Pessimistic locking is useful every now and then when you explicitly want to lock a given set of rows for a subsequent UPDATE, or for database-based inter-process synchronisation and locking. Normally, you would use the SQL Standard FOR UPDATE clause, which is supported in most databases.

A SQL standard example

-- This statement will produce a row-lock on three rows
-- ... or fail / timeout if the row-lock cannot be acquired
SELECT *
  FROM author
 WHERE id IN (3, 4, 5)
   FOR UPDATE

-- This statement will produce cell-locks on two cells in three rows
-- ... or fail / timeout if the cell-lock cannot be acquired
SELECT *
  FROM author
 WHERE id in (3, 4, 5)
   FOR UPDATE OF first_name, last_name

An Oracle syntax extension example

Oracle ships with some useful extensions to the FOR UPDATE clause:

-- This statement will produce a row-lock on three rows
-- skipping all locked records. This is very nice for queue tables where
-- you never want to fail on locks, but just skip to the next record
SELECT *
  FROM author
 WHERE id IN (3, 4, 5)
   FOR UPDATE SKIP LOCKED

Simulation of the FOR UPDATE clause

SQL Server supports FOR UPDATE only in cursors (as defined in the SQL standard). It also ships with proprietary SQL syntax extensions, such as WITH (updlock), that has some very peculiar “features”, such as locking the whole page of a record instead of just one record. The CUBRID database on the other hand doesn’t support pessimistic locking at all in SQL.

But this can be simulated with JDBC, as JDBC allows for specifying the flags TYPE_SCROLL_SENSITIVE, CONCUR_UPDATABLE for any statement, and then using ResultSet.updateXXX() methods to produce a cell-lock / row-lock. Here’s a simplified example in JDBC:

PreparedStatement stmt = connection.prepareStatement(
  "SELECT * FROM author WHERE id IN (3, 4, 5)",
  ResultSet.TYPE_SCROLL_SENSITIVE, 
  ResultSet.CONCUR_UPDATABLE);
ResultSet rs = stmt.executeQuery();

while (rs.next()) {

  // UPDATE the primary key for row-locks, or any other
  // columns for cell-locks
  rs.updateObject(1, rs.getObject(1));
  rs.updateRow();

  // Do more stuff with this record
}

Drawbacks of this approach

The main drawback of this approach is the fact that the database has to maintain a scrollable cursor, whose records are locked one by one. This can cause a major risk of deadlocks or race conditions if the JDBC driver can recover from the unsuccessful locking, if two Java threads execute the following statements:

-- thread 1
SELECT * FROM author ORDER BY id ASC;

-- thread 2
SELECT * FROM author ORDER BY id DESC;

So use this technique with care, possibly only ever locking single rows!

An example using jOOQ to abstract this simulation

The following example shows how jOOQ integration tests test pessimistic locking for all SQL dialects, including SQL Server and CUBRID:

// Two factories with different connections
Factory create1 = // ...
Factory create2 = // ...

// A synchronized list to check for correct execution order
final Vector<String> execOrder = new Vector<String>();

try {
    // This is the test case:
    // 0.0s: Both threads start
    // 0.0s: t1 sleeps for 2s
    // 0.0s: t2 locks the AUTHOR table
    // 0.1s: t2 sleeps for 4s
    // 2.0s: t1 blocks on the AUTHOR table (potentially failing here)
    // 4.0s: t2 commits and unlocks AUTHOR
    // 4.1s: t1 (if blocked) now locks the AUTHOR table

    // For simplicity, no ReentrantLocks and signals are used for
    // synchronisation, but simple Thread.sleep() calls.

    final Thread t1 = new Thread(new Runnable() {
        @Override
        public void run() {
            Thread.sleep(2000);
            execOrder.add("t1-block");
            try {
                create1
                    .select(AUTHOR.ID)
                    .from(AUTHOR)
                    .forUpdate()
                    .fetch();
            }

            // Some databases fail on locking, others lock for a while
            catch (Exception ignore) {
            }
            finally {
                execOrder.add("t1-fail-or-t2-commit");
            }
        }
    });

    final Thread t2 = new Thread(new Runnable() {
        @Override
        public void run() {
            execOrder.add("t2-exec");
            Result<?> result2 = create2
                .selectFrom(AUTHOR)
                .forUpdate()
                .fetch();
            assertEquals(2, result2.size());

            execOrder.add("t2-signal");

            try {
                Thread.sleep(4000);
            }
            catch (Exception ignore) {
            }
 
            execOrder.add("t1-fail-or-t2-commit");

            try {
                create2.getConnection().commit();
            }
            catch (Exception e) {}
        }
    });

    t1.start();
    t2.start();

    t1.join();
    t2.join();

    assertEquals(asList(
      "t2-exec", 
      "t2-signal", 
      "t1-block", 
      "t1-fail-or-t2-commit", 
      "t1-fail-or-t2-commit"), execOrder);
}

The above test shows how the FOR UPDATE clause is implemented or simulated for most databases in jOOQ. Some databases use a timeout on the lock acquisition before failing, while others fail immediately (Oracle allows for specifying that in the FOR UPDATE WAIT / NOWAIT clause)

OneWebSQL™: Another competitor in the SQL schema generation “business”

I’ve been observing OneWebSQL™ for a year now, impatiently expecting its first public version and official go-live. I first discovered it during my regular jOOQ marketing work. I’m always closely observing competition for several reasons:

  • It shows that the market is alive and active, which helps making jOOQ even more relevant
  • jOOQ can incorporate the best ideas of competitor products
  • It’s always fun to meet people who share similar opinions

All of the above is a big motivation to make jOOQ better. For more competitor comparisons, see also the following recent articles:

OneWebSQL™

OneWebSQL™ is one of the more interesting ones, from a professional perspective. Similar to QueryDSL, OneWebSQL™ is backed by a company who has developed a SQL code generation and execution framework over the past years. Unlike jOOQ and QueryDSL, however, OneWebSQL™ is neither free nor open source. So lets have a brief look at what OneWebSQL™ is and how it compares to jOOQ and QueryDSL

Comparing jOOQ, QueryDSL, OneWebSQL™

The below list of compared features and attributes is meant to be objective and unbiased, even if it was written by the developer of jOOQ. Here goes

License

jOOQ ASL 2.0, free and open source
QueryDSL ASL 2.0, free and open source
OneWebSQL™ Proprietary license, closed source and partially obfuscated. Free runtime, somewhat expensive source code generator

Vendor and community

jOOQ Independent developer, active community
QueryDSL Mysema, active community
OneWebSQL™ e-point, newcomer with undisclosed community

Scope

Feature jOOQ QueryDSL OneWebSQL™
Source code generation [1] yes yes yes
Active Records [2] yes partially
(through JPA)
partially
(through generated DAOs)
Procedure support [3] yes no partially
(some JDBC abstraction)
UDT support [3] yes no no
DSL [4] yes
(formal BNF notation)
partially
(chained methods)
partially
(chained methods)
DML support yes yes yes
DDL support no no no
Supported RDBMS 13 9
(or more, using JPA)
5
SQL dialect abstraction [5] yes yes partially
(simulates some syntax elements)
SQL dialect simulation [6] yes yes no
Scala examples [7] Partially
Proof of concept
Partially
Proof of concept
Probably (?)

Some explanations

  • [1]: jOOQ and QueryDSL can reverse-engineer database meta data directly from a database. OneWebSQL™ expects a proprietary model file
  • [2]: “Active Records” are record classes that can persist (store, refresh, delete) themselves or be passed to an EntityManager, DAO, etc for persistence.
  • [3]: Stored procedures (and especially functions) can be used directly in SQL
  • [4]: A domain-specific language is best defined using a BNF notation. This can be translated to Java interfaces as described in a recent blog post. Simpler fluent APIs can still be constructed using “chained methods”. Strictly speaking, that is not a domain-specific language, though
  • [5]: SQL dialects vary greatly in subtle ways. Any SQL building tool should be able to distinguish dialects. This blog contains many examples, such as RDBMS bind variable casting madness
  • [6]: Many syntax elements (and more specifically functions) can be simulated in dialects that do not natively support them. This blog contains many examples, such as The simulation of TRUNC() in Derby
  • [7]: All Java libraries can be used in Scala, even without any native API.

Conclusion

The above comparison was made at a high level. Without further knowledge of the specific needs for any given project, it doesn’t make sense to go into the details of every library’s capabilities. On a high level, one can draw a clear line between the three products:

jOOQ
jOOQ’s strength is the fact that it supports a huge variety of standard and vendor-specific SQL statements, clauses and functions, and that it is able to simulate these in dialects that do not support them. This makes jOOQ a good choice for applications that want to make extensive use of SQL.

QueryDSL
QueryDSL’s strength is the fact that it supports multiple backends (most importantly JPQL and HQL). It greatly facilitates the construction of criteria queries, reducing JPA’s verbosity.

OneWebSQL™
OneWebSQL™’s strength is probably the fact that you get professional support when you buy a license. I’m looking forward to hearing experiences with this new library

Simulation of TRUNC() in Derby

Derby is missing out a lot of functions from the set of functions that other databases usually provide. One example is the TRUNC(value, decimals) function. According to the Wikipedia, truncation can be achieved as such:

-- trunc(x, n) 
CASE WHEN x > 0 
THEN
  floor(power(10, n) * x) / power(10, n) 
ELSE 
  ceil(power(10, n) * x) / power(10, n) 
END

Unfortunately, there is no POWER(base, exponent) function in Derby either. But no problem, we can simulate that as well. Let’s consider the Wikipedia again and we’ll find:

power(b, x) = exp(x * ln(b))

If we substitute that into the original simulation, we get for Derby:

-- trunc(x, n) 
CASE WHEN x > 0 
THEN 
  floor(exp(n * ln(10)) * x) / exp(n * ln(10))
ELSE 
  ceil(exp(n * ln(10)) * x) / exp(n * ln(10)) 
END

Verbose, probably quite inefficient, but effective! Let’s run a short test, too:

create table test (x numeric(10, 5), n int);

insert into test values (11.111, 0);
insert into test values (11.111, 1);
insert into test values (11.111, 2);
insert into test values (11.111, -1);

select
  x, n, 
  case when x >= 0
  then
    floor(exp(n * ln(10)) * x) / exp(n * ln(10))
  else
    ceil(exp(n * ln(10)) * x) / exp(n * ln(10))
  end "trunc(x, n)"
from test;

The above yields

X N TRUNC(X, N)
11.111 0 11
11.111 1 11.1
11.111 2 11.11
11.111 -1 10

Martin Fowler on “The Vietnam of Computer Science”

It couldn’t be a better match by a more suited person for jOOQ. Martin Fowler expresses his feelings about SQL, NoSQL, Object-relational mapping on his blog post, which is copied on DZone:

http://java.dzone.com/articles/martin-fowler-orm-hate

Think about the following two slogans:

“The Vietnam of Computer Science” – “A Peace Treaty Between SQL and Java”

Swell, no? 🙂

How to simulate MySQL’s INSERT statement extensions

I have previously posted about the SQL MERGE statement, and how powerful it is here:
https://blog.jooq.org/2011/11/29/arcane-magic-with-the-sql2003-merge-statement/

Unfortunately, not all databases support this statement. Also, very often it is quite a bit of overkill, when what you want to do is to simply INSERT or UPDATE a single record, depending on whether it already exists.

MySQL’s nice INSERT statement extensions

MySQL has a couple of very nice extensions to the INSERT statement. Some of them are also supported by the CUBRID database. Let’s have a look at two features of the (simplified) syntax definition:

INSERT [IGNORE]
    [INTO] tbl_name [(col_name,...)]
    VALUES (expr,...),(...),...
    [ ON DUPLICATE KEY UPDATE
      col_name=expr
        [, col_name=expr] ... ]

 
Essentially, we’re looking at two keywords / clauses:

  1. IGNORE: This will cause the INSERT statement to silently fail with a warning, if we’re about to insert a duplicate key
  2. ON DUPLICATE KEY UPDATE: This will cause the INSERT statement to perform an update on all affected records, if we’re about to insert a duplicate key

The full documentation can be seen here:
http://dev.mysql.com/doc/refman/5.5/en/insert.html

How jOOQ models these statements in other dialects

jOOQ’s API conveniently allows for using those MySQL features directly in insert statements. An example

// Insert an author who has recently married and thus
// changed their last name. If we already have that author, then
// update their last name:

create.insertInto(AUTHOR, AUTHOR.ID, AUTHOR.LAST_NAME)
      .values(3, "Smith")
      .onDuplicateKeyUpdate()
      .set(AUTHOR.LAST_NAME, "Smith")
      .execute();

// Alternatively, we may not care if the author already
// exists, we just want to insert them without failure:

create.insertInto(AUTHOR, AUTHOR.ID, AUTHOR.LAST_NAME)
      .values(3, "Smith")
      .onDuplicateKeyIgnore()
      .execute();

Note how jOOQ doesn’t model the IGNORE keyword as a keyword between INSERT [IGNORE] INTO. This keeps the public API a bit more concise, while at the same time being more consistent with pre-existing functionality, both in the INSERT syntax, as well as in the Loader API:

https://blog.jooq.org/2011/08/23/loading-csv-data-with-jooq-2/

Now, databases that do not support these syntax elements can simulate them using MERGE. The above statements will render something like this on an Oracle database:

-- ON DUPLICATE KEY UPDATE statement
merge into "AUTHOR" 
using (select 1 from dual) 
on ("AUTHOR"."ID" = 3) 
when matched then update set "LAST_NAME" = 'Smith' 
when not matched then insert ("ID", "LAST_NAME") values (3, 'Smith')

-- ON DUPLICATE KEY IGNORE statement
merge into "AUTHOR" 
using (select 1 from dual) 
on ("AUTHOR"."ID" = 3) 
when not matched then insert ("ID", "LAST_NAME") values (3, 'Smith')

Conclusion

These practices allow you to write short and concise code using the jOOQ API while still being able to target several backend databases optimally using efficient SQL statements. This will help you increase both your development as well as your runtime performance!