Using jOOQ to write vendor agnostic SQL with JPA’s native query or @Formula

If your legacy JPA application is using occasional native queries or Hibernate @Formula or Spring Data @Query annotation with vendor specific native SQL embedded in it, you can use jOOQ’s parsing connection and parsing data source to translate between dialects, without having to go all in on your jOOQ adoption – though I think it’s inevitable once you see what jOOQ can do for you.

Now, let’s design a table like this:

CREATE TABLE author (
  id INT NOT NULL,
  first_name TEXT,
  last_name TEXT NOT NULL,

  CONSTRAINT pk_author PRIMARY KEY (id)
);

Now, you may want to write a native query on this table using JPA’s EntityManager.createNativeQuery(), mapping that to entities. You could be using jOOQ’s DSL API for this, but let’s say you’re not ready for a migration to jOOQ yet, or you want to use actual SQL as provided by your DBA, not jOOQ’s DSL.

So, in MariaDB, you might be writing something like this:

List<Author> result =
em.createNativeQuery("""
    select a.id, nvl(a.first_name, 'N/A') as first_name, a.last_name
    from t_author as a
    order by a.id
    """, Author.class)
  .getResultList();

Where your entity is defined like this:

@Entity
@Table(name = "author")
public class Author {
    @Id
    public int id;

    @Column(name = "first_name")
    public String firstName;

    @Column(name = "last_name")
    public String lastName;

    // Constructors, getters, setters, equals, hashCode, etc
}

The above runs just fine and produces all the authors in MariaDB, which implemented support for Oracle’s NVL() function. But what about Oracle itself? The query fails on Oracle with:

ORA-00933: SQL command not properly ended

This is because in Oracle, you can’t use the AS keyword to alias tables, only to alias columns. Sure, you can remove that, but what about NVL()? You want this to work on MySQL and SQL Server, as well, but they complain:

MySQL

SQL Error [1305] [42000]: FUNCTION test.nvl does not exist

SQL Server

SQL Error [195] [S0010]: ‘nvl’ is not a recognized built-in function name.

Now, you have these options:

  • Use jOOQ to generate the SQL string for you, using the DSL
  • Use JPQL instead of a native query (but then heavily rewrite it, because JPQL is much less powerful than SQL)
  • Try your luck writing actual vendor agnostic SQL, manually
  • Or…

jOOQ’s parsing connection

You could use jOOQ’s parsing connection, which acts as a proxy to your actual connection, intercepting every SQL statement on the JDBC level in order to translate it to the target dialect.

This is as simple as wrapping your existing JDBC Connection or DataSource as follows:

DataSource originalDataSource = ...
DataSource parsingDataSource = DSL
    .using(originalDataSource, dialect)
    .parsingDataSource();

That’s it! I mean, you could pass some additional configuration Settings after the dialect, but that’s about as easy as it gets. The new DataSource can now run your SQL query on all the aforementioned dialects, e.g. you might see this in your DEBUG logs:

On MySQL:

-- org.hibernate.SQL
   select a.id, nvl(a.first_name, 'N/A') as first_name, a.last_name 
   from t_author as a 
   order by a.id
-- org.jooq.impl.ParsingConnection Translating from:
   select a.id, nvl(a.first_name, 'N/A') as first_name, a.last_name
   from t_author as a 
   order by a.id
-- org.jooq.impl.ParsingConnection Translation cache miss: 
   select a.id, nvl(a.first_name, 'N/A') as first_name, a.last_name
   from t_author as a 
   order by a.id
-- org.jooq.impl.ParsingConnection Translating to: 
   select a.id, ifnull(a.first_name, 'N/A') as first_name, a.last_name
   from t_author as a 
   order by a.id

On SQL Server:

-- org.hibernate.SQL
   select a.id, nvl(a.first_name, 'N/A') as first_name, a.last_name 
   from t_author as a 
   order by a.id
-- org.jooq.impl.ParsingConnection Translating from:
   select a.id, nvl(a.first_name, 'N/A') as first_name, a.last_name
   from t_author as a 
   order by a.id
-- org.jooq.impl.ParsingConnection Translation cache miss: 
   select a.id, nvl(a.first_name, 'N/A') as first_name, a.last_name
   from t_author as a 
   order by a.id
-- org.jooq.impl.ParsingConnection] Translating to: 
   select a.id, coalesce(a.first_name, 'N/A') first_name, a.last_name 
   from author a 
   order by a.id

Hibernate was tricked by jOOQ! The NVL function was translated to MySQL’s IFNULL or SQL Server COALESCE, and the AS keyword was removed from the SQL Server query. These are just simple examples, your actual SQL may be much more complex. Play around with the feature set online, here.

Also, the Settings.cacheParsingConnectionLRUCacheSize flag, which defaults to 8192, makes sure the same query doesn’t get re-translated all the time so you won’t spend too much time in jOOQ’s parser.

@Formula too, not just native queries

A quick win in Hibernate when you want to project additional values, akin to SQL’s own computed columns, which are available in many SQL dialects, is the @Formula annotation, which can be added to any entity, like so. Assuming this additional column:

ALTER TABLE author ADD year_of_birth INT;

We might have the following amended entity:

@Entity
@Table(name = "author")
public class Author {
    @Id
    public int id;

    @Column(name = "first_name")
    public String firstName;

    @Column(name = "last_name")
    public String lastName;

    @Column(name = "year_of_birth")
    public Integer yearOfBirth;

    @Formula("year_of_birth between 1981 and 1996")
    public Boolean millenial;

    // Constructors, getters, setters, equals, hashCode, etc
}

But unfortunately, there are still so many RDBMS that don’t actually support boolean types, and the @Formula annotation is purely static, and doesn’t allow for vendor-specific overrides. Are we going to rewrite that SQL query manually to make sure a SQL-92, vendor agnostic native SQL fragment is present that works on all dialects?

Or will we just plug in jOOQ’s parsing connection again? Let’s try the latter with:

Author author = em.find(Author.class, 1);

The MySQL log contains:

-- org.hibernate.SQL
   select 
     jpaauthorw0_.id as id1_4_0_,
     jpaauthorw0_.first_name as first_na2_4_0_, 
     jpaauthorw0_.last_name as last_nam3_4_0_, 
     jpaauthorw0_.year_of_birth between 1981 and 1996 as formula1_0_ 
   from author jpaauthorw0_ 
   where jpaauthorw0_.id=?
-- org.jooq.impl.ParsingConnection Translating from: [...]
-- org.jooq.impl.ParsingConnection Translation cache miss: [...]
-- org.jooq.impl.ParsingConnection Translating to: 
   select 
     jpaauthorw0_.id as id1_4_0_, 
     jpaauthorw0_.first_name as first_na2_4_0_, 
     jpaauthorw0_.last_name as last_nam3_4_0_, 
     jpaauthorw0_.year_of_birth between 1981 and 1996 as formula1_0_ 
   from author as jpaauthorw0_ 
   where jpaauthorw0_.id = ?

As you can see, jOOQ re-added the AS keyword to alias the on MySQL for good measure, because we like to be explicit about aliasing, and because that’s the default for Settings.renderOptionalAsKeywordForTableAliases

Whereas the SQL Server log contains:

-- org.hibernate.SQL 
   select 
     jpaauthorw0_.id as id1_4_0_, 
     jpaauthorw0_.first_name as first_na2_4_0_, 
     jpaauthorw0_.last_name as last_nam3_4_0_, 
     jpaauthorw0_.year_of_birth between 1981 and 1996 as formula1_0_ 
   from author jpaauthorw0_ 
   where jpaauthorw0_.id=?
-- org.jooq.impl.ParsingConnection Translating from: [...]
-- org.jooq.impl.ParsingConnection Translation cache miss: [...]
-- org.jooq.impl.ParsingConnection Translating to: 
   select
     jpaauthorw0_.id id1_4_0_, 
     jpaauthorw0_.first_name first_na2_4_0_, 
     jpaauthorw0_.last_name last_nam3_4_0_, 
     case 
       when jpaauthorw0_.year_of_birth between 1981 and 1996 
         then 1 
       when not (jpaauthorw0_.year_of_birth between 1981 and 1996) 
         then 0 
     end formula1_0_ 
   from author jpaauthorw0_ 
   where jpaauthorw0_.id = ?

A NULL-safe BOOLEAN type emulation (because if YEAR_OF_BIRTH is NULL (i.e. UNKNOWN), then MILLENIAL must be NULL, too (i.e. UNKNOWN))

Spring Data @Query annotation

Another case where native SQL can appear in JPA integrations is Spring Data JPA with its @Query annotation, especially when used with @Query(nativeQuery = true). Just like Hibernate’s @Formula, this annotation is compile-time static, with no way to override the value of the native query at runtime, short of maybe subtyping repositories per dialect.

But why go through all that hassle. It’s always the same thing. Just patch the DataSource with jOOQ’s parsing connection or parsing data source, and you’re set.

Conclusion

Even if you’re not using jOOQ’s DSL API, you can profit from jOOQ in many ways in your existing JDBC, R2DBC, JPA, MyBatis, etc. based applications by hooking in the jOOQ parsing connection and translating your vendor specific input dialect to any number of configurable output dialects.

If jOOQ’s parser can’t handle a feature, chances are, you can work around this limitation using the ParseListener SPI, e.g. when you want to support a hypothetical LOGICAL_XOR predicate (which is supported natively in MySQL):

Query query = configuration
    .derive(ParseListener.onParseCondition(ctx -> {
        if (ctx.parseFunctionNameIf("LOGICAL_XOR")) {
            ctx.parse('(');
            Condition c1 = ctx.parseCondition();
            ctx.parse(',');
            Condition c2 = ctx.parseCondition();
            ctx.parse(')');

            return CustomCondition.of(c -> {
                switch (c.family()) {
                    case MARIADB:
                    case MYSQL:
                        c.visit(condition("{0} xor {1}", c1, c2));
                        break;
                    default:
                        c.visit(c1.andNot(c2).or(c2.andNot(c1)));
                        break;
            });
        }

        // Let the parser take over if we don't know the token
        return null;
    })
    .dsl()
    .parser()
    .parseQuery(
        "select * from t where logical_xor(t.a = 1, t.b = 2)"
    );
  
System.out.println(DSL.using(SQLDialect.MYSQL).render(query));
System.out.println(DSL.using(SQLDialect.ORACLE).render(query));

The above program will print:

-- MYSQL:
select * 
from t
where (t.a = 1 xor t.b = 2);

-- ORACLE:
select * 
from t 
where (t.a = 1 and not (t.b = 2)) or (t.b = 2 and not (t.a = 1));

So, profit from using jOOQ to migrate your application’s vendor specific SQL off one RDBMS onto another, or to support multiple RDBMS products in a single application, with or without using jOOQ’s DSL!

Side note: Query transformation

This isn’t the topic of this blog post, but once you have jOOQ parse your every SQL statement, you can also use jOOQ to transform this SQL and tamper with the expression tree, for example by implementing client side row level security. The possibilities are endless!

Crazy Translations of Simple SQL Expressions to Various SQL Dialects

SQL is standardised by ISO / IEC. We have wonderful standards, such as SQL-92, SQL:1999, SQL:2003, SQL:2008. Right? In theory. In practice, many things do not have an equivalence between the actual SQL implementations. This is very true for common function support – most of which are not part of any standard. Here’s an interesting insight into equivalent expressions for very simple functions. For this article, I’m using jOOQ, a library that models SQL as an internal domain-specific language in Java. jOOQ builds an AST model of your SQL statement in Java memory, before rendering it to the chosen SQL dialect. This helps standardising SQL, such that your programming model stays simple, while being able to access useful functionality. The following little Java program is used:

package org.jooq.compatibility;

import static org.jooq.SQLDialect.SQL99;
import static org.jooq.impl.DSL.*;

import java.sql.Date;
import java.util.EnumSet;

import org.jooq.QueryPart;
import org.jooq.SQLDialect;
import org.jooq.impl.DSL;

public class Compatibility {

    // The main method prints a couple of simple
    // and common SQL functions
    public static void main(String[] args) {
        print(sin(3));
        print(power(2, 4));
        print(sinh(3));
        print(lpad(inline("abc"), 3));
        print(dateDiff(currentDate(),
                       Date.valueOf("2000-01-01")));
        print(bitCount((byte) 5));
    }

    private static void print(QueryPart part) {
        System.out.println("Printing " + part);
        System.out.println("---------------------");

        // Get only SQLDialect families, don't
        // distinguish between
        // SQL Server 2008 or SQL Server 20012
        EnumSet<SQLDialect> dialects =
            EnumSet.noneOf(SQLDialect.class);
        for (SQLDialect dialect:SQLDialect.values())
            if (dialect != SQL99)
                dialects.add(dialect.family());

        // Render the passed in SQL clause to
        // all supported SQL dialects
        for (SQLDialect dialect: dialects)
            System.out.println(
                String.format("%1$9s : %2$s",
                dialect, DSL.using(dialect)
                            .renderInlined(part)
            ));

        System.out.println();
        System.out.println();
    }
}

Now check out this output!

Printing sin(3)

This is the most straightforward among the tested functions. It renders the same in every dialect.
      ASE : sin(3)
   CUBRID : sin(3)
      DB2 : sin(3)
    DERBY : sin(3)
 FIREBIRD : sin(3)
       H2 : sin(3)
   HSQLDB : sin(3)
   INGRES : sin(3)
  MARIADB : sin(3)
    MYSQL : sin(3)
   ORACLE : sin(3)
 POSTGRES : sin(3)
   SQLITE : sin(3)
SQLSERVER : sin(3)
   SYBASE : sin(3)
Note that SQLite doesn’t really support this function, though. In subsequent listings, only those dialects are listed that can really emulate the function in question. In the above list, ASE is Sybase ASE whereas SYBASE is Sybase SQL Anywhere.

Printing power(2, 4)

This gets a little trickier, as the power function isn’t supported in all databases. But it’s easy to emulate it using logarithms and exp.
      ASE : power(2, 4)
   CUBRID : power(2, 4)
      DB2 : power(2, 4)
    DERBY : exp((ln(2) * 4))
 FIREBIRD : power(2, 4)
       H2 : power(2, 4)
   HSQLDB : power(2, 4)
   INGRES : power(2, 4)
  MARIADB : power(2, 4)
    MYSQL : power(2, 4)
   ORACLE : power(2, 4)
 POSTGRES : power(2, 4)
SQLSERVER : power(2, 4)
   SYBASE : power(2, 4)

Printing sinh(3)

OK. Hyperbolic Sine isn’t exactly used every day. But when you need it, you better remember the formula. Do you? Here’s how this function renders:
      ASE : ((exp((3 * 2)) - 1) / (exp(3) * 2))
   CUBRID : ((exp((3 * 2)) - 1) / (exp(3) * 2))
      DB2 : sinh(3)
    DERBY : sinh(3)
 FIREBIRD : sinh(3)
       H2 : sinh(3)
   HSQLDB : ((exp((3 * 2)) - 1) / (exp(3) * 2))
   INGRES : ((exp((3 * 2)) - 1) / (exp(3) * 2))
  MARIADB : ((exp((3 * 2)) - 1) / (exp(3) * 2))
    MYSQL : ((exp((3 * 2)) - 1) / (exp(3) * 2))
   ORACLE : sinh(3)
 POSTGRES : ((exp((3 * 2)) - 1) / (exp(3) * 2))
   SQLITE : sinh(3)
SQLSERVER : ((exp((3 * 2)) - 1) / (exp(3) * 2))
   SYBASE : ((exp((3 * 2)) - 1) / (exp(3) * 2))

Printing lpad(‘abc’, 3)

Off to string functions, they’re useful every now and then. But even simple functions as LPAD aren’t supported in all databases. Here’s a couple of creative solutions using repeat.
      ASE : (replicate(' ', (3 - char_length('abc'))) || 'abc')
   CUBRID : lpad('abc', 3, ' ')
      DB2 : lpad('abc', 3, ' ')
 FIREBIRD : cast(lpad('abc', 3, ' ') as varchar(4000))
       H2 : lpad('abc', 3, ' ')
   HSQLDB : lpad('abc', 3, ' ')
   INGRES : lpad('abc', 3, ' ')
  MARIADB : lpad('abc', 3, ' ')
    MYSQL : lpad('abc', 3, ' ')
   ORACLE : lpad('abc', 3, ' ')
 POSTGRES : lpad('abc', 3, ' ')
   SQLITE : substr(replace(replace(substr(quote(zeroblob(((3 - length('abc') - 1 + length(' ')) / length(' ') + 1) / 2)), 3), '''', ''), '0', ' '), 1, (3 - length('abc'))) || 'abc'
SQLSERVER : (replicate(' ', (3 - len('abc'))) + 'abc')
   SYBASE : (repeat(' ', (3 - length('abc'))) || 'abc')
Eek! Did you see that emulation for SQLite? Awful. Some background can be seen here.

Printing datediff()

Now, datetime arithmetics is probably the least cross-vendor compatible area in SQL. You can only lose if you do not use a well integration-tested tool like jOOQ to abstract these functions. Here’s what the simple difference between two dates amounts to:
      ASE : datediff(day, '2000-01-01', current_date())
   CUBRID : current_date() - date '2000-01-01'
      DB2 : (days(current_date) - days(date '2000-01-01'))
    DERBY : {fn timestampdiff(sql_tsi_day, date('2000-01-01'), current_date) }
 FIREBIRD : datediff(day, date '2000-01-01', current_date)
       H2 : datediff('day', date '2000-01-01', current_date())
   HSQLDB : datediff('day', date '2000-01-01', current_date)
   INGRES : cast((current_date - date '2000-01-01') as integer)
  MARIADB : datediff(current_date(), date '2000-01-01')
    MYSQL : datediff(current_date(), date '2000-01-01')
   ORACLE : sysdate - date '2000-01-01'
 POSTGRES : current_date - date '2000-01-01'
   SQLITE : (strftime('%s', current_date) - strftime('%s', '2000-01-01')) / 86400
SQLSERVER : datediff(day, '2000-01-01', convert(date, current_timestamp))
   SYBASE : datediff(day, '2000-01-01', current date)
The good news is, almost every database has some means of actually implementing this. Except SQLite, where dividing by 86400 seconds is not really daylight savings save (!).

Printing bit_count(5)

Last but not least, let’s have a look at this monster. The above function counts the number of bits set to 1 in the argument value 5. It would result to 2. This example only deals with byte values (TINYINT). You can imagine what larger integer support would look like.
   CUBRID : cast(((5 & 1) + ((5 & 2) >> 1) + ((5 & 4) >> 2) + ((5 & 8) >> 3) + ((5 & 16) >> 4) + ((5 & 32) >> 5) + ((5 & 64) >> 6) + ((5 & -128) >> 7)) as int)
      DB2 : cast((bitand(5, 1) + (bitand(5, 2) / power(2, 1)) + (bitand(5, 4) / power(2, 2)) + (bitand(5, 8) / power(2, 3)) + (bitand(5, 16) / power(2, 4)) + (bitand(5, 32) / power(2, 5)) + (bitand(5, 64) / power(2, 6)) + (bitand(5, -128) / power(2, 7))) as integer)
 FIREBIRD : cast((bin_and(5, 1) + bin_shr(bin_and(5, 2), 1) + bin_shr(bin_and(5, 4), 2) + bin_shr(bin_and(5, 8), 3) + bin_shr(bin_and(5, 16), 4) + bin_shr(bin_and(5, 32), 5) + bin_shr(bin_and(5, 64), 6) + bin_shr(bin_and(5, -128), 7)) as integer)
       H2 : cast((bitand(5, 1) + (bitand(5, 2) / power(2, 1)) + (bitand(5, 4) / power(2, 2)) + (bitand(5, 8) / power(2, 3)) + (bitand(5, 16) / power(2, 4)) + (bitand(5, 32) / power(2, 5)) + (bitand(5, 64) / power(2, 6)) + (bitand(5, -128) / power(2, 7))) as int)
   HSQLDB : cast((bitand(5, 1) + (bitand(5, 2) / power(2, 1)) + (bitand(5, 4) / power(2, 2)) + (bitand(5, 8) / power(2, 3)) + (bitand(5, 16) / power(2, 4)) + (bitand(5, 32) / power(2, 5)) + (bitand(5, 64) / power(2, 6)) + (bitand(5, -128) / power(2, 7))) as int)
  MARIADB : bit_count(5)
    MYSQL : bit_count(5)
   ORACLE : cast((bitand(5, 1) + (bitand(5, 2) / power(2, 1)) + (bitand(5, 4) / power(2, 2)) + (bitand(5, 8) / power(2, 3)) + (bitand(5, 16) / power(2, 4)) + (bitand(5, 32) / power(2, 5)) + (bitand(5, 64) / power(2, 6)) + (bitand(5, -128) / power(2, 7))) as number(10))
 POSTGRES : cast(((5 & 1) + ((5 & 2) >> 1) + ((5 & 4) >> 2) + ((5 & 8) >> 3) + ((5 & 16) >> 4) + ((5 & 32) >> 5) + ((5 & 64) >> 6) + ((5 & -128) >> 7)) as int)
   SQLITE : cast(((5 & 1) + ((5 & 2) >> 1) + ((5 & 4) >> 2) + ((5 & 8) >> 3) + ((5 & 16) >> 4) + ((5 & 32) >> 5) + ((5 & 64) >> 6) + ((5 & -128) >> 7)) as int)
   SYBASE : cast(((5 & 1) + ((5 & 2) / power(2, 1)) + ((5 & 4) / power(2, 2)) + ((5 & 8) / power(2, 3)) + ((5 & 16) / power(2, 4)) + ((5 & 32) / power(2, 5)) + ((5 & 64) / power(2, 6)) + ((5 & -128) / power(2, 7))) as int)
The awful thing about the above is that not only the bit_count() function is hardly supported, but there’s a lot of lack of support for bitwise operations as well. Note, there is also a very interesting discussion on the jOOQ User Group about alternative, shorter ways to emulate the bit_count() function.

Scared?

Don’t be. There’s bigger minefields than function cross-vendor compatibility in SQL. If you want to know for yourself, download jOOQ and run your own function emulations with the above program. Stay tuned for future posts on these topics!