Never Again Forget to Call .execute() in jOOQ

jOOQ’s DSL, like any fluent API, has one big caveat. It’s very easy to forget to call .execute(). And when you do, chances are, you’re going to be staring at your code for minutes, because everything looks perfect:

ctx.insertInto(T)
   .columns(T.A, T.B)
   .values(1, 2);

Staring… staring… staring… Why is it not inserting that row?

“Aaaah, not again!!”

This is how it’s done:

ctx.insertInto(T)
   .columns(T.A, T.B)
   .values(1, 2)
   .execute();

In principle, this kind of mistake can happen with any fluent API. E.g. StringBuilder

sb.append("a").append("b"); // Not consuming the result

Or streams:

Stream.of(1, 2).peek(System.out::println); // Not so much peeking

But it usually doesn’t happen as much, because the difference to jOOQ is that

  • jOOQ’s DML statements (INSERT, UPDATE, DELETE, MERGE) and DDL statements (CREATE, ALTER, DROP, TRUNCATE), and a few other produce a side effect
  • That side effect is the only thing we care about. The result (the update count) is mostly irrelevant

And as such, we don’t care about the result of execute(), which is an int. No one forgets to call fetch() on a jOOQ ResultQuery:

ctx.select(T.A, T.B)
   .from(T); // Well duh

Because without calling fetch() or something similar, we’re not getting any results, and we want the results, just like with the StringBuilder or the Stream. But we don’t want the execute() result.

As such, even we, when writing jOOQ’s integration tests, occasionally forget to call this silly little method.

No more!

When it happened again this week…

… I finally created an issue to think about it: https://github.com/jOOQ/jOOQ/issues/11718. And I created an issue to wonder if JetBrains could do something about it: https://youtrack.jetbrains.com/issue/IDEA-265263

And they already can! Apart from the org.jetbrains.annotations.Contract annotation, which is there for precisely this reason, apparently, it’s also possible to mimick the JSR-305 @CheckReturnValue annotation on every method “whose return value needs to be checked” (i.e. a method that has no side effect, or whose side effect is to mutate only “this“).

I added this annotation, I added it to all the relevant jOOQ API, which was a bit of yak shaving (https://github.com/jOOQ/jOOQ/commit/f2b529a2305f8c5f8d037776687887a5acd50b11) and voilà

image

As you can see, IntelliJ now warns the user whenever they forget to consume the result of any of jOOQ’s DSL methods (by calling execute(), passing it to some method that consumes it, etc.)

Thanks again to Tagir Valeev from JetBrains for walking me through this and even improving the @Contract annotation, which jOOQ might use in a future version.

Calculating Pagination Metadata Without Extra Roundtrips in SQL

When paginating results in SQL, we use standard SQL OFFSET .. FETCH or a vendor specific version of it, such as LIMIT .. OFFSET. For example:

SELECT first_name, last_name
FROM actor
ORDER BY actor_id
OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY

As always, we’re using the Sakila database for this example.

This is rather straightforward. It will give us page 2 out of N pages, with a page size of 10. But how do we calculate these values? How do we know we’re on page 2? How do we know the number of pages N? Can we calculate this without an extra round-trip e.g. to calculate the total number of actors:

-- Yuck, a second round-trip!
SELECT COUNT(*)
FROM actor

We can do it with a single SQL query and window functions, but before I explain how to do this, please consider reading this article on why OFFSET pagination is a bad thing for your performance

If you’re still convinced OFFSET pagination is what you need, as opposed to keyset pagination, let’s look at how to calculate the above meta data with SQL.

What Metadata Do We Need?

The metadata we typically need to paginate using OFFSET are these:

  • TOTAL_ROWS: The total number of records if we hadn’t paginated
  • CURRENT_PAGE: The current page we’re on
  • MAX_PAGE_SIZE: The maximum page size
  • ACTUAL_PAGE_SIZE: The actual page size (when on the last page)
  • ROW: The actual offsets of the returned rows
  • LAST_PAGE: Whether we are on the last page

The maximum page size is something we set to the query, so it doesn’t have to be calculated. Everything else needs to be calculated. And here’s how to do that in a single query

SELECT 
  t.first_name, 
  t.last_name,

  -- Calculate some meta data
  COUNT(*) OVER () AS actual_page_size,
  MAX(row) OVER () = total_rows AS last_page,

  -- Metadata from the subquery
  total_rows,
  row,
  ((row - 1) / :max_page_size) + 1 AS current_page
FROM (
  SELECT
    u.*,

    -- Calculate some meta data, repeating the ORDER BY from
    -- the original query
    COUNT(*) OVER () AS total_rows,
    ROW_NUMBER () OVER (ORDER BY u.actor_id) AS row

  -- Original query with all the predicates, joins, as a derived table
  FROM (
    SELECT *
    FROM actor
  ) AS u

  -- Ordering and pagination done here, where :offset is
  -- The maximum row value of the previous page + 1
  ORDER BY u.actor_id
  OFFSET :offset ROWS
  FETCH NEXT :max_page_size ROWS ONLY
) AS t
ORDER BY t.actor_id

That’s it. Impressive? Don’t be scared, I’ll walk you through these things step by step. And if you ever get confused by SQL syntax, consider this article explaining the logical order of SQL operations, which is, for our example:

  • FROM (recurse ordering for derived tables)
  • WHERE (which the example omitted)
  • WINDOW calculations
  • SELECT (the projection)
  • ORDER BY
  • OFFSET .. FETCH

Annotating our query, ordering operations logically as 1.1, 1.2, 2.1, 2.2, 2.3, 2.4, 2.5, 3.1, 3.2, 3.3, 3.4:

-- 3.3
SELECT 
  t.first_name, 
  t.last_name,

  -- 3.2
  COUNT(*) OVER () AS actual_page_size,
  MAX(row) OVER () = total_rows AS last_page,

  -- 3.3
  total_rows,
  row,
  ((row - 1) / :max_page_size) + 1 AS current_page
-- 3.1
FROM (
  -- 2.3
  SELECT
    u.*,

    -- 2.2
    COUNT(*) OVER () AS total_rows,
    ROW_NUMBER () OVER (ORDER BY u.actor_id) AS row

  -- 2.1
  FROM (

    -- 1.2
    SELECT *
    -- 1.1
    FROM actor
  ) AS u

  -- 2.4
  ORDER BY u.actor_id

  -- 2.5
  OFFSET :offset ROWS
  FETCH NEXT :max_page_size ROWS ONLY
) AS t

-- 3.4
ORDER BY t.actor_id

Step by step explanation

First off, the original query SELECT * FROM actor is wrapped as a derived table called u. You can do almost anything you want with this original query, applying only a few transformations:

  • 1.1, 1.2, 2.1: You need to project (SELECT clause) the columns that your original query projected, plus the columns that you need for ORDER BY. Because I’m projecting the right things in the outermost query, and because there’s no DISTINCT clause in the original query, I conveniently projected *. Alternatively, I could have projected FIRST_NAME, LAST_NAME (because that is projected in the original query), and ACTOR_ID (because that’s what we ORDER BY).
  • 2.2: On that derived table u, we’re now able to calculate some metadata, including TOTAL_ROWS as COUNT(*) OVER () and ROW as ROW_NUMBER () OVER (ORDER BY t.actor_id). The COUNT(*) OVER () window function has an empty window specification OVER (), meaning it calculates all the rows that result from the FROM, WHERE, GROUP BY, HAVING clauses, i.e. from u in our particular example. Without a second round-trip! The ROW_NUMBER () OVER (ORDER BY u.actor_id) orders all the rows in u by u.actor_id and assigns unique row numbers to them, according to that ordering.
  • 2.3: The window functions are calculated implicitly because they’re located in the projection of this derived table. We’re also again going to conveniently project everything from u.*, because the outer-most query is the one that projects columns explicitly.
  • 2.4: The original ordering has been moved here because there is no guarantee that the ordering would have been maintained if we had ordered the contents of u. But we need the ordering to calculate OFFSET .. FETCH right after
  • 2.5: This is where we paginate. The OFFSET corresponds to the maximum ROW value that we’ve encountered before. We start at 0, and with a page size of 15, we use 15 on the next page. Remember that while indexes are 1 based in SQL, OFFSET is 0 based.
  • 3.1: All of the above is wrapped again in a derived table, in order to make further calculations on it, namely:
  • 3.2: We can again calculate COUNT(*) OVER (), calculating the total number of rows that result from the FROM, WHERE, GROUP BY, HAVING clauses, i.e. from t in our particular example. This time, the number of rows can be no more than MAX_PAGE_SIZE, because that’s what the FETCH (or LIMIT) clause inside of t says. But it can be less, too, so this is what we use to calculate the ACTUAL_PAGE_SIZE. Finally, we compare MAX(row) OVER () = total_rows to see if we’re on the last page, meaning the highest value for row in the current page resulting from t is compared to the total row count. Another way to calculate the LAST_PAGE value would be if ACTUAL_PAGE_SIZE < MAX_PAGE_SIZE, i.e. COUNT(*) OVER () < :MAX_PAGE_SIZE.
  • 3.3: In addition to the usual projection of the original columns FIRST_NAME, LAST_NAME (we’re no longer projecting * now!), we’re doing some final calculations including dividing ROW / TOTAL_ROWS to get the page number. You can calculate more things, such as TOTAL_ROWS / MAX_PAGE_SIZE to get the TOTAL_PAGES value.
  • 3.4: Finally, we have to ORDER BY t.actor_id again, don’t let anyone tell you otherwise. In SQL, if you do not ORDER BY, then the ordering is undefined. Sure, it would be silly for an optimiser to re-order things without any good reason. We’ve already ordered the contents of our subqueries in 2.4, but there is no guarantee, that this ordering is stable. Just add DISTINCT, UNION, or a JOIN resulting in a hash join or some random other operator to your query, and the ordering breaks. So, always ORDER BY if ordering is important to you.

And we’re done!

How to do it in jOOQ?

This is the kind of use-case where jOOQ really really shines, because all of this is about dynamic SQL. Your actual business logic is contained in the deeply nested u table. Everything else is “presentation logic”, which is implemented in SQL for very obvious reasons: To improve performance.

And because you want to implement all of this only once in some library of yours, instead of having to play this game on every query, you make this kind of query dynamic. The utility will look like this:

// Assuming as always the usual static imports, including:
// import static org.jooq.impl.DSL.*;
// import com.generated.code.Tables.*;

static Select<?> paginate(
    DSLContext ctx,
    Select<?> original,
    Field<?>[] sort,
    int limit,
    int offset
) {
    Table<?> u = original.asTable("u");
    Field<Integer> totalRows = count().over().as("total_rows");
    Field<Integer> row = rowNumber().over().orderBy(u.fields(sort))
        .as("row");

    Table<?> t = ctx
        .select(u.asterisk())
        .select(totalRows, row)
        .from(u)
        .orderBy(u.fields(sort))
        .limit(limit)
        .offset(offset)
        .asTable("t");

    Select<?> result = ctx
        .select(t.fields(original.getSelect().toArray(Field[]::new)))
        .select(
            count().over().as("actual_page_size"),
            field(max(t.field(row)).over().eq(t.field(totalRows)))
                .as("last_page"),
            t.field(totalRows),
            t.field(row),
            t.field(row).minus(inline(1)).div(limit).plus(inline(1))
                .as("current_page"))
        .from(t)
        .orderBy(t.fields(sort));

    // System.out.println(result);
    return result;
}

Notice the println for debugging? It will print again something like our original query (but you’ll also see that in your debug log output, by default, with jOOQ):

select
  t.ACTOR_ID,
  t.FIRST_NAME,
  t.LAST_NAME,
  count(*) over () as actual_page_size,
  (max(t.row) over () = t.total_rows) as last_page,
  t.total_rows,
  t.row,
  ((t.row / 15) + 1) as current_page
from (
  select
    u.*,
    count(*) over () as total_rows,
    row_number() over (order by u.ACTOR_ID) as row
  from (
    select
      ACTOR.ACTOR_ID,
      ACTOR.FIRST_NAME,
      ACTOR.LAST_NAME
    from ACTOR
  ) as u
  order by u.ACTOR_ID
  offset 30 rows
  fetch next 15 rows only
) as t
order by t.ACTOR_ID

And here’s how you call the utility:

System.out.println(
    paginate(
        ctx,
        ctx.select(ACTOR.ACTOR_ID, ACTOR.FIRST_NAME, ACTOR.LAST_NAME)
           .from(ACTOR),
        new Field[] { ACTOR.ACTOR_ID },
        15,
        30
    ).fetch()
);

Notice that you can plug in arbitrary SQL fragments into that utility and paginate them. Irrespective of the complexity (including joins, other window functions, grouping, recursion, and what not), jOOQ will have you covered and will now paginate things for you.

The result of the above is:

+--------+----------+---------+----------------+---------+----------+----+------------+
|ACTOR_ID|FIRST_NAME|LAST_NAME|actual_page_size|last_page|total_rows| row|current_page|
+--------+----------+---------+----------------+---------+----------+----+------------+
|      31|SISSY     |SOBIESKI |              15|false    |       200|  31|           3|
|      32|TIM       |HACKMAN  |              15|false    |       200|  32|           3|
|      33|MILLA     |PECK     |              15|false    |       200|  33|           3|
|      34|AUDREY    |OLIVIER  |              15|false    |       200|  34|           3|
|      35|JUDY      |DEAN     |              15|false    |       200|  35|           3|
|      36|BURT      |DUKAKIS  |              15|false    |       200|  36|           3|
|      37|VAL       |BOLGER   |              15|false    |       200|  37|           3|
|      38|TOM       |MCKELLEN |              15|false    |       200|  38|           3|
|      39|GOLDIE    |BRODY    |              15|false    |       200|  39|           3|
|      40|JOHNNY    |CAGE     |              15|false    |       200|  40|           3|
|      41|JODIE     |DEGENERES|              15|false    |       200|  41|           3|
|      42|TOM       |MIRANDA  |              15|false    |       200|  42|           3|
|      43|KIRK      |JOVOVICH |              15|false    |       200|  43|           3|
|      44|NICK      |STALLONE |              15|false    |       200|  44|           3|
|      45|REESE     |KILMER   |              15|false    |       200|  45|           3|
+--------+----------+---------+----------------+---------+----------+----+------------+

Or, on the last page, with offset 195

+--------+----------+---------+----------------+---------+----------+----+------------+
|ACTOR_ID|FIRST_NAME|LAST_NAME|actual_page_size|last_page|total_rows| row|current_page|
+--------+----------+---------+----------------+---------+----------+----+------------+
|     196|BELA      |WALKEN   |               5|true     |       200| 196|          14|
|     197|REESE     |WEST     |               5|true     |       200| 197|          14|
|     198|MARY      |KEITEL   |               5|true     |       200| 198|          14|
|     199|JULIA     |FAWCETT  |               5|true     |       200| 199|          14|
|     200|THORA     |TEMPLE   |               5|true     |       200| 200|          14|
+--------+----------+---------+----------------+---------+----------+----+------------+

Conclusion

jOOQ is all about dynamic SQL. There’s hardly any SQL feature left that jOOQ doesn’t support. This includes window functions, for example, but also making sure that your dynamic SQL works on a large number of SQL dialects, irrespective of the little syntactic details.

You can build your own libraries to construct re-usable SQL elements from other SQL building blocks as this article has shown, to dynamically create single-query OFFSET pagination meta data calculation, without performing additional database round trips.

Simulating Latency with SQL / JDBC

I’ve run across a fun little trick to simulate latency in your development environments when testing some SQL queries. Possible use-cases including to validate that backend latency won’t bring down your frontend, or that your UX is still bearable, etc.

The solution is PostgreSQL and Hibernate specific, though to doesn’t have to be. Besides, it uses a stored function to work around the limitations of a VOID function in PostgreSQL, but that can be worked around differently as well, without storing anything auxiliary to the catalog.

To remove the Hibernate dependency, you can just use the pg_sleep function directly using a NULL predicate, but don’t try it like this!

select 1
from t_book
-- Don't do this!
where pg_sleep(1) is not null;

This will sleep 1 second per row (!). As can be seen in the explain plan. Let’s limit to 3 rows to see:

explain analyze
select 1
from t_book
where pg_sleep(1) is not null
limit 3;

And the result is:

Limit  (cost=0.00..1.54 rows=3 width=4) (actual time=1002.142..3005.374 rows=3 loops=1)
   ->  Seq Scan on t_book  (cost=0.00..2.05 rows=4 width=4) (actual time=1002.140..3005.366 rows=3 loops=1)
         Filter: (pg_sleep('1'::double precision) IS NOT NULL)
 Planning Time: 2.036 ms
 Execution Time: 3005.401 ms

As you can see, the whole query took about 3 seconds for 3 rows. In fact, this is also what happens in Gunnar’s example from the tweet, except that he was filtering by ID, which “helps” hide this effect.

We can use what Oracle calls scalar subquery caching, the fact that a scalar subquery can be reasonably expected to be side-effect free (despite the obvious side-effect of pg_sleep), meaning that some RDBMS cache its result per query execution.

explain analyze
select 1
from t_book
where (select pg_sleep(1)) is not null
limit 3;

The result is now:

Limit  (cost=0.01..1.54 rows=3 width=4) (actual time=1001.177..1001.178 rows=3 loops=1)
   InitPlan 1 (returns $0)
     ->  Result  (cost=0.00..0.01 rows=1 width=4) (actual time=1001.148..1001.148 rows=1 loops=1)
   ->  Result  (cost=0.00..2.04 rows=4 width=4) (actual time=1001.175..1001.176 rows=3 loops=1)
         One-Time Filter: ($0 IS NOT NULL)
         ->  Seq Scan on t_book  (cost=0.00..2.04 rows=4 width=0) (actual time=0.020..0.021 rows=3 loops=1)
 Planning Time: 0.094 ms
 Execution Time: 1001.223 ms

We’re now getting the desired one-time filter. However, I don’t really like this hack, because it depends on an optimisation, which is optional, not a formal guarantee. This may be good enough for a quick simulation of latency, but don’t depend on this kind of optimisation in production lightheartedly.

Another approach that seems to guarantee this behaviour would be to use a MATERIALIZED CTE:

explain
with s (x) as materialized (select pg_sleep(1))
select *
from t_book
where (select x from s) is not null;

I’m now again using a scalar subquery, because I somehow need to access the CTE, and I don’t want to place it in the FROM clause, where it would impact my projection.

The plan being:

Result  (cost=0.03..2.07 rows=4 width=943) (actual time=1001.289..1001.292 rows=4 loops=1)
   One-Time Filter: ($1 IS NOT NULL)
   CTE s
     ->  Result  (...) (actual time=1001.262..1001.263 rows=1 loops=1)
   InitPlan 2 (returns $1)
     ->  CTE Scan on s  (cost=0.00..0.02 rows=1 width=4) (actual time=1001.267..1001.268 rows=1 loops=1)
   ->  Seq Scan on t_book  (cost=0.03..2.07 rows=4 width=943) (actual time=0.015..0.016 rows=4 loops=1)
 Planning Time: 0.049 ms
 Execution Time: 1001.308 ms

Again, containing a one-time filter, which is what we want here.

Using a JDBC based approach

If your application is JDBC based, you don’t have to simulate the latency by tweaking the query. You can simply proxy JDBC in one way or another. Let’s look at this little program:

try (Connection c1 = db.getConnection()) {

    // A Connection proxy that intercepts preparedStatement() calls
    Connection c2 = new DefaultConnection(c1) {
        @Override
        public PreparedStatement prepareStatement(String sql) 
        throws SQLException {
            sleep(1000L);
            return super.prepareStatement(sql);
        }
    };

    long time = System.nanoTime();
    String sql = "SELECT id FROM book";

    // This call now has a 1 second "latency"
    try (PreparedStatement s = c2.prepareStatement(sql);
        ResultSet rs = s.executeQuery()) {
        while (rs.next())
            System.out.println(rs.getInt(1));
    }

    System.out.println("Time taken: " + 
       (System.nanoTime() - time) / 1_000_000L + "ms");
}

Where:

public static void sleep(long time) {
    try {
        Thread.sleep(time);
    }
    catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
}

For simplicity reasons, this uses jOOQ’s DefaultConnection which acts as a proxy, conveniently delegating all the methods to some delegate connection, allowing for overriding only specific methods. The output of the program is:

1
2
3
4
Time taken: 1021ms

This simulates the latency on the prepareStatement() event. Obviously, you’d be extracting the proxying into some utility in order not to clutter your code. You could even proxy all your queries in development and enable the sleep call only based on a system property.

Alternatively, we could also simulate it on the executeQuery() event:

try (Connection c = db.getConnection()) {
    long time = System.nanoTime();

    // A PreparedStatement proxy intercepting executeQuery() calls
    try (PreparedStatement s = new DefaultPreparedStatement(
        c.prepareStatement("SELECT id FROM t_book")
    ) {
        @Override
        public ResultSet executeQuery() throws SQLException {
            sleep(1000L);
            return super.executeQuery();
        };
    };

        // This call now has a 1 second "latency"
        ResultSet rs = s.executeQuery()) {
        while (rs.next())
            System.out.println(rs.getInt(1));
    }

    System.out.println("Time taken: " +
        (System.nanoTime() - time) / 1_000_000L + "ms");
}

This is now using the jOOQ convenience class DefaultPreparedStatement. If you need these, just add the jOOQ Open Source Edition dependency (there’s nothing RDBMS specific in these classes), with any JDBC based application, including Hibernate:

<dependency>
  <groupId>org.jooq</groupId>
  <artifactId>jooq</artifactId>
</dependency>

Alternatively, just copy the sources of the classes DefaultConnection or DefaultPreparedStatement if you don’t need the entire dependency, or you just proxy the JDBC API yourself.

A jOOQ based solution

If you’re already using jOOQ (and you should be!), you can do this even more easily, by implementing an ExecuteListener. Our program would now look like this:

try (Connection c = db.getConnection()) {
    DSLContext ctx = DSL.using(new DefaultConfiguration()
        .set(c)
        .set(new CallbackExecuteListener()
            .onExecuteStart(x -> sleep(1000L))
        )
    );

    long time = System.nanoTime();
    System.out.println(ctx.fetch("SELECT id FROM t_book"));
    System.out.println("Time taken: " +
        (System.nanoTime() - time) / 1_000_000L + "ms");
}

Still the same result:

+----+
|id  |
+----+
|1   |
|2   |
|3   |
|4   |
+----+
Time taken: 1025ms

The difference is that with a single intercepting callback, we can now add this sleep to all types of statements, including prepared statements, static statements, statements returning result sets, or update counts, or both.

jOOQ Internals: Pushing up SQL fragments

Over the past 13 years, jOOQ has accrued quite some internal features, which you, the user, are not exposed to. One very interesting feature is the capability for any arbitrary jOOQ expression tree element to push a SQL fragment up to a higher level.

How does it work?

The jOOQ expression tree model

When you write a jOOQ query, you’re really creating an expression tree of your SQL statement that just happens to look like actual SQL syntax. For example:

ctx.select(BOOK.ID, BOOK.TITLE)
   .from(BOOK)
   .where(BOOK.AUTHOR_ID.eq(1))
   .fetch();

In jOOQ, this is nothing else than an expression tree of this form

                     Select (Statement)
                             |
          +------------------+-----------------+
          |                  |                 |
        Select             From              Where
          |                  |                 |
    +-----+------+           |                 |
    |            |           |                 |
TableField  TableField     Table        CompareCondition
  |     |     |     |        |           |     |       |
Table Field Table Field    BOOK    TableField  EQ     Val
  |     |     |     |                |     |           |
BOOK    ID   BOOK TITLE            Table Field         1
                                     |     |
                                    BOOK AUTHOR_ID

When generating the SQL, jOOQ mostly just traverses the tree brepth first (just kidding. It’s mostly depth first, but some child elements on the same level are often considered, prior to going down one level), collecting each element into a StringBuilder to the expected form:

SELECT book.id, book.title
FROM book
WHERE book.author_id = 1

Each of these so-called QueryPart expression tree elements can decide for itself how it’s going to render its SQL. For example, the CompareCondition will roughly generate this sequence:

<lhs> <operator> <rhs>

… further delegating SQL generation to its children, whatever they may be. A TableField will decide whether to fully / partially / or not qualify its Field reference at all, etc., e.g. based on the schema mapping (multi tenancy) feature.

If you’re using functions, such as Substring, that function can decide on its own how it should generate its SQL. As can be seen in the manual, these are all the same:

-- ACCESS
mid('hello world', 7)

-- ASE, SQLDATAWAREHOUSE, SQLSERVER
substring('hello world', 7, 2147483647)

-- AURORA_MYSQL, AURORA_POSTGRES, COCKROACHDB, CUBRID, H2, 
-- HANA, HSQLDB, IGNITE, MARIADB, MEMSQL, MYSQL, POSTGRES, REDSHIFT, 
-- SNOWFLAKE, SYBASE, VERTICA
substring('hello world', 7)

-- DB2, DERBY, INFORMIX, ORACLE, SQLITE
substr('hello world', 7)

-- FIREBIRD, TERADATA
substring('hello world' FROM 7)

This way, a jOOQ expression tree can emulate any syntax in any dialect.

But what if the emulation is not local?

Non local emulations

Sometimes, a QueryPart needs to assume the presence of a non-local syntactic element to work. A recent case was https://github.com/jOOQ/jOOQ/issues/11366.

When writing this procedural logic in jOOQ:

for_(i).in(1, 10).loop(
    insertInto(t).columns(a).values(i)
)

Of course, you wouldn’t do this. You’d write a bulk insert statement instead and solve this with SQL only! But you have your reasons, right?

Then, the indexed FOR loop may have to be emulated in some dialects, using an equivalent WHILE statement. So, instead of this straight forward procedural SQL generation, which we might get in Oracle for example:

FOR i IN 1 .. 10 LOOP
  INSERT INTO t (a) VALUES (i);
END LOOP;

… we generate this in MySQL, more or less:

DECLARE i INT DEFAULT 1;
WHILE i <= 10 DO
  INSERT INTO t (a) VALUES (i);
  SET i = i + 1;
END WHILE;

This can still be done completely locally, as shown before. There’s a FOR expression tree element that can locally generate the DECLARE and WHILE query parts, instead. But what if local variables aren’t possible? What if there’s no block scope, as in Firebird?

In Firebird, all your local variables must be declared in a top level declaration section. If we run the above in an anonymous block, the correct generated procedural SQL would be this:

EXECUTE BLOCK AS
  DECLARE i INT;
BEGIN
  :i = 1; -- The loop variable must still be initialised locally
  WHILE (:i <= 10) DO BEGIN
    INSERT INTO t (a) VALUES (i);
    :i = :i + 1;
  END
END

This is still the case when we further nest the loop in procedural control flow elements, such as:

for_(i).in(1, 10).loop(
    for_(j).in(1, 10).loop(
        insertInto(t).columns(a, b).values(i, j)
    )
)

Of course, you still wouldn’t do this. You’d write a bulk insert statement from a cartesian product instead and solve this with SQL only! But alas, let’s keep the example simple

We now have nested variable declarations, which still work well in MySQL

DECLARE i INT DEFAULT 1;
WHILE i <= 10 DO 
  BEGIN
    DECLARE j INT DEFAULT 1;
    WHILE j <= 10 DO
      INSERT INTO t (a, b) VALUES (i, j);
      SET j = j + 1;
    END WHILE;
  END
  SET i = i + 1;
END WHILE;

But in Firebird, the declarations will both have to be pushed up to the top:

EXECUTE BLOCK AS
  DECLARE i INT;
  DECLARE j INT;
BEGIN
  :i = 1; -- The loop variable must still be initialised locally
  WHILE (:i <= 10) DO BEGIN
    :j = 1; -- The loop variable must still be initialised locally
    WHILE (:j <= 10) DO BEGIN
      INSERT INTO t (a, b) VALUES (i, j);
      :j = :j + 1;
    END
    :i = :i + 1;
  END
END

This doesn’t handle all the edge cases yet (e.g. some dialects allow for “hiding” local variables, such as PL/SQL) but it goes a long way already for simple procedures, functions, and triggers, all of which will be supported starting in jOOQ 3.15.

How does it work?

Alternative 1: Expression tree transformation

There are numerous ways to make such an expression tree transformation work. In the long run, we will re-design our internal expression tree model, and make it available also as a public API to those who wish to use the jOOQ parser and expression tree transformation as a standalone product. To some extent, this is already possible using the VisitListener SPI as shown in this post about row level security, but the current implementation is complex.

Also, it is relatively rare (so far) for an expression tree to require non-local transformation. This means that eagerly trying to look for possible candidates every time is probably overkill.

Alternative 2: Lazy expression tree transformation

We could transform the expression tree “lazily”, i.e. still assuming it is unnecessary, and when it is, throw an exception and start again. We actually do this, the “pattern” in jOOQ is called a ControlFlowSignal, and it is mainly used to work around the maximum number of bind parameters per statement limit of different dialects. I.e. we just count bind values, and if there are more than 2000 in SQL Server (2100 are supported by SQL Server, but only 2000 by jtds), then we just re-generate the SQL from scratch using inline values in a static statement.

As always with jOOQ, you can re-configure these limits to your own value.

Another case is when you forgot to turn on the ROWNUM to LIMIT transformation when migrating off Oracle’s old ROWNUM filtering. It would be silly to eagerly search for ROWNUM instances every time. Instead, we just re-generate the entire SQL query when we encounter one, and when you’re not using Oracle.

The assumption here is that these things happens very rarely, and if they do, you haven’t thought of it, and you don’t want the query to fail in production. The fact that a probably already slow query takes just a bit more time for jOOQ to generate is a price worth paying for the query to still Just Work™

Alternative 3: Patching generated SQL strings

Now, this is what we are actually doing.

Better assume almost all SQL transformations are local (as in the Substring example), and patch the SQL in case they’re not. In the end, we’re just generating SQL strings! And as such, why not introduce an infrastructure where we can put special markers to special text areas, and then replace that area with alternative SQL content.

Without the fix #11366, the generated code may have looked like this:

EXECUTE BLOCK AS
  -- Special marker here
BEGIN 
  -- Entering scope
  DECLARE i INT DEFAULT 1;
  WHILE (:i <= 10) DO BEGIN
    DECLARE j INT DEFAULT 1;
    WHILE (:j <= 10) DO BEGIN
      INSERT INTO t (a, b) VALUES (i, j);
      :j = :j + 1;
    END
    :i = :i + 1;
  END
  -- Exiting scope
END

This doesn’t work in Firebird, so we apply the fix. Note there’s a cheap, special marker that is placed by the SQL generation of anonymous blocks, but also for procedures, functions, and triggers, e.g.

CREATE FUNCTION func1()
RETURNS INTEGER
AS 
  -- Special marker here
BEGIN
  -- Entering scope
  RETURN 1;
  -- Exiting scope
END

Now, whenever the org.jooq.impl.DeclarationImpl query part generates its SQL locally, then, instead of generating something like:

DECLARE i INT DEFAULT 1;
DECLARE j INT;

We generate (locally)

:i = 1;
-- j isn't initialised here, so nothing needs to be done locally

At the same time, we push the org.jooq.impl.DeclarationImpl into some context variable that is visible to the entire scope (see “entering scope” and “exiting scope” comments).

As soon as we exit the scope, we must render all collected declarations, this time without the defaults, e.g.

DECLARE i INT;
DECLARE j INT;

… and then insert that rendered SQL right where the marker was located. All subsequent markers, if any, are then shifted by the difference of text lengths, of course.

Applications in jOOQ

This is currently used a few times within jOOQ:

  • To emulate calling Oracle PL/SQL functions with BOOLEAN parameters / return values. We patch the generated SQL producing a synthetic WITH clause with some BOOLEAN to NUMBER translation logic. Since Oracle 12c, Oracle supports PL/SQL embedded in WITH, which is quite a nifty feature!
  • The entire implicit JOIN feature is implemented this way! Markers delimit each table in the FROM clause (e.g. FROM BOOK), and if any path originating from any such marked table is encountered in the query (e.g. SELECT BOOK.AUTHOR.FIRST_NAME), then 1) instead of generating the path, a synthetic alias for the path is used to qualify the column, 2) instead of generating the FROM BOOK table, a synthetic LEFT JOIN or INNER JOIN is generated, joining all the necessary to-one relationships. An example will be shown below.
  • The above Firebird (and possibly also T-SQL, let’s see) procedural local variable scoping fixes are implemented this way.
  • A few emulations that require prepending SQL to the complete statement, such as CREATE OR REPLACE PROCEDURE x emulations that prepend DROP PROCEDURE x to CREATE PROCEDURE x work in a similar fashion. These types of emissions are “special” in that they add another statement to the statement batch. This means we’ll also have to be careful to skip any possible result sets or update counts this produces when calling the batch from JDBC.

Future applications may include:

  • More top level CTE, which are quite handy for a variety of emulations

An example for implicit joins:

SELECT
  book.author.first_name,
  book.author.last_name
FROM book -- Special marker around book

The above SQL doesn’t work in any dialect, it’s just jOOQ specific. We generate an alias for each unique path, based on the path’s hash code, so the query might look like this, instead:

SELECT
  -- The path isn't generated, but an alias for it is
  alias_xyz.first_name,
  alias_xyz.last_name
FROM (
  -- The marked "book" table is replaced by a join tree
  book 
    LEFT JOIN author AS alias_xyz 
    ON book.author_id = author.id
)

We simply replace book by (book LEFT JOIN ...) in the resulting SQL string. Thanks to our infrastructure of being able to define scopes and register context and variables for each scope, this works for arbitrary levels of nesting. We can always identify the appropriate book identifier for each path expression, even for things like this:

SELECT
  book.author.first_name,
  book.author.last_name,

  -- Different book tables here, because the nested scope 
  -- hides the "book" identifier from the outer scope
  (SELECT count(*) FROM book),
  (SELECT count(DISTINCT book.author.first_name) FROM book),

  -- Outer book, again
  (SELECT book.author.first_name)
FROM 
  book

The above being emulated like this, by patching two marked occurrences of book by the same join graph:

SELECT
  alias_xyz.first_name,
  alias_xyz.last_name,

  -- No patching done to this book
  (SELECT count(*) FROM book),

  -- The alias_xyz alias is used again, the path is the same
  (SELECT count(DISTINCT alias_xyz.first_name) 

  -- And the book table is patched again with the same left join
   FROM (
     book 
       LEFT JOIN author AS alias_xyz 
       ON book.author_id = author.id
  )),

  -- Outer book, again
  (SELECT alias_xyz.first_name)
FROM (
  book 
    LEFT JOIN author AS alias_xyz 
    ON book.author_id = author.id
)

It sounds as sophisticated as it sounds complex, but it works really really well.

Perhaps, in the future, expression tree transformation will be preferred over patching the result string. So far, with the current applications, this was the path of least resistance, and highest performance.

jOOQ 3.14 Released With SQL/XML and SQL/JSON Support

jOOQ 3.14 has been released with support for SQL/XML, SQL/JSON, Kotlin code generation, embeddable types, and domain types, synthetic constraints, better MERGE support, and more SQL transformations.

In this release, we’ve sorted our github issues according to user feedback and
finally implemented some of the most wanted features, which include better
Kotlin support, embeddable types, and domain type support.

In addition to this, we believe that our newly added XML and JSON operation
support will be a leading game changer in how ORMs interact with SQL databases
in the future.

XML and JSON

Standard SQL has long supported SQL/XML, and since recently, most RDBMS also
support standard SQL/JSON or vendor specific variants thereof. While ORDBMS
extensions have never seen the adoption they deserve, XML and JSON do. It makes
sense to occasionally denormalise data and store documents in the database
directly. However, this is not what we’re most excited about.

We’re excited about our support for all the fancy operators like:

  • JSON_TABLE to turn JSON values into SQL tables
  • JSON_ARRAY, JSON_OBJECT, JSON_VALUE to construct JSON data from values
  • JSON_ARRAYAGG, JSON_OBJECTAGG to aggregate data into nested JSON documents
  • JSON_EXISTS to query documents with JSON path

Similar functions are available for XML, like XMLTABLE, XMLELEMENT, XMLAGG, etc.
All editions of jOOQ 3.14 support standard SQL XML and JSON operators, and
emulate them where non-standard support is available (e.g. PostgreSQL and SQL
Server).

The commercial editions also support SQL Server’s very convenient FOR XML and
FOR JSON APIs, emulating that syntax using standard operators elsewhere. See:

https://blog.jooq.org/2020/05/05/using-sql-server-for-xml-and-for-json-syntax-on-other-rdbms-with-jooq/

But that’s not all of it! If you have Gson, Jackson, or JAXB on the classpath,
we can use that to map org.jooq.XML, org.jooq.JSON, org.jooq.JSONB types from
your query results to your nested data structures, automatically. See:

https://blog.jooq.org/2020/10/09/nesting-collections-with-jooq-3-14s-sql-xml-or-sql-json-support/

These approaches are extremely powerful. In many cases, you might just skip
most of your middleware layer and bind a REST or similar endpoint directly to a
jOOQ / SQL query, producing JSON for your frontend:

https://blog.jooq.org/2019/11/13/stop-mapping-stuff-in-your-middleware-use-sqls-xml-or-json-operators-instead/

Kotlin support

We’ve long supported some Scala extensions and a ScalaGenerator. Kotlin is an
additional very promising JVM language where a new jOOQ-kotlin module as well as
a KotlinGenerator will add a lot of value for your jOOQ/Kotlin integration.

The KotlinGenerator offers, among other things:

  • Data class support for generated POJOs
  • Property support for generated POJOs, interfaces and records
  • Better nullability support

The jOOQ-kotlin module offers some useful extension functions to further improve
the experience of the jOOQ/Kotlin integration.

In addition to the above, we’ve annotated the entire jOOQ API with nullability
annotations from JetBrains:

  • org.jetbrains.annotations.Nullable
  • org.jetbrains.annotations.NotNull

This will remove many of the annoying T! types in your Kotlin/jOOQ code,
turning them into T or T? types instead, giving you more confidence.

With these improvements, we’ve also critically reviewed our existing
ScalaGenerator, fixing a lot of bugs.

Embeddable types

One of the biggest new features in jOOQ 3.14 is inspired by JPA, which ships
with embeddable types. An embeddable type is an emulation of a database user-
defined type (UDT), which are supported natively only in Oracle and PostgreSQL.
The biggest gain of such types is to create more semantic, composite data types
in your database schema, and profit from the additional type safety.

Our interpretation of the feature is mostly in the source code generator, whose
output now becomes even more valuable. All jOOQ editions support the basic
infrastructure to pattern-match column sets and turn them into synthetic
embeddable types.

In addition to the above, our commercial editions offer some auto configuration
of embeddable types in cases where they really shine:

  • For primary/unique constraints and their matching foreign keys
  • For DOMAIN types (see below)
  • Handling overlapping embeddable types
  • Allowing for embeddable types to replace their underlying columns

We took the concept a step further than JPA. In jOOQ, embeddable types can act
as views on the underlying columns, without replacing them, or as a replacement
like in JPA. jOOQ respects all levels of relational modelling, including
overlapping constraints and thus allowing for two embeddable types to overlap.

For more information, please refer to:

https://www.jooq.org/doc/3.14/manual/code-generation/codegen-embeddable-types/

DOMAIN types

Speaking of types, some database dialects support standard SQL DOMAIN types,
which are a simpler form of UDTs. Instead of working with low level technical
types like VARCHAR(10), why not give your single-column types a name, and add
a few re-usable CHECK constraints to them?

That’s what a DOMAIN type is:

  • A named type
  • Aliasing a technical type, like VARCHAR(10)
  • Possibly adding a DEFAULT expression
  • Possibly adding a NOT NULL constraint
  • Possibly adding a set of CHECK constraints

All of the above is reusable across your schema and if you’re commercially
licensed, you can even have the code generator auto-generate embeddable types
for all of your domains to profit from the additional type safety in Java.

For more information, please refer to:

Synthetic constraints

Related to the above improved code generator output are synthetic objects,
such as the previously supported synthetic primary keys, and now also synthetic
unique and foreign keys.

If you invest heavily in security and re-usable components within your database,
you will make heavy use of SQL views. Unfortunately, views do not have any meta
data like foreign key constraints – despite the meta data being “obvious” to you
the database designer. With synthetic constraints, you can tell jOOQ about your
knowledge of underlying constraints.

The main benefits of meta data being available to jOOQ being:

  • You can now use implicit joins on views as well
  • You can now use the JOIN .. ON KEY syntax on views as well
  • You can use embeddable key types from before on views just like on tables

For more information, please refer to:

Better MERGE support

We’ve finally tackled support for more advanced MERGE statement clauses and now
support:

  • Multiple WHEN MATCHED AND condition THEN UPDATE clauses
  • Multiple WHEN MATCHED AND condition THEN DELETE clauses
  • UpdatableRecord.merge() and all the related goodies to simplify record merging

Transformations

With the parser and our translator tool (https://www.jooq.org/translate), we’ll
invest more and more in new use-cases for putting jOOQ to use other than as an
embeddable DSL in Java.

Our translation capabilities have already been strong, and with a new set of SQL
transformations, they become even stronger helping customers migrate off RDBMS A
towards RDBMS B (and back, if they made a mistake).

While our website translator is free of charge, the jOOQ library can always be
used programmatically, or as a command line utility:

https://www.jooq.org/doc/3.14/manual/sql-building/sql-parser/sql-parser-cli/

To make this use-case even more useful, new transformation features have been
added, including:

  • ROWNUM to LIMIT or to ROW_NUMBER()
  • Table lists to ANSI JOIN (including Oracle (+) support)
  • Unnecessary arithmetic expressions

This is an exciting area that we’ll be exploring for our commercially licensed
customers in the future, while even the jOOQ Open Source Edition profits from
these improvements. For example, the infrastructure created for transformations
finally enabled emulating PostgreSQL’s DISTINCT ON clause, elsewhere.

For more information, please refer to:

Better manual

We’ve taken a step back and reviewed a few important parts of our documentation.
We’re now offering:

  • Sticky tabs for code generator techniques (XML, Programmatic, Gradle): If
    you’re using Gradle with jOOQ’s code generator, you don’t want to look at the
    XML configuration again, in the manual. These tabs finally hide unneeded
    information.
  • Subsections for each function: We’ve started documenting each SQL function
    individually, explaining how it works in SQL, and providing some examples and
    example results.
  • Generate example vendor specific rendering of SQL: We’re using jOOQ when
    generating the manual, translating some jOOQ API usage to all of our supported
    dialects, and displaying how the function renders in each dialect.
  • Show imports button and display context sensitive imports: All the examples in
    the manual can be overwhelming. We’re assuming a lot of (static) imports,
    which we’re finally documenting in an expandable “show imports” section of
    each code fragment.
  • We’ve rewritten some sections to be much more complete with examples, such as
    the data import section.
  • A new API diff page displays what has changed between each minor release in
    terms of list of Javadoc links: https://www.jooq.org/api-diff

Using jOOQ 3.14 Synthetic Foreign Keys to Write Implicit Joins on Views

jOOQ has supported one of JPQL’s most cool features for a while now: implicit joins. Using jOOQ, you can navigate your to-one relationships in a type safe way, generating LEFT JOIN operations implicitly without the effort of having to keep thinking about join predicates, and the correct join order. Consider this Sakila database query here, if SQL supported implicit joins natively:

SELECT
cu.first_name,
cu.last_name
FROM customer AS cu
WHERE cu.address.city.country.country = 'Switzerland'

It translates to this query in native SQL:

SELECT
cu.first_name,
cu.last_name
FROM customer AS cu
JOIN address AS ad ON cu.address_id = ad.address_id
JOIN city AS ci ON a.city_id = ci.city_id
JOIN country AS co ON ci.country_id = co.country_id
WHERE co.country = 'Switzerland'

Note: Inner joins are supported starting from jOOQ 3.14, depending on whether the foreign key is mandatory / not null. The default behaviour is to produce LEFT JOIN which are the correct way to implicitly join optional foreign keys.

Implicit joins aren’t a silver bullet. Not every JOIN graph can be completely transformed into implicit join usage, and not every implicit join usage is more readable than native SQL JOIN graphs. But to have this option is great. Especially, when your keys are composite keys.

Classic joins on views

In classic relational design, surrogate keys are often avoided, and I think we should still avoid them in many cases. Even if you don’t agree, you may occasionally work on a schema where there are few to no surrogate keys. One such example is the standard SQL INFORMATION_SCHEMA which is implemented, for example, in H2, HSQLDB, MariaDB, MySQL, PostgreSQL, or SQL Server.

For example, when querying HSQLDB’s DOMAIN_CONSTRAINTS view to reverse engineer DOMAIN types. The jOOQ query for that used to be:

Domains d = DOMAINS.as("d");
DomainConstraints dc = DOMAIN_CONSTRAINTS.as("dc");
CheckConstraints cc = CHECK_CONSTRAINTS.as("cc");

for (Record record : create()
.select(
d.DOMAIN_SCHEMA,
d.DOMAIN_NAME,
d.DATA_TYPE,
d.CHARACTER_MAXIMUM_LENGTH,
d.NUMERIC_PRECISION,
d.NUMERIC_SCALE,
d.DOMAIN_DEFAULT,
cc.CHECK_CLAUSE)
.from(d)
.join(dc)
.on(row(d.DOMAIN_CATALOG, d.DOMAIN_SCHEMA, d.DOMAIN_NAME)
.eq(dc.DOMAIN_CATALOG, dc.DOMAIN_SCHEMA, dc.DOMAIN_NAME))
.join(cc)
.on(row(dc.CONSTRAINT_CATALOG,
dc.CONSTRAINT_SCHEMA,
dc.CONSTRAINT_NAME)
.eq(cc.CONSTRAINT_CATALOG,
cc.CONSTRAINT_SCHEMA,
cc.CONSTRAINT_NAME))
.where(d.DOMAIN_SCHEMA.in(getInputSchemata()))
.orderBy(d.DOMAIN_SCHEMA, d.DOMAIN_NAME)
) { ... }

So, the query joined the many-to-many relationship between DOMAINS - DOMAIN_CONSTRAINTS - CHECK_CONSTRAINTS to get all the information required for generating domain types.

These views are not updatable, nor do they have any constraint information associated with them, but what if we were able to define synthetic constraints? jOOQ has supported synthetic primary keys to help make views updatable.

Synthetic foreign keys

Starting with jOOQ 3.14, we’ve reworked the way synthetic keys work, and the commercial editions will support synthetic foreign keys as well.

You can specify a configuration like this:

<configuration>
<generator>
<database>
<syntheticObjects>
<primaryKeys>
<primaryKey>
<tables>
CHECK_CONSTRAINTS|CONSTRAINTS|TABLE_CONSTRAINTS
</tables>
<fields>
<field>CONSTRAINT_(CATALOG|SCHEMA|NAME)</field>
</fields>
</primaryKey>
<primaryKey>
<tables>DOMAINS</tables>
<fields>
<field>DOMAIN_(CATALOG|SCHEMA|NAME)</field>
</fields>
</primaryKey>
</primaryKeys>
<foreignKeys>
<foreignKey>
<tables>DOMAIN_CONSTRAINTS</tables>
<fields>
<field>CONSTRAINT_(CATALOG|SCHEMA|NAME)</field>
</fields>
<referencedTable>CHECK_CONSTRAINTS</referencedTable>
</foreignKey>
<foreignKey>
<tables>DOMAIN_CONSTRAINTS</tables>
<fields>
<field>DOMAIN_(CATALOG|SCHEMA|NAME)</field>
</fields>
<referencedTable>DOMAINS</referencedTable>
</foreignKey>
</foreignKeys>
</syntheticObjects>
</database>
</generator>
</configuration>

And already jOOQ’s code generator will think that these views were tables that actually had constraints like the below:

ALTER TABLE CHECK_CONSTRAINTS
ADD PRIMARY KEY (
CONSTRAINT_CATALOG, CONSTRAINT_SCHEMA, CONSTRAINT_NAME
);

ALTER TABLE DOMAINS
ADD PRIMARY KEY (
DOMAIN_CATALOG, DOMAIN_SCHEMA, DOMAIN_NAME
);

ALTER TABLE DOMAIN_CONSTRAINTS
ADD FOREIGN KEY (
CONSTRAINT_CATALOG, CONSTRAINT_SCHEMA, CONSTRAINT_NAME
)
REFERENCES CHECK_CONSTRAINTS;

ALTER TABLE DOMAIN_CONSTRAINTS
ADD FOREIGN KEY (
DOMAIN_CATALOG, DOMAIN_SCHEMA, DOMAIN_NAME
)
REFERENCES DOMAINS;

More sophisticated configuration is possible, e.g. to assign names to constraints, to have composite constraints using field ordering differing from the ordering in the table, or foreign keys referencing unique keys rather than primary keys. For a full description of what’s available, please refer to the manual.

With the above synthetic meta data available to the code generator, all the numerous goodies are now available on views as well, including:

Let’s look at

Implicit joins

Implicit joins are now also possible on these views, meaning:

  1. You’ll never have to remember including all the key columns in join predicates anymore (bye bye accidental cartesian products)
  2. Your code will still be correct in case your composite key changes to something else!

So, this is more than just merely a convenience thing, it’s also a correctness thing. Our query from before can now be written like this, in a much more concise way:

DomainConstraints dc = DOMAIN_CONSTRAINTS.as("dc");

for (Record record : create()
.select(
dc.domains().DOMAIN_SCHEMA,
dc.domains().DOMAIN_NAME,
dc.domains().DATA_TYPE,
dc.domains().CHARACTER_MAXIMUM_LENGTH,
dc.domains().NUMERIC_PRECISION,
dc.domains().NUMERIC_SCALE,
dc.domains().DOMAIN_DEFAULT,
dc.checkConstraints().CHECK_CLAUSE)
.from(dc)
.where(dc.domains().DOMAIN_SCHEMA.in(getInputSchemata()))
.orderBy(dc.domains().DOMAIN_SCHEMA, dc.domains().DOMAIN_NAME)
) { ... }

Notice how we’re using the relationship table as the only table to put in the FROM clause. This way, we can navigate in both directions of the to-one relationships from DOMAIN_CONSTRAINTS -> DOMAINS and DOMAIN_CONSTRAINTS -> CHECK_CONSTRAINTS. The resulting SQL query is equivalent to the previous one, but all the nastiness of joining by 3-column-composite-keys is gone. I personally find this much more readable.

Future work

So far, only to-one relationships can be navigated this way. JPQL also offers navigating to-many relationships with a few restrictions. This is a slippery slope. When offering to-many relationships, some use-cases are obvious, but the semantics of others is less so. For example, it is not a good idea to let the SELECT clause produce more (or less) rows depending on the presence of a projected column. This is why jOOQ so far produced only LEFT JOIN for implicit joins, because that would guarantee that an implicitly joined column does not reduce the number of rows because of an INNER JOIN not producing any matches.

Nevertheless, there is a lot that we can still add in #7536, including:

  • Implicit to-many joins in the FROM clause, where they don’t cause any trouble
  • Implicit joins in DML
  • Parser support to offer this functionality also in https://www.jooq.org/translate and to everyone working with jOOQ SQL through jOOQ’s ParsingConnection

And much more!

Nesting Collections With jOOQ 3.14’s SQL/XML or SQL/JSON support

One of the main features of ORMs is M as in Mapping. Libraries like jOOQ help auto-mapping flat or nested database records onto Java classes that have the same structure as the SQL result set.

The following has always been possible in jOOQ, assuming PostgreSQL’s INFORMATION_SCHEMA (using the generated code from the jOOQ-meta module):

class Column {
    String tableSchema;
    String tableName;
    String columnName;
}

for (Column c :
    ctx.select(
            COLUMNS.TABLE_SCHEMA, 
            COLUMNS.TABLE_NAME, 
            COLUMNS.COLUMN_NAME)
       .from(COLUMNS)
       .where(COLUMNS.TABLE_NAME.eq("t_author"))
       .orderBy(COLUMNS.ORDINAL_POSITION)
       .fetchInto(Column.class))
    System.out.println(
        c.tableSchema + "." + c.tableName + "." + c.columnName
    );

The above resulting in something like:

public.t_author.id
public.t_author.first_name
public.t_author.last_name
public.t_author.date_of_birth
public.t_author.year_of_birth
public.t_author.address

The mapping is straight forward, as explained in jOOQ’s DefaultRecordMapper.

Nested mappings

A lesser known feature that we’ve offered for a while was to use a dot notation to emulate nesting records into nested Java classes. Assuming you want to use a re-usable data type description in your columns and elsewhere:

class Type {
    String name;
    int precision;
    int scale;
    int length;
}

class Column {
    String tableSchema;
    String tableName;
    String columnName;
    Type type;
}

You can now write this query where you’ll alias some columns using the dot notation to type.name, for example (several nesting levels are possible):

for (Column c :
    ctx.select(
            COLUMNS.TABLE_SCHEMA,
            COLUMNS.TABLE_NAME,
            COLUMNS.COLUMN_NAME,
            COLUMNS.DATA_TYPE.as("type.name"),
            COLUMNS.NUMERIC_PRECISION.as("type.precision"),
            COLUMNS.NUMERIC_SCALE.as("type.scale"),
            COLUMNS.CHARACTER_MAXIMUM_LENGTH.as("type.length")
       )
       .from(COLUMNS)
       .where(COLUMNS.TABLE_NAME.eq("t_author"))
       .orderBy(COLUMNS.ORDINAL_POSITION)
       .fetchInto(Column.class))

    System.out.println(String.format(
        "%1$-30s: %2$s",
        c.tableSchema + "." + c.tableName + "." + c.columnName,
        c.type.name + (c.type.precision != 0
               ? "(" + c.type.precision + ", " + c.type.scale + ")"
               :       c.type.length != 0
               ? "(" + c.type.length + ")"
               : "")
    ));

The above will print:

public.t_author.id            : integer(32, 0)
public.t_author.first_name    : character varying(50)
public.t_author.last_name     : character varying(50)
public.t_author.date_of_birth : date
public.t_author.year_of_birth : integer(32, 0)
public.t_author.address       : USER-DEFINED

Using XML or JSON

Using XML or JSON, starting from jOOQ 3.14, you can also nest collections in your result set mapping very easily. First, let’s look again at how to use a JSON query using jOOQ, e.g. to find all columns per table:

for (Record1<JSON> record :
    ctx.select(
            jsonObject(
                key("tableSchema").value(COLUMNS.TABLE_SCHEMA),
                key("tableName").value(COLUMNS.TABLE_NAME),
                key("columns").value(jsonArrayAgg(
                    jsonObject(
                        key("columnName").value(COLUMNS.COLUMN_NAME),
                        key("type").value(jsonObject(
                            "name", COLUMNS.DATA_TYPE)
                        )
                    )
                ).orderBy(COLUMNS.ORDINAL_POSITION))
            )
       )
       .from(COLUMNS)
       .where(COLUMNS.TABLE_NAME.in("t_author", "t_book"))
       .groupBy(COLUMNS.TABLE_SCHEMA, COLUMNS.TABLE_NAME)
       .orderBy(COLUMNS.TABLE_SCHEMA, COLUMNS.TABLE_NAME)
       .fetch())
    System.out.println(record.value1());

The following JSON documents are returned:

{
  "tableSchema": "public", 
  "tableName": "t_author", 
  "columns": [{
    "columnName": "id", 
    "type": {"name": "integer"}
  }, {
    "columnName": "first_name", 
    "type": {"name": "character varying"}
  }, {...}]
}

{
  "tableSchema": "public", 
  "tableName": "t_book", 
  "columns": [{...}, ...]
}

That’s already awesome, isn’t it? We’ve blogged about this previously here and here. Starting with jOOQ 3.14, you can remove all the other middleware and mapping and what not, and produce your XML or JSON documents directly from your database using standard SQL/XML or SQL/JSON API!

But that’s not all!

Maybe, you don’t actually need the JSON document, you just want to use JSON to allow for nesting data structures, mapping them back to Java.

What about these nested Java classes:

public static class Type {
    public String name;
}

public static class Column {
    public String columnName;
    public Type type;
}

public static class Table {
    public String tableSchema;
    public String tableName;

    public List<Column> columns;
}

Assuming you have gson or Jackson or JAXB on your classpath (or you configure them directly), you can write the exact same query as before, and use jOOQ’s DefaultRecordMapper using the fetchInto(Table.class) call:

for (Table t :
    ctx.select(
            jsonObject(
                key("tableSchema").value(COLUMNS.TABLE_SCHEMA),
                key("tableName").value(COLUMNS.TABLE_NAME),
                key("columns").value(jsonArrayAgg(
                    jsonObject(
                        key("columnName").value(COLUMNS.COLUMN_NAME),
                        key("type").value(jsonObject(
                            "name", COLUMNS.DATA_TYPE)
                        )
                    )
                ).orderBy(COLUMNS.ORDINAL_POSITION))
            )
       )
       .from(COLUMNS)
       .where(COLUMNS.TABLE_NAME.in("t_author", "t_book"))
       .groupBy(COLUMNS.TABLE_SCHEMA, COLUMNS.TABLE_NAME)
       .orderBy(COLUMNS.TABLE_SCHEMA, COLUMNS.TABLE_NAME)
       .fetchInto(Table.class))
    System.out.println(t.tableName + ":\n" + t.columns
       .stream()
       .map(c -> c.columnName + " (" + c.type.name + ")")
       .collect(joining("\n  ")));

The output being:

t_author:
  id (integer)
  first_name (character varying)
  last_name (character varying)
  date_of_birth (date)
  year_of_birth (integer)
  address (USER-DEFINED)
t_book:
  id (integer)
  author_id (integer)
  co_author_id (integer)
  details_id (integer)
  title (character varying)
  published_in (integer)
  language_id (integer)
  content_text (text)
  content_pdf (bytea)
  status (USER-DEFINED)
  rec_version (integer)
  rec_timestamp (timestamp without time zone)

No join magic. No cartesian products. No data deduplication. Just SQL-native nested collections, using an intuitive, declarative approach to creating the document data structure, combined with the usual awesomeness of SQL.

Using this without the jOOQ DSL

Of course, this also works without the jOOQ API, e.g. using our parser. Check out our translator tool. Plug in this native SQL beauty:

SELECT
  json_object(
    KEY 'tableSchema' VALUE columns.table_schema,
    KEY 'tableName' VALUE columns.table_name,
    KEY 'columns' VALUE json_arrayagg(
      json_object(
        KEY 'columnName' VALUE columns.column_name,
        KEY 'type' VALUE json_object(
          KEY 'name' VALUE columns.data_type
        )
      )
    )
  )
FROM columns
WHERE columns.table_name IN ('t_author', 't_book')
GROUP BY columns.table_schema, columns.table_name
ORDER BY columns.table_schema, columns.table_name

And, because the devil of SQL agnosticity and translation is in the detail, take out the vendor-specific version, e.g. for PostgreSQL:

SELECT json_build_object(
  'tableSchema', columns.table_schema,
  'tableName', columns.table_name,
  'columns', json_agg(json_build_object(
    'columnName', columns.column_name,
    'type', json_build_object('name', columns.data_type)
  ))
)
FROM columns
WHERE columns.table_name IN (
  't_author', 't_book'
)
GROUP BY
  columns.table_schema,
  columns.table_name
ORDER BY
  columns.table_schema,
  columns.table_name

You might need to run this, before:

SET search_path = 'information_schema'

Conclusion

We’ve waited way too long with this game changing feature. I truly think this approach will change how we perceive ORMs in the future. The database first approach, where we can use SQL and only SQL to map SQL data onto any hierarchical data structure is very commpelling.

On the jOOQ side, we’re far from done yet. What if we can auto-generate some of the JSON document declaration from other types of meta data for you? What if you can do that, yourself? E.g. to map a GraphQL specification to jOOQ API based JSON queries? On all the SQL dialects that support these features!

The future of mapping nested data structures from SQL to any client, XML, JSON, objects is bright. jOOQ 3.14 is around the corner and will be released within the next 2 weeks. You can already build it from github: https://github.com/jOOQ/jOOQ, or if you have a license, download a nightly build from here: https://www.jooq.org/download/versions

Looking forward to your feedback.

What’s a “String” in the jOOQ API?

One of jOOQ’s biggest strength is the fact that it is a type safe SQL API. “Type safe”, in this context, means that every object that you put in a jOOQ query has a well defined type, such as:

These can be used in jOOQ in a type safe way as such:

ctx.select(T.A)      // A Field
   .from(T)          // A Table
   .where(T.B.eq(1)) // A Condition
   .fetch();

No Strings here, no string concatenation.

However, there are a few cases where you want to bypass type safety, for whatever reason, including to extend jOOQ e.g. with plain SQL templating. In those cases, you would pass a “String” object to the jOOQ API. But not every such String object is the same. What different types of strings are there? There are 4 main types of String in the jOOQ API:

1. The bind value

The most obvious type of string is the bind value or literal. You can create those explicitly using:

// As always, this static import is implied
import static org.jooq.impl.DSL.*;

Field<String> bind = val("abc");
Field<String> literal = inline("xyz");

By default, the first value will produce a bind parameter marker "?" in the generated SQL, whereas the second one will produce an escaped string literal 'xyz'. You’re probably already doing this implicitly. Whenever you pass a String value where the jOOQ API expects a T type, you’re implicitly wrapping your String value using DSL.val(T):

ctx.select(T.A)
   .from(T)
   .where(T.C.eq("xyz")) // Implicit bind value
   .fetch();

This is still type safe usage of a String value, as it is really wrapped as a Field<String>

2. The plain SQL template

Whenever jOOQ is missing some vendor specific functionality, the backdoor is to use plain SQL templating. You can create plain SQL templates explicitly for the most popular types as shown above, like this:

Field<Integer> field = field("(1 + 2)", SQLDataType.INTEGER);
Table<?> table       = table("generate_series(1, 10)");
Condition condition  = condition("some_function() = 1");

These expressions can now be embedded in your queries, like any other:

ctx.select(field)
   .from(table)
   .where(condition)
   .fetch();

Alternatively, there also exists convenience overloads on some query methods to make this simpler:

ctx.select(field("(1 + 2)", SQLDataType.INTEGER)) // Not on SELECT
   .from("generate_series(1, 10)")
   .where("some_function() = 1")
   .fetch();

Notice that the select() method doesn’t have such convenience API yet, as of jOOQ 3.13

Important disclaimer: Using these APIs, you’re going to expose yourself to the usual SQL injection risk that is present also with JDBC or JPQL usage, when composing SQL from strings. NEVER concatenate plain SQL templates, nor use user input in these strings. Use the templating language instead, and turn every user input into bind variables. Examples:

ctx.select(...)
   .from(...)
   .where("some_function() = ?", 1) // Bind variable
   .fetch();

ctx.select(...)
   .from(...)
   .where("some_function() = {0}", val(1)) // Templating
   .fetch();

If you encounter String types on most query API in jOOQ, it’s for plain SQL templating. All of this API is annotated with @org.jooq.PlainSQL for additional documentation purposes, and to be preprocessed with a static checker that can be used to disallow such API usage by default, for added security.

3. Names (identifiers)

But on some query API in jOOQ, Strings are not convenience for plain SQL templating, but for names and identifiers. Namely, all DDL statements use strings this way in their API.

You can create qualified or unqualified identifiers explicitly as follows:

// Unqualified table identifier
Name table = name("t");

// Qualified column identifier
Name field = name("t", "col");

And then, use those identifiers in your DDL statements, e.g. to create the table:

ctx.createTable(table)
   .column(field, SQLDataType.INTEGER)
   .execute();

Depending on the context, qualification is necessary or not. In this case, field qualification would not have been necessary.

For convenience, you can also just use String types on the createTable(String) API:

ctx.createTable("t")
   .column("col", SQLDataType.INTEGER)
   .execute();

These strings will simply be wrapped in DSL.name(String) as illustrated before.

Beware: In jOOQ, all identifiers are quoted by default (RenderQuotedNames.EXPLICIT_DEFAULT_QUOTED). This has two benefits:

  • Special characters, and more importantly, keyword conflicts are handled correctly out of the box
  • Quoting prevents SQL injection
  • Case sensitivity is handled correctly in those dialects that support it for quoted identifiers

The price to pay for this convenience is that quoted identifiers might turn case sensitive, when this is undesired. To work around this, you can turn off quoting using your settings, e.g. by setting RenderQuotedNames.EXPLICIT_DEFAULT_UNQUOTED. But beware, this will put you again at SQL injection risk if you don’t sanitise your identifier names first!

4. Keywords

Keywords are strings too, in jOOQ. In rare cases, you may want to wrap your keyword-as-string representation in a org.jooq.Keyword type. The main benefit (as of jOOQ 3.13) is a consistent keyword style. There is no convenience API for this, as it is very rare for client code to use this feature. There’s only DSL.keyword(String):

Keyword current = keyword("current");
Keyword current = keyword("time");

You can now use keywords in plain SQL templates:

Field<Time> currentTime = field(
  "{0} {1}", 
  SQLDataType.TIME, 
  current, time
);

Create Empty Optional SQL Clauses with jOOQ

When using jOOQ to create dynamic SQL statements (one of jOOQ’s core value propositions), it is often necessary to add query elements conditionally, with a default “No-op” behaviour. For first time users, this default “no-op” behaviour is not always obvious as the jOOQ API is vast, and as with any vast API, there are many different options to do similar things.

How not to do it

A common pitfall is to be tempted to work with the many XYZStep types. What types are these? They are usually invisible to the developer as developers use jOOQ’s DSL API in a fluent fashion, just like the JDK Stream API. For example:

DSLContext ctx = ...;

Result<?> result =
ctx.select(T.A, T.B)
   .from(T)
   .where(T.C.eq(1))
   .and(T.D.eq(2))
   .fetch();

Let’s decompose the above query to see what happens in the API. We could assign every method result to a local variable:

SelectFromStep<?> s1 = ctx.select(T.A, T.B);
SelectWhereStep<?> s2 = s1.from(T);
SelectConditionStep<?> s3 = s2.where(T.C.eq(1));
SelectConditionStep<?> s4 = s3.and(T.D.eq(2))

Result<?> result = s4.fetch();

Our previous fluent API design blog post explains this API design technique.

This is not what people usually do with “static SQL” statements, but they might be tempted to do this if they wanted to add the last predicate (T.D = 2) conditionally, e.g:

DSLContext ctx = ...;

SelectConditionStep<?> c =
ctx.select(T.A, T.B)
   .from(T)
   .where(T.C.eq(1));

if (something)
    c = c.and(T.D.eq(2));

Result<?> result = c.fetch();

This is perfectly valid API usage, but we do not recommend it because it is very messy and leads to difficult to maintain client code. Also, it is absolutely unnecessary, because there is a better way:

Composing queries from its parts

The problem with the above approach is that it is trying to use an imperative approach of adding things to a query step by step. This is how many developers tend to structure their code, but with SQL (and by consequence, jOOQ) that can turn out to be difficult to get right. A functional approach tends to work better.

Notice that not only the entire DSL structure could be assigned to local variables, but also the individual SELECT clause arguments. For example:

DSLContext ctx = ...;

List<SelectField<?>> select = Arrays.asList(T.A, T.B);
Table<?> from = T;
Condition where = T.C.eq(1).and(T.D.eq(2));

Result<?> result =
ctx.select(select)
   .from(from)
   .where(where)
   .fetch();

In fact, every jOOQ query is a dynamic SQL query. Many queries just happen to look like static queries, due to jOOQ’s API design.

Again, we wouldn’t be assigning every SELECT clause argument to a local variable, only the truly dynamic ones. For example:

DSLContext ctx = ...;

Condition where = T.C.eq(1);

if (something)
    where = where.and(T.D.eq(2));

Result<?> result =
ctx.select(T.A, T.B)
   .from(T)
   .where(where)
   .fetch();

This already looks quite decent.

Avoid breaking readability

A lot of people aren’t happy with this approach either, because it breaks a query’s readability by making its components non-local. The predicates in the query are declared up front, away from the query itself. This isn’t how many people like to reason about SQL.

And you don’t have to! It is totally possible to embed the condition directly in the WHERE clause like this:

DSLContext ctx = ...;

Result<?> result =
ctx.select(T.A, T.B)
   .from(T)

   // We always need this predicate
   .where(T.C.eq(1))

   // This is only added conditionally
   .and(something
      ? T.D.eq(2)
      : DSL.noCondition())
   .fetch();

The magic is in the above usage of DSL.noCondition, which is a pseudo predicate that does not generate any content. It is a placeholder where an org.jooq.Condition type is required without actually materialising one.

There is also:

… but that requires having to think about these identities and the reductions all the time. Also, if you append many of these trueCondition() or falseCondition() to a query, the resulting SQL tends to be quite ugly, for example for people having to analyse performance in production. noCondition() just never generates any content at all.

Note that noCondition() does not act as an identity! If your noCondition() is the only predicate left in a WHERE clause, there will not be any WHERE clause, regardless if you work with AND predicates or OR predicates.

No-op expressions in jOOQ

When using dynamic SQL like this, and adding things conditionally to queries, such “no-op expressions” become mandatory. In the previous example, we’ve seen how to add a “no-op predicate” to a WHERE clause (the same approach would obviously work with HAVING and all other clauses that work with boolean expressions).

The three most important jOOQ query types are:

Users may want to add all of these conditionally to queries.

org.jooq.Condition

We’ve already seen how to do this with org.jooq.Condition.

org.jooq.Field

What about dynamic column expressions in the projection (the SELECT clause)? Assuming you want to project columns only in certain cases. In our example, the T.B column is something we don’t always need. That’s easy! The same approach can be used (assuming T.B is a string column):

DSLContext ctx = ...;

Result<Record2<String, String>> result =
ctx.select(T.A, something ? T.B : DSL.inline("").as(T.B))
   .from(T)
   .where(T.C.eq(1))
   .and(T.D.eq(2))
   .fetch();

Using inlined parameters via DSL.inline(), you can easily produce a no-op value in your projection if you don’t want to modify the projection’s row type. The advantage is that you can now use this subquery in a union that expects two columns:

DSLContext ctx = ...;

Result<Record2<String, String>> result =

// First union subquery has a conditionally projected column
ctx.select(T.A, something ? T.B : DSL.inline("").as(T.B))
   .from(T)
   .where(T.C.eq(1))
   .and(T.D.eq(2))

   .union(

// Second union subquery has no such conditions
    select(U.A, U.B)
   .from(U))
   .fetch();

You can take this one step further, and make an entire union subquery conditional this way!

DSLContext ctx = ...;

Result<Record2<String, String>> result =

// First union subquery has a conditionally projected column
ctx.select(T.A, T.B)
   .from(T)
   .union(
      something
        ? select(U.A, U.B).from(U)
        : select(inline(""), inline("")).where(falseCondition())
   )
   .fetch();

This is a bit more syntactic work, but it’s nice to see how easy it is to add something conditionally to a jOOQ query without making the query completely unreadable. Everything is local to where it is being used. No local variables are needed, no imperative control flow is invoked.

And because everything is now an expression (and not a statement / no control flow), we can factor out parts of this query into auxiliary methods, that can be made reusable.

org.jooq.Table

Conditional table expressions usually appear when doing conditional joins. This is usually not done in isolation, but together with other conditional elements in a query. E.g. if some columns are projected conditionally, those columns may require an additional join, as they originate from another table than the tables that are used unconditionally. For example:

DSLContext ctx = ...;

Result<?> result =
ctx.select(
      T.A, 
      T.B, 
      something ? U.X : inline(""))
   .from(
      something
      ? T.join(U).on(T.Y.eq(U.Y))
      : T)
   .where(T.C.eq(1))
   .and(T.D.eq(2))
   .fetch();

There isn’t a more simple way to produce the conditional JOIN expression, because JOIN and ON need to be provided independently. For simple cases as shown above, this is perfectly fine. In more complex cases, some auxiliary methods may be needed, or views.

Conclusion

There are two important messages here in this post:

  1. The XYZStep types are auxiliary types only. They are there to make your dynamically constructed SQL statement look like static SQL. But you should never feel the need to assign them to local variables, or return them from methods. While it is not wrong to do so, there is almost always a better way to write dynamic SQL.
  2. In jOOQ, every query is a dynamic query. This is the benefit of composing SQL queries using an expression tree like the one that is used in jOOQ’s internals. You may not see the expression tree because the jOOQ DSL API mimicks static SQL statement syntax. But behind the scenes, you’re effectively building this expression tree. Every part of the expression tree can be produced dynamically, from local variables, methods, or expressions, such as conditional expressions. I’m looking forward to using the new JEP 361 switch expressions in dynamic SQL. Just like a SQL CASE expression, some SQL statement parts can be constructed dynamically in the client, prior to passing them to the server.

Once these two things are internalised, you can write very fancy dynamic SQL, including using FP approaches to constructing data structures, such as a jOOQ query object.

Using Java 13+ Text Blocks for Plain SQL with jOOQ

Most jOOQ users use the jOOQ DSL API, which provides compile time type safety and an easy way to write dynamic SQL.

But occasionally, this DSL get in the way, because it might be

  • Overkill for some very simple quick and dirty SQL queries
  • Too limited when running very advanced vendor specific SQL, such as Oracle’s MODEL or MATCH_RECOGNIZE clauses

In such cases, you can still benefit from jOOQ’s many secondary features, including for example its nice integration with the Stream API, export functionality, and much more. Think of jOOQ as an improved JDBC!

Starting with Java 13, when enabling preview features, you can now make use of text blocks, or “multi line strings”, which are very useful for embedding static SQL strings in Java code (and XML, JSON, regular expressions, etc). There are two main use cases for using text blocks in jOOQ:

Plain SQL

The main use case is to use plain SQL and also plain SQL templating. For example, to run a quick and dirty jOOQ query like this on H2 or PostgreSQL, for example:

System.out.println(ctx.fetch("""
        SELECT table_schema, count(*)
        FROM information_schema.tables
        GROUP BY table_schema
        ORDER BY table_schema
        """));

(unfortunately, this blog’s syntax highlighter is not quite there yet…)

The output is jOOQ’s nicely formatted text result:

+------------------+--------+
|TABLE_SCHEMA      |COUNT(*)|
+------------------+--------+
|INFORMATION_SCHEMA|      33|
|MCVE              |       2|
|PUBLIC            |       1|
+------------------+--------+

The above is a simple plain SQL query, but we can also make use of plain SQL templating with text blocks! What if we wanted to have a dynamic GROUP BY clause?

Always assuming this static import:

import static org.jooq.impl.DSL.*;

We can now write:

Stream.of(
        field("table_schema"),
        list(field("table_schema"), field("table_type")))
    .forEach(q -> {
        System.out.println(ctx.fetch("""
          SELECT {0}, count(*), row_number() OVER (ORDER BY {0}) AS rn
          FROM information_schema.tables
          GROUP BY {0}
          ORDER BY {0}
          """, q));
    });

We’ve written a dynamic SQL query with jOOQ, but without using much type safety (we could still use some type safe, generated schema object references if we wanted to). jOOQ’s plain SQL templating capabilities are a much underestimated feature set. People have been using MyBatis for many years precisely for these purposes, or maybe built home grown, velocity template based frameworks to produce dynamic SQL using native SQL strings.

You can also use jOOQ for this, and profit from all of the jOOQ API, including, again the formatted output from the above:

+------------------+--------+----+
|TABLE_SCHEMA      |COUNT(*)|  RN|
+------------------+--------+----+
|INFORMATION_SCHEMA|      33|   1|
|MCVE              |       2|   2|
|PUBLIC            |       1|   3|
+------------------+--------+----+

+------------------+------------+--------+----+
|TABLE_SCHEMA      |TABLE_TYPE  |COUNT(*)|  RN|
+------------------+------------+--------+----+
|INFORMATION_SCHEMA|SYSTEM TABLE|      33|   1|
|MCVE              |TABLE       |       1|   2|
|MCVE              |VIEW        |       1|   3|
|PUBLIC            |TABLE       |       1|   4|
+------------------+------------+--------+----+

All of this was available before text blocks, but with text blocks, it makes even more sense to use these features. There is not just one way to use jOOQ, but this approach is really very underestimated!

The parser

Another very important jOOQ feature that works with string based SQL is the jOOQ parser. There are a variety of use cases for using jOOQ’s parser, as we’re adding more and more features to it. One use-case is simple formatting of SQL. Why would you want to do this? One example is to standardise your SQL strings for better execution plan cache usage. As you may know, some database vendors (e.g. Oracle) treat these two SQL strings as different:

SELECT 1 FROM dual
SELECT  1 FROM dual

Observe the irrelevant whitespace difference. Oracle will parse a new SQL string, produce a new SQL_ID, and yet come up with the same execution plan. You don’t want this to happen too often, because it puts a lot of contention on the plan cache and its locks and latches, and creates extra work.

This can happen, nonetheless, with string based dynamic SQL. Using jOOQ’s parser, you can normalise all sorts of SQL strings (including removing comments)

System.out.println(
    ctx.parser()
       .parseResultQuery("""
            SELECT table_schema, count(*)
            FROM information_schema.tables
            GROUP BY table_schema
            -- Order by column index!
            ORDER BY 1
            """)
       .fetch()
);

The SQL string that gets sent to the JDBC driver is this:

select table_schema, count(*) from information_schema.tables group by table_schema order by 1

Alternatively, you could specify some formatting in jOOQ:

DSLContext ctx = DSL.using(connection, 
    new Settings().withRenderFormatted(true));

And now, the SQL string that gets sent to the JDBC driver is this:

select 
  table_schema, 
  count(*)
from information_schema.tables
group by table_schema
order by 1

You can tweak the various supported formatting rules, and use this for logging your hand written SQL strings in a human readable way, for example, rather than executing the SQL through jOOQ. The options are endless. You could also translate parser output to some other SQL dialect. Assuming you run your queries against SQL Server, but want to translate them to H2. No problem! The following SQL Server specific query can be run easily on H2 as well:

System.out.println(
    ctx.parser()
       .parseResultQuery("""
            SELECT TOP 1 table_schema, count(*)
            FROM information_schema.tables
            GROUP BY table_schema
            ORDER BY count(*) DESC
            """)
       .fetch()
);

The translated SQL that gets executed on H2 is this:

select 
  table_schema, 
  count(*)
from information_schema.tables
group by table_schema
order by count(*) desc
limit 1

Or, because I like showing off:

System.out.println(
    ctx.parser()
       .parseResultQuery("""
            SELECT TOP 1 WITH TIES table_schema, count(*)
            FROM information_schema.tables
            GROUP BY table_schema
            ORDER BY count(*) DESC
            """)
       .fetch()
);

Producing this in H2:

select 
  TABLE_SCHEMA, 
  count(*)
from INFORMATION_SCHEMA.TABLES
group by TABLE_SCHEMA
order by 2 desc
fetch next 1 rows with ties

Or this, in PostgreSQL:

select 
  "v0" as table_schema, 
  "v1" as "count"
from (
  select 
    table_schema as "v0", 
    count(*) as "v1", 
    rank() over (order by 2 desc) as "rn"
  from information_schema.tables
  group by table_schema
) "x"
where "rn" > 0
and "rn" <= (0 + 1)
order by "rn"

I know, right?

Parser-derived features

The parser is used in more and more of jOOQ’s many features. Including for example in jOOQ 3.13’s schema diff and management utility.

Run this with jOOQ:

System.out.println(
    ctx.meta("""
    create table t (
      i int
    )
    """).apply("""
    alter table t
      add j int;
    alter table t
      add constraint t_pk primary key (i)
    """)
);

I.e. just copy paste your plain SQL schema definitions and increments (as used in Flyway, for example, or hand written while developing them against your database), and start using the entire tool chain on it. For example, the above prints:

create table T(
  I int null,
  J int null,
  constraint T_PK
    primary key (I)
);

Instead of applying increments to a schema, you can also calculate the diff between two schema versions:

System.out.println(
    ctx.meta("""
    create table t (
      i int
    )
    """).migrateTo(ctx.meta("""
    create table t (
      i int,
      j int,
      constraint t_pk primary key (i)
    )
    """))
);

And now, the output is the increment we’ve applied earlier

alter table T
  add J int null;
alter table T
  add constraint T_PK
    primary key (I);

Conclusion

jOOQ’s DSL is very powerful because:

  • It offers type safety
  • It is compile time checked
  • It helps with auto completion

But sometimes, it gets in the way and plain SQL strings work better, especially when using text blocks. In that case, jOOQ is still also very powerful. The entirety of the jOOQ tool chain is still available to you behind the scenes. You’re just not using the DSL as an API anymore.