Top 5 Hidden jOOQ Features

jOOQ’s main value proposition is obvious: Type safe embedded SQL in Java.

People who actively look for such a SQL builder will inevitably stumble upon jOOQ and love it, of course. But a lot of people don’t really need a SQL builder – yet, jOOQ can still be immensely helpful in other situations, through its lesser known features.

Here’s a list of top 5 “hidden” jOOQ features.

1. Working with JDBC ResultSet

Even if you’re otherwise not using jOOQ but JDBC (or Spring JdbcTemplate, etc.) directly, one of the things that’s most annoying is working with ResultSet. A JDBC ResultSet models a database cursor, which is essentially a pointer to a collection on the server, which can be positioned anywhere, e.g. to the 50th record via ResultSet.absolute(50) (remember to start counting at 1).

The JDBC ResultSet is optimised for lazy data processing. This means that we don’t have to materialise the entire data set produced by the server in the client. This is a great feature for large (and even large-ish) data sets, but in many cases, it’s a pain. When we know we’re fetching only 10 rows and we know that we’re going to need them in memory anyway, a List<Record> type would be much more convenient.

jOOQ’s org.jooq.Result is such a List, and fortunately, you can easily import any JDBC ResultSet easily as follows by using DSLContext.fetch(ResultSet):

try (ResultSet rs = stmt.executeQuery()) {
    Result<Record> result = DSL.using(connection).fetch(rs);
    System.out.println(result);
}

With that in mind, you can now access all the nice jOOQ utilities, such as formatting a result, e.g. as TEXT (see The second feature for more details):

+---+---------+-----------+
| ID|AUTHOR_ID|TITLE      |
+---+---------+-----------+
|  1|        1|1984       |
|  2|        1|Animal Farm|
+---+---------+-----------+

Of course, the inverse is always possible as well. Need a JDBC ResultSet from a jOOQ Result? Call Result.intoResultSet() and you can inject dummy results to any application that operates on JDBC ResultSet:

DSLContext ctx = DSL.using(connection);

// Get ready for Java 10 with var!
var result = ctx.newResult(FIRST_NAME, LAST_NAME);
result.add(ctx.newRecord(FIRST_NAME, LAST_NAME)
              .values("John", "Doe"));

// Pretend this is a real ResultSet
try (ResultSet rs = result.intoResultSet()) {
  while (rs.next())
    System.out.println(rs.getString(1) + " " + rs.getString(2));
}

2. Exporting a Result as XML, CSV, JSON, HTML, TEXT, ASCII Chart

As we’ve seen in the previous section, jOOQ Result types have nice formatting features. Instead of just text, you can also format as XML, CSV, JSON, HTML, and again TEXT

The format can usually be adapted to your needs.

For instance, this text format is possible as well:

ID AUTHOR_ID TITLE      
------------------------
 1         1 1984       
 2         1 Animal Farm

When formatting as CSV, you’ll get:

ID,AUTHOR_ID,TITLE
1,1,1984
2,1,Animal Farm

When formatting as JSON, you might get:

[{"ID":1,"AUTHOR_ID":1,"TITLE":"1984"},
 {"ID":2,"AUTHOR_ID":1,"TITLE":"Animal Farm"}]

Or, depending on your specified formatting options, perhaps you’ll prefer the more compact array of array style?

[[1,1,"1984"],[2,1,"Animal Farm"]]

Or XML, again with various common formatting styles, among which:

<result>
  <record>
    <ID>1</ID>
    <AUTHOR_ID>1</AUTHOR_ID>
    <TITLE>1984</TITLE>
  </record>
  <record>
    <ID>2</ID>
    <AUTHOR_ID>1</AUTHOR_ID>
    <TITLE>Animal Farm</TITLE>
  </record>
</result>

HTML seems kind of obvious. You’ll get:

ID AUTHOR_ID TITLE
1 1 1984
2 1 Animal Farm

Or, in code:

<table>
<tr><th>ID</th><th>AUTHOR_ID</th><th>TITLE</th></tr>
<tr><td>1</td><td>1</td><td>1984</td></tr>
<tr><td>2</td><td>1</td><td>Animal Farm</td></tr>
</table>

As a bonus, you could even export the Result as an ASCII chart:

These features are obvious additions to ordinary jOOQ queries, but as I’ve shown in Section 1, you can get free exports from JDBC results as well!

3. Importing these text formats again

After the previous section’s export capabilities, it’s natural to think about how to import such data again back into a more usable format. For instance, when you write integration tests, you might expect a database query to return a result like this:

ID AUTHOR_ID TITLE      
-- --------- -----------
 1         1 1984       
 2         1 Animal Farm

Simply import the above textual representation of your result set into an actual jOOQ Result using Result.fetchFromTXT(String) and you can continue operating on a jOOQ Result (or as illustrated in Section 1, with a JDBC ResultSet!).

Most of the other export formats (except charts, of course) can be imported as well.

Now, don’t you wish for a second that Java has multi-line strings (in case of which this would be very nice looking):

Result<?> result = ctx.fetchFromTXT(
    "ID AUTHOR_ID TITLE      \n" +
    "-- --------- -----------\n" +
    " 1         1 1984       \n" +
    " 2         1 Animal Farm\n"
);
ResultSet rs = result.intoResultSet();

These types can now be injected anywhere where a service or DAO produces a jOOQ Result or a JDBC ResultSet. The most obvious application for this is mocking. The second most obvious application is testing. You can easily test that a service produces an expected result of the above form.

Let’s talk about mocking:

4. Mocking JDBC

Sometimes, mocking is cool. With the above tools, it’s only natural for jOOQ to provide a full-fledged, JDBC-based mocking SPI. I’ve written about this feature before and again here.

Essentially, you can implement a single FunctionalInterface called MockDataProvider. The simplest way to create one is by using the Mock.of() methods, e.g.:

MockDataProvider provider = Mock.of(ctx.fetchFromTXT(
    "ID AUTHOR_ID TITLE      \n" +
    "-- --------- -----------\n" +
    " 1         1 1984       \n" +
    " 2         1 Animal Farm\n"
));

What this provider does is it simply ignores all the input (queries, bind variables, etc.) and always returns the same simple result set. You can now plug this provider into a MockConnection and use it like any ordinary JDBC connection:

try (Connection c = new MockConnection(provider);
     PreparedStatement s = c.prepareStatement("SELECT foo");
     ResultSet rs = s.executeQuery()) {

    while (rs.next()) {
        System.out.println("ID        : " + rs.getInt(1));
        System.out.println("First name: " + rs.getString(2));
        System.out.println("Last name : " + rs.getString(3));
    }
}

The output being (completely ignoring the SELECT foo statement):

ID        : 1
First name: 1
Last name : 1984
ID        : 2
First name: 1
Last name : Animal Farm

This client code doesn’t even use jOOQ (although it could)! Meaning, you can use jOOQ as a JDBC mocking framework on any JDBC-based application, including a Hibernate based one.

Of course, you don’t always want to return the exact same result. This is why a MockDataProvider offers you an argument with all the query information in it:

try (Connection c = new MockConnection(ctx -> {
    if (ctx.sql().toLowerCase().startsWith("select")) {
        // ...
    }
})) {
    // Do stuff with this connection
}

You can almost implement an entire JDBC driver with a single lambda expression. Read more here. Cool, eh?

Side note: Don’t get me wrong: I don’t think you should mock your entire database layer just because you can. My thoughts are available in this tweet storm:

Speaking of “synthetic JDBC connections”

5. Parsing Connections

jOOQ 3.9 introduced a SQL parser, whose main use case so far is to parse and reverse engineer DDL scripts for the code generator.

Another feature that has not been talked about often yet (because still a bit experimental) is the parsing connection, available through DSLContext.parsingConnection(). Again, this is a JDBC Connection implementation that wraps a physical JDBC connection but runs all SQL queries through the jOOQ parser before generating them again.

What’s the point?

Let’s assume for a moment that we’re using SQL Server, which supports the following SQL standard syntax:

SELECT * FROM (VALUES (1), (2), (3)) t(a)

The result is:

 a
---
 1
 2
 3

Now, let’s assume we are planning to migrate our application to Oracle and we have the following JDBC code that doesn’t work on Oracle, because Oracle doesn’t support the above syntax:

try (Connection c = DriverManager.getConnection("...");
     Statement s = c.createStatement();
     ResultSet rs = s.executeQuery(
         "SELECT * FROM (VALUES (1), (2), (3)) t(a)")) {

    while (rs.next())
        System.out.println(rs.getInt(1));
}

Now, we have three options (hint #1 sucks, #2 and #3 are cool):

  1. Tediously migrate all such manually written JDBC based SQL to Oracle syntax and hope we don’t have to migrate back again
  2. Upgrade our JDBC based application to use jOOQ instead (that’s the best option, of course, but it also takes some time)
  3. Simply use the jOOQ parsing connection as shown below, and a lot of code will work right out of the box! (and then, of course, gradually migrate to jOOQ, see option #2)
try (DSLContext ctx = DSL.using("...");
     Connection c = ctx.parsingConnection(); // Magic here
     Statement s = c.createStatement();
     ResultSet rs = s.executeQuery(
         "SELECT * FROM (VALUES (1), (2), (3)) t(a)")) {

    while (rs.next())
        System.out.println(rs.getInt(1));
}

We haven’t touched any of our JDBC based client logic. We’ve only introduced a proxy JDBC connection that runs every statement through the jOOQ parser prior to re-generating the statement on the wrapped, physical JDBC connection.

What’s really executed on Oracle is this emulation here:

select t.a from (
  (select null a from dual where 1 = 0) union all 
  (select * from (
    (select 1 from dual) union all 
    (select 2 from dual) union all 
    (select 3 from dual)
  ) t)
) t

Looks funky, eh? The rationale for this emulation is described here.

Every SQL feature that jOOQ can represent with its API and that it can emulate between databases will be supported! This includes far more trivial things, like parsing this query:

SELECT substring('abcdefg', 2, 4)

… and running this one on Oracle instead:

select substr('abcdefg', 2, 4) from dual

You’re all thinking

Want to learn more about jOOQ?

There are many more such nice little things in the jOOQ API, which help make you super productive. Some examples include:

jOOQ 3.10 Supports JPA AttributeConverter

One of the cooler hidden features in jOOQ is the JPADatabase, which allows for reverse engineering a pre-existing set of JPA-annotated entities to generate jOOQ code.

For instance, you could write these entities here:

@Entity
public class Actor {

    @Id
    @GeneratedValue(strategy = IDENTITY)
    public Integer actorId;

    @Column
    public String firstName;

    @Column
    public String lastName;

    @ManyToMany(fetch = LAZY, mappedBy = "actors", 
        cascade = CascadeType.ALL)
    public Set<Film> films = new HashSet<>();

    public Actor(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

@Entity
public class Film {

    @Id
    @GeneratedValue(strategy = IDENTITY)
    public Integer filmId;

    @Column
    public String title;

    @Column(name = "RELEASE_YEAR")
    @Convert(converter = YearConverter.class)
    public Year releaseYear;

    @ManyToMany(fetch = LAZY, cascade = CascadeType.ALL)
    public Set<Actor> actors = new HashSet<>();

    public Film(String title, Year releaseYear) {
        this.title = title;
        this.releaseYear = releaseYear;
    }
}

// Imagine also a Language entity here...

(Just a simple example. Let’s not discuss the caveats of @ManyToMany mapping).

For more info, the full example can be found on Github:

Now observe the fact that we’ve gone through all the trouble of mapping the database type INT for the RELEASE_YEAR column to the cool JSR-310 java.time.Year type for convenience. This has been done using a JPA 2.1 AttributeConverter, which simply looks like this:

public class YearConverter 
implements AttributeConverter<Year, Integer> {

    @Override
    public Integer convertToDatabaseColumn(Year attribute) {
        return attribute == null ? null : attribute.getValue();
    }

    @Override
    public Year convertToEntityAttribute(Integer dbData) {
        return dbData == null ? null : Year.of(dbData);
    }
}

Using jOOQ’s JPADatabase

Now, the JPADatabase in jOOQ allows you to simply configure the input entities (e.g. their package names) and generate jOOQ code from it. This works behind the scenes with this algorithm:

  • Spring is used to discover all the annotated entities on the classpath
  • Hibernate is used to generate an in-memory H2 database from those entities
  • jOOQ is used to reverse-engineer this H2 database again to generate jOOQ code

This works pretty well for most use-cases as the JPA annotated entities are already very vendor-agnostic and do not provide access to many vendor-specific features. We can thus perfectly easily write the following kind of query with jOOQ:

ctx.select(
        ACTOR.FIRSTNAME,
        ACTOR.LASTNAME,
        count().as("Total"),
        count().filterWhere(LANGUAGE.NAME.eq("English"))
          .as("English"),
        count().filterWhere(LANGUAGE.NAME.eq("German"))
          .as("German"),
        min(FILM.RELEASE_YEAR),
        max(FILM.RELEASE_YEAR))
   .from(ACTOR)
   .join(FILM_ACTOR)
     .on(ACTOR.ACTORID.eq(FILM_ACTOR.ACTORS_ACTORID))
   .join(FILM)
     .on(FILM.FILMID.eq(FILM_ACTOR.FILMS_FILMID))
   .join(LANGUAGE)
     .on(FILM.LANGUAGE_LANGUAGEID.eq(LANGUAGE.LANGUAGEID))
   .groupBy(
        ACTOR.ACTORID,
        ACTOR.FIRSTNAME,
        ACTOR.LASTNAME)
   .orderBy(ACTOR.FIRSTNAME, ACTOR.LASTNAME, ACTOR.ACTORID)
   .fetch()

(more info about the awesome FILTER clause here)

In this example, we’re also using the LANGUAGE table, which we omitted in the article. The output of the above query is something along the lines of:

+---------+---------+-----+-------+------+----+----+
|FIRSTNAME|LASTNAME |Total|English|German|min |max |
+---------+---------+-----+-------+------+----+----+
|Daryl    |Hannah   |    1|      1|     0|2015|2015|
|David    |Carradine|    1|      1|     0|2015|2015|
|Michael  |Angarano |    1|      0|     1|2017|2017|
|Reece    |Thompson |    1|      0|     1|2017|2017|
|Uma      |Thurman  |    2|      1|     1|2015|2017|
+---------+---------+-----+-------+------+----+----+

As we can see, this is a very suitable combination of jOOQ and JPA. JPA was used to insert the data through JPA’s useful object graph persistence capabilities, whereas jOOQ is used for reporting on the same tables.

Now, since we already wrote this nice AttributeConverter, we certainly want to apply it also to the jOOQ query and get the java.time.Year data type also in jOOQ, without any additional effort.

jOOQ 3.10 auto conversion

In jOOQ 3.10, we don’t have to do anything anymore. The existing JPA converter will automatically mapped to a jOOQ converter as the generated jOOQ code reads:

// Don't worry about this generated code
public final TableField<FilmRecord, Year> RELEASE_YEAR = 
    createField("RELEASE_YEAR", org.jooq.impl.SQLDataType.INTEGER, 
        this, "", new JPAConverter(YearConverter.class));

… which leads to the previous jOOQ query now returning a type:

Record7<String, String, Integer, Integer, Integer, Year, Year>

Luckily, this was rather easy to implement as the Hibernate meta model allows for navigating the binding between entities and tables very conveniently as described in this article here:

How to get the entity mapping to database table binding metadata from Hibernate

More similar features are coming up in jOOQ 3.11, e.g. when we look into reverse engineering JPA @Embedded types as well. See https://github.com/jOOQ/jOOQ/issues/6518

If you want to run this example, do check out our jOOQ/JPA example on GitHub:

How to Execute SQL Batches With JDBC and jOOQ

Some databases (in particular MySQL and T-SQL databases like SQL Server and Sybase) support a very nice feature: They allow for running a “batch” of statements in a single statement. For instance, in SQL Server, you can do something like this:

-- Statement #1
DECLARE @table AS TABLE (id INT);

-- Statement #2
SELECT * FROM @table;

-- Statement #3
INSERT INTO @table VALUES (1),(2),(3);

-- Statement #4
SELECT * FROM @table;

This is a batch of 4 statements, and it can be executed as a single statement both with JDBC and with jOOQ. Let’s see how:

Executing a batch with JDBC

Unfortunately, the term “batch” has several meanings, and in this case, I don’t mean the JDBC Statement.addBatch() method, which is actually a bit clumsy as it doesn’t allow for fetching mixed update counts and result sets.

Instead, what I’ll be doing is this:

String sql =
    "\n  -- Statement #1                              "
  + "\n  DECLARE @table AS TABLE (id INT);            "
  + "\n                                               "
  + "\n  -- Statement #2                              "
  + "\n  SELECT * FROM @table;                        "
  + "\n                                               "
  + "\n  -- Statement #3                              "
  + "\n  INSERT INTO @table VALUES (1),(2),(3);       "
  + "\n                                               "
  + "\n  -- Statement #4                              "
  + "\n  SELECT * FROM @table;                        ";

try (PreparedStatement s = c.prepareStatement(sql)) {
    fetchLoop:
    for (int i = 0, updateCount = 0;; i++) {
        boolean result = (i == 0)
            ? s.execute()
            : s.getMoreResults();

        if (result)
            try (ResultSet rs = s.getResultSet()) {
                System.out.println("\nResult:");

                while (rs.next())
                    System.out.println("  " + rs.getInt(1));
            }
        else if ((updateCount = s.getUpdateCount()) != -1)
            System.out.println("\nUpdate Count: " + updateCount);
        else
            break fetchLoop;
    }
}

The output of the above program being:

Result:

Update Count: 3

Result:
  1
  2
  3

The above API usage is a somewhat “hidden” – or at least not every day usage of the JDBC API. Mostly, you’ll be using Statement.executeQuery() when you’re expecting a ResultSet, or Statement.executeUpdate() otherwise.

But in our case, we don’t really know what’s happening. We’re going to discover the result types on the fly, when executing the statement. Here are the main JDBC API features that we’re using, along with an explanation:

  • Statement.execute(): This method should be used if we don’t know the result type. The method returns a boolean, which is true when the first statement in the batch produced a ResultSet and false otherwise.
  • Statement.getMoreResults(): This method returns the same kind of boolean value as the previous Statement.execute() method, but it does so for the next statement in the batch (i.e. for every statement except the first).
  • If the current result is a ResultSet (the boolean was true), then we’ll obtain that ResultSet through Statement.getResultSet() (we can obviously no longer call the usual Statement.executeQuery() to obtain the ResultSet).
  • If the current result is not a ResultSet (the boolean was true), then we’ll check the update count value through Statement.getUpdateCount().
  • If the update count is -1, then we’ve reached the end of the batch.

What a nice state machine!

The nice thing about this is that a batch may be completely nondeterministic. E.g. there may be triggers, T-SQL blocks (e.g. an IF statement), stored procedures, and many other things that contribute result sets and/or update counts. In some cases, we simply don’t know what we’ll get.

Executing a batch with jOOQ

It’s great that the JDBC API designers have thought of this exotic API usage on a rather low level. This makes JDBC extremely powerful. But who remembers the exact algorithm all the time? After all, the above minimalistic version required around 20 lines of code for something as simple as that.

Compare this to the following jOOQ API usage:

System.out.println(
    DSL.using(c).fetchMany(sql)
);

The result being:

Result set:
+----+
|  id|
+----+
Update count: 3
Result set:
+----+
|  id|
+----+
|   1|
|   2|
|   3|
+----+

Huh! Couldn’t get much simpler than that! Let’s walk through what happens:

The DSLContext.fetchMany() method is intended for use when users know there will be many result sets and/or update counts. Unlike JDBC which reuses ordinary JDBC API, jOOQ has a different API here to clearly distinguish between behaviours. The method then eagerly fetches all the results and update counts in one go (lazy fetching is on the roadmap with issue #4503).

The resulting type is org.jooq.Results, a type that extends List<Result>, which allows for iterating over the results only, for convenience. If a mix of results or update counts need to be consumed, the Results.resultsOrRows() method can be used.

A note on warnings / errors

Note that if your batch raises errors, then the above JDBC algorithm is incomplete. Read more about this in this follow-up post.

A Functional Programming Approach to Dynamic SQL with jOOQ

Typesafe embedded DSLs like jOOQ are extremely powerful for dynamic SQL, because the query you’re constructing with the jOOQ DSL is a dynamic query by nature. You’re constructing a query expression tree using a convenient API (the “DSL”), even if you think your SQL statement is static. For instance:

for (Record rec : ctx.select(ACTOR.FIRST_NAME, ACTOR.LAST_NAME)
                     .from(ACTOR)
                     .where(ACTOR.FIRST_NAME.like("A%")))

    System.out.println(rec.get(ACTOR.FIRST_NAME) 
               + " " + rec.get(ACTOR.LAST_NAME));

The above query looks like a static SQL statement, the way you would write it in PL/SQL, for instance:

FOR rec IN (
  SELECT first_name, last_name
  FROM actor
  WHERE first_name LIKE 'A%'
) LOOP
  dbms_output.put_line(rec.first_name
             || ' ' || rec.last_name);
END LOOP;

The PL/SQL implicit cursor loop runs over the records produced by a pre-compiled SQL statement. That’s not the case with the jOOQ statement, in case of which the Java runtime re-creates the jOOQ statement expression tree every time afresh by dynamically creating an org.jooq.Select object, step by step (more about how the DSL works here).

Using jOOQ for actual dynamic SQL

As we’ve seen before, all jOOQ statements are dynamic statements, even if they “feel” static. Sometimes, you actually want a dynamic SQL query, e.g. when the user is allowed to specify custom predicates. In this case, you could do something like this:

// By default, make the dynamic predicate "TRUE"
Condition condition = DSL.trueCondition();

// If the user entered something in the text search field...
if (hasFirstNameSearch())
    condition = condition.and(FIRST_NAME.like(firstNameSearch()));

// If the user entered something in another text search field...
if (hasLastNameSearch())
    condition = condition.and(LAST_NAME.like(lastNameSearch()));

// The query now uses a dynamically created predicate
for (Record rec : ctx.select(ACTOR.FIRST_NAME, ACTOR.LAST_NAME)
                     .from(ACTOR)
                     .where(condition))

    System.out.println(rec.get(ACTOR.FIRST_NAME) 
               + " " + rec.get(ACTOR.LAST_NAME));

The above is not possible with PL/SQL easily, you’d have to resort to the dynamic SQL API called DBMS_SQL, which is about as verbose (and error-prone) as JDBC, as you’re concatenating SQL strings.

Adding functional programming to the mix

If you’re able to construct the entire query in a local scope, e.g. inside of a method, the above imperative style is quite sufficient. But sometimes, you may have something like a “base” query that you want to re-use all the time, and only sometimes, you want to add a custom predicate, or JOIN operation, etc.

In this case, using a more functional approach is optimal. For instance, you could offer a convenience API that produces a query fetching actor first and last names, with custom predicates:

// Higher order, SQL query producing function:
public static ResultQuery<Record2<String, String>> actors(
    Function<Actor, Condition> where
) {
    return ctx.select(ACTOR.FIRST_NAME, ACTOR.LAST_NAME)
              .from(ACTOR)
              .where(where.apply(ACTOR)));
}

The above utility method doesn’t actually execute the query, it just constructs it and takes a function as an argument. In the old days, this used to be called the “strategy pattern”, which is implemented much more easily with a function, than with an object oriented approach (see also Mario Fusco’s interesting blog series about the Gang of Four design patterns).

How to call the above utility? Easy!

// Get only actors whose first name starts with "A"
for (Record rec : actors(a -> a.FIRST_NAME.like("A%")))
    System.out.println(rec);

Now, this is not versatile enough yet, as we can pass only one function. How about this, instead:

@SafeVarargs
public static ResultQuery<Record2<String, String>> actors(
    Function<Actor, Condition>... where
) {
    return dsl().select(ACTOR.FIRST_NAME, ACTOR.LAST_NAME)
                .from(ACTOR)
                .where(Arrays.stream(where)
                             .map(f -> f.apply(ACTOR))
                             .collect(Collectors.toList()));
}

(notice how we can immediately execute and iterate over the ResultQuery, as it implements Iterable)

We can now call this with any number of input functions to form dynamic predicates. E.g.:

// Get all actors
for (Record rec : actors())
    System.out.println(rec);

// Get only actors whose first name starts with "A"
for (Record rec : actors(a -> a.FIRST_NAME.like("A%"))) {
    System.out.println(rec);

// Get actors whose first/last name matches "A% B%"
for (Record rec : actors(
        a -> a.FIRST_NAME.like("A%"),
        a -> a.LAST_NAME.like("B%"))) {
    System.out.println(rec);

You get the idea.

Conclusion

… the idea is that jOOQ is an extremely powerful SQL expression tree API, which allows you to dynamically construct SQL queries of arbitrary complexity. If you’re running a static query, this just means that all of your SQL expressions are constant every time you execute the query.

There are no limits to how far you can push this. We’ve seen jOOQ users write queries that dynamically assemble dozens of common table expressions with several levels of dynamically nested derived tables, too. If you have a crazy example to share, we’re looking forward to it!

Why Most Programmers Get Pagination Wrong

Pagination is one of those things that almost everyone gets wrong for two reasons:

  • User experience
  • Database performance

Here’s why.

What’s wrong with pagination?

Most applications blindly produce pagination like this:

pagination

This is how GMail implements pagination. With my current settings, it displays 100 E-Mails at a time and also shows how many E-Mails there are in total, namely 1094. Those aren’t the total number of E-Mails I’ve ever had, they’re the total number of E-Mails in my “blog-idea” label (I’m using GMail as a TODO list, and yes, this blog won’t run out of articles any time soon).

What’s wrong with this practice?

Bad user experience

As a user, in most cases, I don’t really care about the total number of objects that are in my result set. At least, I don’t care about the exact number. Does it make any difference if I have 1094 or 1093 E-Mails? What about if I had 1067? Or 1000? 1000 would be precise enough for what I’m concerned.

Also, as a user, in most cases, I don’t care that I’m on page 317 of my paginated screen that displays me rows 3170-3179 (assuming 10 rows per page). I really don’t. The page number is absolutely useless in terms of user experience.

Who got it right?

  • Facebook
  • Twitter
  • Reddit

And all the other websites that do timelines. Yes, I want to display only 10 rows at a time (or perhaps 100, but almost never all). So, pagination is important. But I don’t care about the fact that I’ve clicked 317 times on that “next page” button. If I ever browse that many pages (and I hardly ever do), then the only thing that matters is the next 10 rows. Just like when you play Civilization. You don’t care that you’re in turn 317. You just want to play one more turn:

4b665f86cbb10d44e2db6ae4c96fef4050f0ce42878015ab30cf681b84537a30[1]

Moreover, I never ever ever want to jump to page 317 right from the beginning. There’s absolutely no use case out there, where I search for something, and then I say, hey, I believe my search result will be item #3175 in the current sort order. Exactly. Instead, I will do any of these:

  • Refine the search
  • Sort the result

In both cases, I will get a result where the record that I’m looking for is much more likely to appear on page #1 or perhaps #2. Again, when was the last time you googled for SQL and then went to page #18375 to find that particular blog post that you were looking for? No. You searched for “Java 8 SQL” to find jOOQ, the best way to write SQL in Java 8. For instance.

How to implement a timeline with SQL

If your data source is a SQL database, you might have implemented pagination by using LIMIT .. OFFSET, or OFFSET .. FETCH or some ROWNUM / ROW_NUMBER() filtering (see the jOOQ manual for some syntax comparisons across RDBMS). OFFSET is the right tool to jump to page 317, but remember, no one really wants to jump to that page, and besides, OFFSET just skips a fixed number of rows. If there are new rows in the system between the time page number 316 is displayed to a user and when the user skips to page number 317, the rows will shift, because the offsets will shift. No one wants that either, when they click on “next”.

Instead, you should be using what we refer to as “keyset pagination” (as opposed to “offset pagination”). We’ve described this in past articles here:

The SQL syntax is a bit cumbersome as the pagination criteria becomes an ordinary predicate, but if you’re using jOOQ, you can use the simple synthetic SQL SEEK clause as such:

DSL.using(configuration)
   .select(PLAYERS.PLAYER_ID,
           PLAYERS.FIRST_NAME,
           PLAYERS.LAST_NAME,
           PLAYERS.SCORE)
   .from(PLAYERS)
   .where(PLAYERS.GAME_ID.eq(42))
   .orderBy(PLAYERS.SCORE.desc(),
            PLAYERS.PLAYER_ID.asc())
   .seek(949, 15)
   .limit(10)
   .fetch();

The above will fetch the next 10 players after the player with SCORE 949 and ID 15. The pagination really depends on the ORDER BY clause, which is why you have to provide as many values in the pagination as you provided columns in the ORDER BY clause.

Now, that we’ve fixed the user experience let’s also look at …

How OFFSET pagination is bad for performance

The previously linked articles about keyset pagination also mention the poor performance of OFFSET pagination. Which is kind of obvious as OFFSET has to skip a given number of rows after applying all predicates and sorting, etc. So the database has to do all the work and then throw away 3170 records (if you’re jumping to page 317 with 10 rows per page). What a waste.

The following diagram shows very nicely how OFFSET gets slower and slower for large offsets:

Reproduced from use-the-index-luke.com with permission by Markus Winand

That’s the obvious problem, but there’s another one. People always count the total number of rows to calculate the total number of possible pages. Why? To display nonsense like the following:

Page number:
1 2 3 ... 315 316 317 318 319 ... 50193 50194

Wow. OK so we’re on page number 317, which we don’t really care about in the first place, but we could just as well jump to page number 50194. This means that the database needed to run the query across all the rows just to be sure we get exactly 50194 pages in total.

Google something like page number pagination and observe the number of tutorials that show how you can implement the above nonsense. On Google Image search, you’ll find:

pagination-google

At the same time, the Google search itself reveals:

pagination-google-search

As you can see, Google estimates that there are probably at least 10 pages for your search and you can go “next”. Yes, you can skip some pages, but you cannot skip to a page number 50194, because, again:

  • No one wants that
  • It’s costly to predict, even for Google

In fact, Google search implements keyset pagination as well, just like Twitter, Facebook, Reddit. And they don’t display the total number of pages because counting that total can be very costly, depending on your database.

In particular, databases that do not support window functions will require you to run two separate queries:

  1. The actual query with a LIMIT clause
  2. An additional query replacing the SELECT column list with a simple COUNT(*)

Needless to say that this is not the best approach. If your database supports window functions (read about that miraculous SQL feature here on the jOOQ blog), you could produce the total row count in one go as such:

SELECT 
  rental_date, 
  inventory_id,
  COUNT(*) OVER()
FROM rental
WHERE customer_id = 1
ORDER BY rental_date
LIMIT 10

That COUNT(*) OVER() window function is like an ordinary aggregate function, except that it doesn’t group your results. It just counts all the rows of your result and produces that count in each row, prior to limiting the result to 10.

When run against the Sakila database, the above produces:

rental_date          inventory_id  count
2005-05-25 11:30:37          3021     32
2005-05-28 10:35:23          4020     32
2005-06-15 00:54:12          2785     32
2005-06-15 18:02:53          1021     32
2005-06-15 21:08:46          1407     32
2005-06-16 15:18:57           726     32
2005-06-18 08:41:48           197     32
2005-06-18 13:33:59          3497     32
2005-06-21 06:24:45          4566     32
2005-07-08 03:17:05          1443     32

So, we’re displaying the first page with 10 rows and we need to provide navigational links for a total of 4 pages because we have a total of 32 rows.

What happens when we benchmark this query on PostgreSQL? The first run doesn’t calculate this COUNT(*) OVER() value, whereas the second one does:

DO $$
DECLARE
  v_ts TIMESTAMP;
  v_repeat CONSTANT INT := 10000;
  rec RECORD;
BEGIN
  v_ts := clock_timestamp();

  FOR i IN 1..v_repeat LOOP
    FOR rec IN (
      SELECT 
        rental_date, 
        inventory_id
      FROM rental
      WHERE customer_id = 1
      ORDER BY rental_date
      LIMIT 10
    ) LOOP
      NULL;
    END LOOP;
  END LOOP;

  RAISE INFO 'Statement 1: %', (clock_timestamp() - v_ts); 
  v_ts := clock_timestamp();

  FOR i IN 1..v_repeat LOOP
    FOR rec IN (
      SELECT 
        rental_date, 
        inventory_id,
        COUNT(*) OVER()
      FROM rental
      WHERE customer_id = 1
      ORDER BY rental_date
      LIMIT 10
    ) LOOP
      NULL;
    END LOOP;
  END LOOP;

  RAISE INFO 'Statement 2: %', (clock_timestamp() - v_ts); 
END$$;

The result clearly indicates that in PostgreSQL, there’s a significant overhead in calculating this value:

INFO:  Statement 1: 00:00:01.041823
INFO:  Statement 2: 00:00:03.57145

Oracle optimises things a bit better when you’re using ROWNUM to paginate:

SET SERVEROUTPUT ON
DECLARE
  v_ts TIMESTAMP;
  v_repeat CONSTANT NUMBER := 5000;
BEGIN
  v_ts := SYSTIMESTAMP;
     
  FOR i IN 1..v_repeat LOOP
    FOR rec IN (
      SELECT 
        rental_date, 
        inventory_id
      FROM (
        SELECT 
          rental.*, 
          ROWNUM rn
        FROM rental
        WHERE customer_id = 1
        ORDER BY rental_date
      ) rental
      WHERE rn < 5
      ORDER BY rn
    ) LOOP
      NULL;
    END LOOP;
  END LOOP;
     
  dbms_output.put_line('Statement 1: ' || (SYSTIMESTAMP - v_ts));
  v_ts := SYSTIMESTAMP;
     
  FOR i IN 1..v_repeat LOOP
    FOR rec IN (
      SELECT 
        rental_date, 
        inventory_id,
        COUNT(*) OVER()
      FROM (
        SELECT 
          rental.*,  
          ROWNUM rn
        FROM rental
        WHERE customer_id = 1
        ORDER BY rental_date
      ) rental
      WHERE rn < 5
      ORDER BY rn
    ) LOOP
      NULL;
    END LOOP;
  END LOOP;
     
  dbms_output.put_line('Statement 2: ' || (SYSTIMESTAMP - v_ts));
END;
/

Result:

Statement 1: X
Statement 2: X +/- 1%

So, the COUNT(*) seems to be calculated “for free”. Bonus question: Why is that?

Due to Oracle license restrictions, we cannot publish benchmark results here, comparing Oracle with PostgreSQL, sorry, but you can run the above code yourself against the Sakila database:
https://www.jooq.org/sakila

Conclusion

TL;DR: OFFSET pagination bad. Keyset pagination good.

no-offset-banner-468x60.white

If you want to paginate in your application, please make sure whether you really, really, really need:

  • Exact page number
  • High page numbers
  • The last page number
  • The total number of rows

Because if you don’t (and in 98% of all UIs, you really don’t), then you can drastically speed up your queries while providing your users a much better experience. If that’s not a win-win situation worth thinking about…?

And don’t forget, jOOQ ships with native keyset pagination support!

Using Stored Procedures With JPA, JDBC… Meh, Just Use jOOQ

The current edition of the Java magazine has an article about Big Data Best Practices for JDBC and JPA by Josh Juneau:
http://www.javamagazine.mozaicreader.com/MayJune2016

The article shows how to use a stored procedure with JDBC (notice how resources aren’t closed, unfortunately. This is commonly forgotten, even in Java Magazine articles)

// Using JDBC to call upon a database stored
// procedure
CallableStatement cs = null;
try {
    cs = conn.prepareCall("{call DUMMY_PROC(?,?)}");
    cs.setString(1, "This is a test");
    cs.registerOutParameter(2, Types.VARCHAR);
    cs.executeQuery();

    // Do something with result
    String returnStr = cs.getString(2);
} catch (SQLException ex){
    ex.printStackTrace();
}

And with JPA:

// Utilize JPA to call a database stored procedure
// Add @NamedStoredProcedureQuery to entity class
@NamedStoredProcedureQuery(
    name="createEmp", procedureName="CREATE_EMP",
    parameters = {
        @StoredProcedureParameter(
            mode= ParameterMode.IN,
            type=String.class,
            name="first"),
        @StoredProcedureParamter(
            mode = ParameterMode.IN,
            type=String.class,
            name="last")
    })

// Calling upon stored procedure
StoredProcedureQuery qry =
    em.createStoredProcedureQuery("createEmp");
qry.setParameter("first", "JOSH");
qry.setParameter("last","JUNEAU");
qry.execute();

Specifically the latter was also recently discussed in blog posts by Vlad Mihalcea and Thorben Janssen.

Do you like verbosity and complexity?

No? We neither. This is why we give you a third option instead: Just use jOOQ. Here’s the equivalent jOOQ code:

// JDBC example:
String returnStr = Routines.dummyProc(
    config, "This is a test");

// JPA example
Routines.createEmp(config, "JOSH", "JUNEAU");

Yes! That’s it. Don’t waste time manually configuring your bind variables with JDBC API calls, or JPA annotations. No one likes writing annotations for stored procedures. With jOOQ and jOOQ’s code generator, procedure calls are:

  • A one-liner
  • A no-brainer
  • A way to bring back the fun to stored procedures

Learn more about using Oracle stored procedures with nested collections and object types here:
https://blog.jooq.org/2014/11/04/painless-access-from-java-to-plsql-procedures-with-jooq

Using jOOQ’s ExecuteListener to Prevent Write Operations on a Connection

Security is important, especially on the data access layer. Most commercial databasese allow for fine-grained privilege control using database access grants. For instance, you would be restricting access from a user to a certain set of tables (or even better: views), via GRANT statements:

GRANT SELECT ON table TO user;

With this fine-grained access control, write operations on certain database objects can be prevented directly in the database.

What if that’s not possible?

Not all databases ship with sophisticated access privilege implementations, or perhaps, your application cannot profit from those features for operational reasons. In that case, you should at least be able to implement security on the client, e.g. by using jOOQ’s ExecuteListener (for coarse grained access control), or by using jOOQ’s VisitListener (for fine grained access control).

An example using an ExecuteListener might look like this:

class ReadOnlyListener extends DefaultExecuteListener {
    @Override
    public void executeStart(ExecuteContext ctx) {
        if (ctx.type() != READ)
            throw new DataAccessException("No privilege to execute " + ctx.sql());
    }
}

If you hook this listener into your jOOQ Configuration, you will no longer be able to execute any write operations on that Configuration. It’s that easy!

For more fine-grained control (e.g. a per-table ACL), a VisitListener will do the trick. An (very much simplified) example implementation that shows what can be done can be seen here:

static class ACLListener extends DefaultVisitListener {

    @Override
    public void visitStart(VisitContext context) {
        if (context.queryPart() instanceof Table
                && Arrays.asList(context.clauses()).contains(INSERT_INSERT_INTO)
                && ((Table<?>) context.queryPart()).getName().equals("AUTHOR"))
            throw new DataAccessException("No privilege to insert into AUTHOR");
    }
}

Essentially, this check prevents a client session from running insert statements into the AUTHOR table. A future version of jOOQ will ship with this kind of ACL VisitListener out of the box, when https://github.com/jOOQ/jOOQ/issues/5197 is implemented.