jOOQ’s DSL, like any fluent API, has one big caveat. It’s very easy to forget to call .execute(). And when you do, chances are, you’re going to be staring at your code for minutes, because everything looks perfect:
In principle, this kind of mistake can happen with any fluent API. E.g. StringBuilder
sb.append("a").append("b"); // Not consuming the result
Or streams:
Stream.of(1, 2).peek(System.out::println); // Not so much peeking
But it usually doesn’t happen as much, because the difference to jOOQ is that
jOOQ’s DML statements (INSERT, UPDATE, DELETE, MERGE) and DDL statements (CREATE, ALTER, DROP, TRUNCATE), and a few other produce a side effect
That side effect is the only thing we care about. The result (the update count) is mostly irrelevant
And as such, we don’t care about the result of execute(), which is an int. No one forgets to call fetch() on a jOOQ ResultQuery:
ctx.select(T.A, T.B)
.from(T); // Well duh
Because without calling fetch() or something similar, we’re not getting any results, and we want the results, just like with the StringBuilder or the Stream. But we don’t want the execute() result.
As such, even we, when writing jOOQ’s integration tests, occasionally forget to call this silly little method.
No more!
When it happened again this week…
10 minutes trying to figure out why my JOOQ insert didn't seem to be working…
… had forgotten `.execute()`
… fuckin' stunning colleague over here
— 💀 r͍̣̼ͯ̑ͪò̜b̹ ̽ͦfͤ͐l̫̺̙et̘̯̫́̃͐c͙̪̤ͧ̏ͩh̙͗er͒̔ 💀 (@_fletchr) March 25, 2021
And they already can! Apart from the org.jetbrains.annotations.Contract annotation, which is there for precisely this reason, apparently, it’s also possible to mimick the JSR-305 @CheckReturnValue annotation on every method “whose return value needs to be checked” (i.e. a method that has no side effect, or whose side effect is to mutate only “this“).
As you can see, IntelliJ now warns the user whenever they forget to consume the result of any of jOOQ’s DSL methods (by calling execute(), passing it to some method that consumes it, etc.)
Thanks again to Tagir Valeev from JetBrains for walking me through this and even improving the @Contract annotation, which jOOQ might use in a future version.
When paginating results in SQL, we use standard SQL OFFSET .. FETCH or a vendor specific version of it, such as LIMIT .. OFFSET. For example:
SELECT first_name, last_name
FROM actor
ORDER BY actor_id
OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY
As always, we’re using the Sakila database for this example.
This is rather straightforward. It will give us page 2 out of N pages, with a page size of 10. But how do we calculate these values? How do we know we’re on page 2? How do we know the number of pages N? Can we calculate this without an extra round-trip e.g. to calculate the total number of actors:
-- Yuck, a second round-trip!
SELECT COUNT(*)
FROM actor
If you’re still convinced OFFSET pagination is what you need, as opposed to keyset pagination, let’s look at how to calculate the above meta data with SQL.
What Metadata Do We Need?
The metadata we typically need to paginate using OFFSET are these:
TOTAL_ROWS: The total number of records if we hadn’t paginated
CURRENT_PAGE: The current page we’re on
MAX_PAGE_SIZE: The maximum page size
ACTUAL_PAGE_SIZE: The actual page size (when on the last page)
ROW: The actual offsets of the returned rows
LAST_PAGE: Whether we are on the last page
The maximum page size is something we set to the query, so it doesn’t have to be calculated. Everything else needs to be calculated. And here’s how to do that in a single query
SELECT
t.first_name,
t.last_name,
-- Calculate some meta data
COUNT(*) OVER () AS actual_page_size,
MAX(row) OVER () = total_rows AS last_page,
-- Metadata from the subquery
total_rows,
row,
((row - 1) / :max_page_size) + 1 AS current_page
FROM (
SELECT
u.*,
-- Calculate some meta data, repeating the ORDER BY from
-- the original query
COUNT(*) OVER () AS total_rows,
ROW_NUMBER () OVER (ORDER BY u.actor_id) AS row
-- Original query with all the predicates, joins, as a derived table
FROM (
SELECT *
FROM actor
) AS u
-- Ordering and pagination done here, where :offset is
-- The maximum row value of the previous page + 1
ORDER BY u.actor_id
OFFSET :offset ROWS
FETCH NEXT :max_page_size ROWS ONLY
) AS t
ORDER BY t.actor_id
-- 3.3
SELECT
t.first_name,
t.last_name,
-- 3.2
COUNT(*) OVER () AS actual_page_size,
MAX(row) OVER () = total_rows AS last_page,
-- 3.3
total_rows,
row,
((row - 1) / :max_page_size) + 1 AS current_page
-- 3.1
FROM (
-- 2.3
SELECT
u.*,
-- 2.2
COUNT(*) OVER () AS total_rows,
ROW_NUMBER () OVER (ORDER BY u.actor_id) AS row
-- 2.1
FROM (
-- 1.2
SELECT *
-- 1.1
FROM actor
) AS u
-- 2.4
ORDER BY u.actor_id
-- 2.5
OFFSET :offset ROWS
FETCH NEXT :max_page_size ROWS ONLY
) AS t
-- 3.4
ORDER BY t.actor_id
Step by step explanation
First off, the original query SELECT * FROM actor is wrapped as a derived table called u. You can do almost anything you want with this original query, applying only a few transformations:
1.1, 1.2, 2.1: You need to project (SELECT clause) the columns that your original query projected, plus the columns that you need for ORDER BY. Because I’m projecting the right things in the outermost query, and because there’s no DISTINCT clause in the original query, I conveniently projected *. Alternatively, I could have projected FIRST_NAME, LAST_NAME (because that is projected in the original query), and ACTOR_ID (because that’s what we ORDER BY).
2.2: On that derived table u, we’re now able to calculate some metadata, including TOTAL_ROWS as COUNT(*) OVER () and ROW as ROW_NUMBER () OVER (ORDER BY t.actor_id). The COUNT(*) OVER () window function has an empty window specification OVER (), meaning it calculates all the rows that result from the FROM, WHERE, GROUP BY, HAVING clauses, i.e. from u in our particular example. Without a second round-trip! The ROW_NUMBER () OVER (ORDER BY u.actor_id) orders all the rows in u by u.actor_id and assigns unique row numbers to them, according to that ordering.
2.3: The window functions are calculated implicitly because they’re located in the projection of this derived table. We’re also again going to conveniently project everything from u.*, because the outer-most query is the one that projects columns explicitly.
2.4: The original ordering has been moved here because there is no guarantee that the ordering would have been maintained if we had ordered the contents of u. But we need the ordering to calculate OFFSET .. FETCH right after
2.5: This is where we paginate. The OFFSET corresponds to the maximum ROW value that we’ve encountered before. We start at 0, and with a page size of 15, we use 15 on the next page. Remember that while indexes are 1 based in SQL, OFFSET is 0 based.
3.1: All of the above is wrapped again in a derived table, in order to make further calculations on it, namely:
3.2: We can again calculate COUNT(*) OVER (), calculating the total number of rows that result from the FROM, WHERE, GROUP BY, HAVING clauses, i.e. from t in our particular example. This time, the number of rows can be no more than MAX_PAGE_SIZE, because that’s what the FETCH (or LIMIT) clause inside of t says. But it can be less, too, so this is what we use to calculate the ACTUAL_PAGE_SIZE. Finally, we compare MAX(row) OVER () = total_rows to see if we’re on the last page, meaning the highest value for row in the current page resulting from t is compared to the total row count. Another way to calculate the LAST_PAGE value would be if ACTUAL_PAGE_SIZE < MAX_PAGE_SIZE, i.e. COUNT(*) OVER () < :MAX_PAGE_SIZE.
3.3: In addition to the usual projection of the original columns FIRST_NAME, LAST_NAME (we’re no longer projecting * now!), we’re doing some final calculations including dividing ROW / TOTAL_ROWS to get the page number. You can calculate more things, such as TOTAL_ROWS / MAX_PAGE_SIZE to get the TOTAL_PAGES value.
3.4: Finally, we have toORDER BY t.actor_id again, don’t let anyone tell you otherwise. In SQL, if you do not ORDER BY, then the ordering is undefined. Sure, it would be silly for an optimiser to re-order things without any good reason. We’ve already ordered the contents of our subqueries in 2.4, but there is no guarantee, that this ordering is stable. Just add DISTINCT, UNION, or a JOIN resulting in a hash join or some random other operator to your query, and the ordering breaks. So, always ORDER BY if ordering is important to you.
And we’re done!
How to do it in jOOQ?
This is the kind of use-case where jOOQ really really shines, because all of this is about dynamic SQL. Your actual business logic is contained in the deeply nested u table. Everything else is “presentation logic”, which is implemented in SQL for very obvious reasons: To improve performance.
And because you want to implement all of this only once in some library of yours, instead of having to play this game on every query, you make this kind of query dynamic. The utility will look like this:
select
t.ACTOR_ID,
t.FIRST_NAME,
t.LAST_NAME,
count(*) over () as actual_page_size,
(max(t.row) over () = t.total_rows) as last_page,
t.total_rows,
t.row,
((t.row / 15) + 1) as current_page
from (
select
u.*,
count(*) over () as total_rows,
row_number() over (order by u.ACTOR_ID) as row
from (
select
ACTOR.ACTOR_ID,
ACTOR.FIRST_NAME,
ACTOR.LAST_NAME
from ACTOR
) as u
order by u.ACTOR_ID
offset 30 rows
fetch next 15 rows only
) as t
order by t.ACTOR_ID
Notice that you can plug in arbitrary SQL fragments into that utility and paginate them. Irrespective of the complexity (including joins, other window functions, grouping, recursion, and what not), jOOQ will have you covered and will now paginate things for you.
jOOQ is all about dynamic SQL. There’s hardly any SQL feature left that jOOQ doesn’t support. This includes window functions, for example, but also making sure that your dynamic SQL works on a large number of SQL dialects, irrespective of the little syntactic details.
You can build your own libraries to construct re-usable SQL elements from other SQL building blocks as this article has shown, to dynamically create single-query OFFSET pagination meta data calculation, without performing additional database round trips.