Let me explain…
It hasn’t been until the recent SQL:2008 standard that what MySQL users know asLIMIT .. OFFSET
was standardised into the following simple statement:
SELECT *
FROM BOOK
OFFSET 2 ROWS
FETCH NEXT 1 ROWS ONLY

LIMIT .. OFFSET
clause, which is why we chose that for the jOOQ DSL API
In SQL:
SELECT * FROM BOOK LIMIT 1 OFFSET 2
select().from(BOOK).limit(1).offset(2);
-- MySQL, H2, HSQLDB, Postgres, and SQLite
SELECT * FROM BOOK LIMIT 1 OFFSET 2
-- CUBRID supports a MySQL variant of the
-- LIMIT .. OFFSET clause
SELECT * FROM BOOK LIMIT 2, 1
-- Derby, SQL Server 2012, Oracle 12, SQL:2008
SELECT * FROM BOOK
OFFSET 2 ROWS FETCH NEXT 1 ROWS ONLY
-- Ingres. Eek, almost the standard. Almost!
SELECT * FROM BOOK
OFFSET 2 FETCH FIRST 1 ROWS ONLY
-- Firebird
SELECT * FROM BOOK ROWS 2 TO 3
-- Sybase SQL Anywhere
SELECT TOP 1 ROWS START AT 3 * FROM BOOK
-- DB2 (without OFFSET)
SELECT * FROM BOOK FETCH FIRST 1 ROWS ONLY
-- Sybase ASE, SQL Server 2008 (without OFFSET)
SELECT TOP 1 * FROM BOOK
TOP
clause before the SELECT
list. This is easy to emulate. Now what about:
- Oracle 11g and less
- SQL Server 2008 and less
- DB2 with OFFSET
OFFSET .. FETCH
in those older databases. The optimal solutions always involve:
- Using doubly-nested derived tables with
ROWNUM
filtering in Oracle - Using single-nested derived tabels with
ROW_NUMBER()
filtering in SQL Server and DB2

Do you think you will get it right?
;-) Let us go through a couple of issues that you may not have thought about. First off, Oracle. Oracle obviously wanted to create a maximum vendor-lockin, which is only exceeded by Apple’s recent introduction of Swift. This is whyROWNUM
solutions perform best, even better than SQL:2003 standard window function based solutions. Don’t believe it? Read this very interesting article on Oracle offset pagination performance.
So, the optimal solution in Oracle is:
-- PostgreSQL syntax:
SELECT ID, TITLE
FROM BOOK
LIMIT 1 OFFSET 2
-- Oracle equivalent:
SELECT *
FROM (
SELECT b.*, ROWNUM rn
FROM (
SELECT ID, TITLE
FROM BOOK
) b
WHERE ROWNUM <= 3 -- (1 + 2)
)
WHERE rn > 2
So that’s really the equivalent?
Of course not. You’re selecting an additional column, thern
column. You might just not care in most cases, but what if you wanted to make a limited subquery to be used with an IN
predicate?
-- PostgreSQL syntax:
SELECT *
FROM BOOK
WHERE AUTHOR_ID IN (
SELECT ID
FROM AUTHOR
LIMIT 1 OFFSET 2
)
-- Oracle equivalent:
SELECT *
FROM BOOK
WHERE AUTHOR_ID IN (
SELECT * -- Ouch. These are two columns!
FROM (
SELECT b.*, ROWNUM rn
FROM (
SELECT ID
FROM AUTHOR
) b
WHERE ROWNUM <= 3
)
WHERE rn > 2
)
LIMIT .. OFFSET
, then you might just patch the ID
column into the subquery:
SELECT *
FROM BOOK
WHERE AUTHOR_ID IN (
SELECT ID -- better
FROM (
SELECT b.ID, ROWNUM rn -- better
FROM (
SELECT ID
FROM AUTHOR
) b
WHERE ROWNUM <= 3
)
WHERE rn > 2
)
So now, it is correct?
Of course not! Because you can have ambiguous column names in top-levelSELECT
s, but not in nested selects. What if you want to do this:
-- PostgreSQL syntax:
-- Perfectly valid repetition of two ID columns
SELECT BOOK.ID, AUTHOR.ID
FROM BOOK
JOIN AUTHOR
ON BOOK.AUTHOR_ID = AUTHOR.ID
LIMIT 1 OFFSET 2
-- Oracle equivalent:
SELECT *
FROM (
SELECT b.*, ROWNUM rn
FROM (
-- Ouch! ORA-00918: column ambiguously defined
SELECT BOOK.ID, AUTHOR.ID
FROM BOOK
JOIN AUTHOR
ON BOOK.AUTHOR_ID = AUTHOR.ID
) b
WHERE ROWNUM <= 3
)
WHERE rn > 2
ID
instances. And renaming the columns to random values is nasty, because the user of your home-grown in-house database framework wants to receive well-defined column names. I.e. ID
and… ID
.
So, the solution is to rename the columns twice. Once in each derived table:
-- Oracle equivalent:
-- Rename synthetic column names back to original
SELECT c1 ID, c2 ID
FROM (
SELECT b.c1, b.c2, ROWNUM rn
FROM (
-- synthetic column names here
SELECT BOOK.ID c1, AUTHOR.ID c2
FROM BOOK
JOIN AUTHOR
ON BOOK.AUTHOR_ID = AUTHOR.ID
) b
WHERE ROWNUM <= 3
)
WHERE rn > 2
But now, we’re done?
Of course not! What if you doubly nest such a query? Will you think about doubly renamingID
columns to synthetic names, and back? … ;-) Let’s leave it here and talk about something entirely different:
Does the same thing work for SQL Server 2008?
Of course not! In SQL Server 2008, the most popular approach is to use window functions. Namely,ROW_NUMBER()
. So, let’s consider:
-- PostgreSQL syntax:
SELECT ID, TITLE
FROM BOOK
LIMIT 1 OFFSET 2
-- SQL Server equivalent:
SELECT b.*
FROM (
SELECT ID, TITLE,
ROW_NUMBER() OVER (ORDER BY ID) rn
FROM BOOK
) b
WHERE rn > 2 AND rn <= 3
So that’s it, right?
Of course not! ;-) OK, we’ve already had this issue. We should not select*
, because that would generate too many columns in the case that we’re using this as a subquery for an IN
predicate. So let’s consider the correct solution with synthetic column names:
-- SQL Server equivalent:
SELECT b.c1 ID, b.c2 TITLE
FROM (
SELECT ID c1, TITLE c2,
ROW_NUMBER() OVER (ORDER BY ID) rn
FROM BOOK
) b
WHERE rn > 2 AND rn <= 3
But now we got it, right?
Make an educated guess: Nope!
What happens, if you add an
ORDER BY
clause to the original query?
-- PostgreSQL syntax:
SELECT ID, TITLE
FROM BOOK
ORDER BY SOME_COLUMN
LIMIT 1 OFFSET 2
-- Naive SQL Server equivalent:
SELECT b.c1 ID, b.c2 TITLE
FROM (
SELECT ID c1, TITLE c2,
ROW_NUMBER() OVER (ORDER BY ID) rn
FROM BOOK
ORDER BY SOME_COLUMN
) b
WHERE rn > 2 AND rn <= 3
ORDER BY
clause, unless they also have a TOP
clause (or an OFFSET .. FETCH
clause in SQL Server 2012).
OK, we can probably tweak this using TOP 100 PERCENT
to make SQL Server happy.
-- Better SQL Server equivalent:
SELECT b.c1 ID, b.c2 TITLE
FROM (
SELECT TOP 100 PERCENT
ID c1, TITLE c2,
ROW_NUMBER() OVER (ORDER BY ID) rn
FROM BOOK
ORDER BY SOME_COLUMN
) b
WHERE rn > 2 AND rn <= 3
SOME_COLUMN
in the outer query, you’d have to again transform the SQL statement to add another synthetic column:
-- Better SQL Server equivalent:
SELECT b.c1 ID, b.c2 TITLE
FROM (
SELECT TOP 100 PERCENT
ID c1, TITLE c2,
SOME_COLUMN c99,
ROW_NUMBER() OVER (ORDER BY ID) rn
FROM BOOK
) b
WHERE rn > 2 AND rn <= 3
ORDER BY b.c99
This is the correct solution!
Of course not! What if the original query hadDISTINCT
in it?
-- PostgreSQL syntax:
SELECT DISTINCT AUTHOR_ID
FROM BOOK
LIMIT 1 OFFSET 2
-- Naive SQL Server equivalent:
SELECT b.c1 AUTHOR_ID
FROM (
SELECT DISTINCT AUTHOR_ID c1,
ROW_NUMBER() OVER (ORDER BY AUTHOR_ID) rn
FROM BOOK
) b
WHERE rn > 2 AND rn <= 3
DISTINCT
keyword should remove such duplicates, and effectively, the PostgreSQL query will correctly remove duplicates first, and then apply LIMIT
and OFFSET
.
However, the ROW_NUMBER()
predicate always generates distinct row numbers before DISTINCT
can remove them again. In other words, DISTINCT
has no effect.
Luckily, we can tweak this SQL again, using this neat little trick:
-- Better SQL Server equivalent:
SELECT b.c1 AUTHOR_ID
FROM (
SELECT DISTINCT AUTHOR_ID c1,
DENSE_RANK() OVER (ORDER BY AUTHOR_ID) rn
FROM BOOK
) b
WHERE rn > 2 AND rn <= 3
ORDER BY
clause must contain all columns from the SELECT
field list. Obviously, this will limit the acceptable columns in the SELECT DISTINCT
field list to columns that are allowed in a window function’s ORDER BY
clause (e.g. no other window functions).
We could of course try to fix that as well using common table expressions, or we consider
Yet another issue??
Yes, of course! Do you even know what the column(s) in the window function’sORDER BY
clause should be? Have you just picked any column, at random? What if that column doesn’t have an index on it, will your window function still perform?
The answer is easy when your original SELECT
statement also has an ORDER BY
clause, then you should probably take that one (plus all the columns from the SELECT DISTINCT
clause if applicable).
But what if you don’t have any ORDER BY
clause?
Yet another trick! Use a “constant” variable:
-- Better SQL Server equivalent:
SELECT b.c1 AUTHOR_ID
FROM (
SELECT AUTHOR_ID c1,
ROW_NUMBER() OVER (ORDER BY @@version) rn
FROM BOOK
) b
WHERE rn > 2 AND rn <= 3
ORDER BY
clauses, in SQL Server. Painful, I know.
Read more about this @@version trick here.
Are we done yet!?!?
Probably not ;-) But we have probably covered around 99% of the common and edge cases. We can sleep nicely, now. Note that all of these SQL transformations are implemented in jOOQ. jOOQ is the only SQL abstraction framework that takes SQL seriously (with all its warts and caveats), standardising over all of this madness. As mentioned in the beginning, with jOOQ, you just write:
// Don't worry about general emulation
select().from(BOOK).limit(1).offset(2);
// Don't worry about duplicate column names
// in subselects
select(BOOK.ID, AUTHOR.ID)
.from(BOOK)
.join(AUTHOR)
.on(BOOK.AUTHOR_ID.eq(AUTHOR.ID))
.limit(1).offset(2);
// Don't worry about invalid IN predicates
select()
.from(BOOK)
.where(BOOK.AUTHOR_ID).in(
select(AUTHOR.ID)
.from(AUTHOR)
.limit(1).offset(2)
);
// Don't worry about the ROW_NUMBER() vs.
// DENSE_RANK() distinction
selectDistinct(AUTHOR_ID)
.from(BOOK).limit(1).offset(2);

Keyset paging
Now, of course, if you have been reading our blog, or our partner blog SQL Performance Explained, you should know by now thatOFFSET
pagination is often a bad choice in the first place. You should know that keyset pagination almost always outperforms OFFSET
pagination.
Read about how jOOQ natively supports keyset pagination using the SEEK clause, here.