Slow OFFSETIn order to understand the Seek Method, let’s first understand what problem it solves: SQL OFFSET clauses are slow. They’re slow for a simple reason. In order to reach a high offset from a result set, all previous records have to be skipped and counted. While a query with no
OFFSETcan be very fast (using MySQL syntax):
Skipping to page number 10’000 will be much slower:
SELECT first_name, last_name, score FROM players WHERE game_id = 42 ORDER BY score DESC LIMIT 10;
Even if the tuple
SELECT first_name, last_name, score FROM players WHERE game_id = 42 ORDER BY score DESC LIMIT 10 OFFSET 100000;
(game_id, score)is indexed, we’ll have to actually traverse the whole index in order to count how many records we’ve already skipped. While this problem can be somewhat lessened by a trick, joining
playersto a derived table, there is an alternative, much faster approach to tackling paging: the Seek Method.
The Seek MethodWhile it is not quite clear who originally invented the Seek Method (some also call it “keyset paging”), a very prominent advocate for it is Markus Winand. He describes the Seek Method on his blog (and in his book): http://use-the-index-luke.com/sql/partial-results/fetch-next-page Essentially, the Seek Method does not skip records before an OFFSET, but it skips records until the last record previously fetched. Think about paging on Google. From a usability point of view, you hardly ever skip exactly 100’000 records. You mostly want to skip to the next page and then again, to the next page, i.e. just past the last record / search result previously fetched. Take the following top 10 players (fake names generated with name generator):
first_name | last_name | score ------------------------------ Mary | Paige | 1098 Tracey | Howard | 1087 Jasmine | Butler | 1053 Zoe | Piper | 1002 Leonard | Peters | 983 Jonathan | Hart | 978 Adam | Morrison | 976 Amanda | Gibson | 967 Alison | Wright | 958 Jack | Harris | 949The above are the first 10 players ordered by score. This can be achieved quite quickly using
LIMIT 10only. Now, when skipping to the next page, you can either just use an
OFFSET 10clause, or you skip all users with a score higher than
This will then give you the players on the next page:
SELECT first_name, last_name, score FROM players WHERE game_id = 42 -- Let's call this the "seek predicate" AND score < 949 ORDER BY score DESC LIMIT 10;
first_name | last_name | score ------------------------------ William | Fraser | 947 Claire | King | 945 Jessica | McDonald | 932 ... | ... | ...Note that the previous query assumes that the
scoreis unique within the
playerstable, which is unlikely, of course. If William Fraser also had
949points, just as Jack Harris, the last player on the first page, he would be “lost between pages”. It is thus important to create a non-ambiguous ORDER BY clause and “seek predicate”, by adding an additional unique column:
Now, the “seek predicate” depends on the
SELECT player_id, first_name, last_name, score FROM players WHERE game_id = 42 -- assuming 15 is Jack Harris's player_id AND (score, player_id) < (949, 15) ORDER BY score DESC, player_id DESC LIMIT 10;
ORDER BYclause. Here are a couple of possible, alternative configurations:
If columns in the
-- "consistent" ASC and DESC correspond to > and < AND (score, player_id) > (949, 15) ORDER BY score ASC, player_id ASC -- "mixed" ASC and DESC complicate things a bit AND ((score < 949) OR (score = 949 AND player_id > 15)) ORDER BY score DESC, player_id ASC -- The above might be further performance-tweaked AND (score <= 949) AND ((score < 949) OR (score = 949 AND player_id > 15)) ORDER BY score DESC, player_id ASC
ORDER BYclause are nullable,
NULLS LASTmight apply and further complicate the “seek predicate”.
How is this better than OFFSET?The Seek Method allows for avoiding expensive “skip-and-count” operations, replacing them with a simple range scan on an index that might cover the “seek predicate”. Since you’re applying ORDER BY on the columns of the “seek predicate” anyway, you might have already chosen to index them appropriately. While the Seek Method doesn’t improve queries for low page numbers, fetching higher page numbers is significantly faster as proven in this nice benchmark: More interesting feedback on the subject can be found in this reddit.com thread, in which even Tom Kyte himself added a couple of remarks.
A side effect of the Seek MethodA side effect of the Seek Method is the fact that the paging is more “stable”. When you’re about to display page 2 and a new player has reached page 1 in the mean time, or if any player is removed entirely, you will still display the same players on page 2. In other words, when using the Seek Method, there is no guarantee that the first player on page 2 has rank 11. This may or may not be desired. It might be irrelevant on page 10’000, though.
jOOQ 3.3 support for the Seek MethodThe upcoming jOOQ 3.3 (due for late 2013) will include support for the Seek Method on a SQL DSL API level. In addition to jOOQ’s existing LIMIT .. OFFSET support, a “seek predicate” can then be specified through the synthetic
SEEKclause (similar to jOOQ’s synthetic
Instead of explictly phrasing the “seek predicate”, just pass the last record from the previous query, and jOOQ will see that all records before and including this record are skipped, given the
DSL.using(configuration) .select(PLAYERS.PLAYER_ID, PLAYERS.FIRST_NAME, PLAYERS.LAST_NAME, PLAYERS.SCORE) .from(PLAYERS) .where(PLAYERS.GAME_ID.eq(42)) .orderBy(PLAYERS.SCORE.desc(), PLAYERS.PLAYER_ID.asc()) .seek(949, 15) // (!) .limit(10) .fetch();
ORDER BYclause. This appears much more readable than the actual SQL rendered because the “seek predicate” is closer to the
ORDER BYclause where it belongs. Also, jOOQ’s usual row value typesafety is applied here helping you find the right degree / arity and data types for your
SEEKclause. In the above example, the following method calls would not compile in Java:
// Not enough arguments in seek() .orderBy(PLAYERS.SCORE.desc(), PLAYERS.PLAYER_ID.asc()) .seek(949) // Wrong argument types in seek() .orderBy(PLAYERS.SCORE.desc(), PLAYERS.PLAYER_ID.asc()) .seek(949, "abc")
Get to work with the Seek MethodWith native API support for a
SEEKclause, you can get in control of your SQL again and implement high-performing SQL quite easily. Early adopters can already play around with the current state of jOOQ’s 3.3.0 Open Source Edition, which is available on GitHub. And even if you don’t use jOOQ, give the Seek Method a try. You may just have a much faster application afterwards!