Why Most Programmers Get Pagination Wrong

Pagination is one of those things that almost everyone gets wrong for two reasons:

  • User experience
  • Database performance

Here’s why.

What’s wrong with pagination?

Most applications blindly produce pagination like this:

[Image: GMail-style pagination controls]

This is how GMail implements pagination. With my current settings, it displays 100 E-Mails at a time and also shows how many E-Mails there are in total, namely 1094. That isn’t the total number of E-Mails I’ve ever had; it’s the total number of E-Mails in my “blog-idea” label (I’m using GMail as a TODO list, and yes, this blog won’t run out of articles any time soon).

What’s wrong with this practice?

Bad user experience

As a user, in most cases, I don’t really care about the total number of objects in my result set. At least, I don’t care about the exact number. Does it make any difference if I have 1094 or 1093 E-Mails? What about 1067? Or 1000? 1000 would be precise enough as far as I’m concerned.

Also, as a user, in most cases, I don’t care that I’m on page 317 of my paginated screen that displays rows 3170-3179 (assuming 10 rows per page). I really don’t. The page number is absolutely useless in terms of user experience.

Who got it right?

  • Facebook
  • Twitter
  • Reddit

And all the other websites that do timelines. Yes, I want to display only 10 rows at a time (or perhaps 100, but almost never all). So, pagination is important. But I don’t care about the fact that I’ve clicked 317 times on that “next page” button. If I ever browse that many pages (and I hardly ever do), then the only thing that matters is the next 10 rows. Just like when you play Civilization. You don’t care that you’re in turn 317. You just want to play one more turn:

[Image: Civilization’s “one more turn” screen]

Moreover, I never ever ever want to jump to page 317 right from the beginning. There’s absolutely no use case out there where I search for something and then say, hey, I believe my search result will be item #3175 in the current sort order. Exactly. Instead, I will do any of these:

  • Refine the search
  • Sort the result

In both cases, I will get a result where the record that I’m looking for is much more likely to appear on page #1 or perhaps #2. Again, when was the last time you googled for SQL and then went to page #18375 to find that particular blog post that you were looking for? No. You searched for “Java 8 SQL” to find jOOQ, the best way to write SQL in Java 8. For instance.

How to implement a timeline with SQL

If your data source is a SQL database, you might have implemented pagination by using LIMIT .. OFFSET, or OFFSET .. FETCH or some ROWNUM / ROW_NUMBER() filtering (see the jOOQ manual for some syntax comparisons across RDBMS). OFFSET is the right tool to jump to page 317, but remember, no one really wants to jump to that page, and besides, OFFSET just skips a fixed number of rows. If there are new rows in the system between the time page number 316 is displayed to a user and when the user skips to page number 317, the rows will shift, because the offsets will shift. No one wants that either, when they click on “next”.

Instead, you should be using what we refer to as “keyset pagination” (as opposed to “offset pagination”), a technique we’ve described in past articles on this blog.

The SQL syntax is a bit cumbersome as the pagination criteria becomes an ordinary predicate, but if you’re using jOOQ, you can use the simple synthetic SQL SEEK clause as such:

DSL.using(configuration)
   .select(PLAYERS.PLAYER_ID,
           PLAYERS.FIRST_NAME,
           PLAYERS.LAST_NAME,
           PLAYERS.SCORE)
   .from(PLAYERS)
   .where(PLAYERS.GAME_ID.eq(42))
   .orderBy(PLAYERS.SCORE.desc(),
            PLAYERS.PLAYER_ID.asc())
   .seek(949, 15)
   .limit(10)
   .fetch();

The above will fetch the next 10 players after the player with SCORE 949 and ID 15. The pagination really depends on the ORDER BY clause, which is why you have to pass as many values to the SEEK clause as there are columns in the ORDER BY clause.
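
To fetch the page after that, feed the last row’s ORDER BY values back into the SEEK clause; under the hood, jOOQ renders an ordinary keyset predicate roughly like SCORE < 949 OR (SCORE = 949 AND PLAYER_ID > 15). Here’s a minimal sketch of that idea as a hypothetical helper (assuming the same generated PLAYERS table and integer column types, which are not spelled out in the original example):

// Hypothetical helper, not part of the original example: passing the
// SCORE and PLAYER_ID of the last row of the current page returns the
// following page in the same ordering.
static Result<?> nextPage(Configuration configuration, int lastScore, int lastPlayerId) {
    return DSL.using(configuration)
              .select(PLAYERS.PLAYER_ID,
                      PLAYERS.FIRST_NAME,
                      PLAYERS.LAST_NAME,
                      PLAYERS.SCORE)
              .from(PLAYERS)
              .where(PLAYERS.GAME_ID.eq(42))
              .orderBy(PLAYERS.SCORE.desc(),
                       PLAYERS.PLAYER_ID.asc())
              // One seek value per ORDER BY column, in the same order
              .seek(lastScore, lastPlayerId)
              .limit(10)
              .fetch();
}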

Now that we’ve fixed the user experience, let’s also look at …

How OFFSET pagination is bad for performance

The previously linked articles about keyset pagination also mention the poor performance of OFFSET pagination. Which is kind of obvious as OFFSET has to skip a given number of rows after applying all predicates and sorting, etc. So the database has to do all the work and then throw away 3170 records (if you’re jumping to page 317 with 10 rows per page). What a waste.

The following diagram shows very nicely how OFFSET gets slower and slower for large offsets:

[Image: chart showing how OFFSET gets slower for larger offsets. Reproduced from use-the-index-luke.com with permission by Markus Winand]

That’s the obvious problem, but there’s another one. People always count the total number of rows to calculate the total number of possible pages. Why? To display nonsense like the following:

Page number:
1 2 3 ... 315 316 317 318 319 ... 50193 50194

Wow. OK so we’re on page number 317, which we don’t really care about in the first place, but we could just as well jump to page number 50194. This means that the database needed to run the query across all the rows just to be sure we get exactly 50194 pages in total.

Google something like “page number pagination” and observe the number of tutorials that show how you can implement the above nonsense. On Google Image search, you’ll find:

[Image: Google Image search results full of page-number pagination tutorials]

At the same time, the Google search itself reveals:

[Image: Google’s own search result pagination]

As you can see, Google estimates that there are probably at least 10 pages for your search and you can go “next”. Yes, you can skip some pages, but you cannot skip to page number 50194, because, again:

  • No one wants that
  • It’s costly to predict, even for Google

In fact, Google search implements keyset pagination as well, just like Twitter, Facebook, Reddit. And they don’t display the total number of pages because counting that total can be very costly, depending on your database.

In particular, databases that do not support window functions will require you to run two separate queries (sketched below):

  1. The actual query with a LIMIT clause
  2. An additional query replacing the SELECT column list with a simple COUNT(*)
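
For illustration, here’s what that two-query approach might look like in jOOQ. This is only a sketch: the RENTAL table and its fields refer to generated classes for the Sakila database used in the next example, and the exact field types are assumptions.

// First round trip: the actual page
Result<?> page = DSL.using(configuration)
    .select(RENTAL.RENTAL_DATE, RENTAL.INVENTORY_ID)
    .from(RENTAL)
    .where(RENTAL.CUSTOMER_ID.eq(1))
    .orderBy(RENTAL.RENTAL_DATE)
    .limit(10)
    .fetch();

// Second round trip: the same predicate, just to count the rows
int total = DSL.using(configuration)
    .fetchCount(DSL.selectFrom(RENTAL)
                   .where(RENTAL.CUSTOMER_ID.eq(1)));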

Needless to say, this is not the best approach. If your database supports window functions (read about that miraculous SQL feature here on the jOOQ blog), you could produce the total row count in one go as such:

SELECT 
  rental_date, 
  inventory_id,
  COUNT(*) OVER()
FROM rental
WHERE customer_id = 1
ORDER BY rental_date
LIMIT 10

That COUNT(*) OVER() window function is like an ordinary aggregate function, except that it doesn’t group your results. It just counts all the rows of your result and produces that count in each row, prior to limiting the result to 10.

When run against the Sakila database, the above produces:

rental_date          inventory_id  count
2005-05-25 11:30:37          3021     32
2005-05-28 10:35:23          4020     32
2005-06-15 00:54:12          2785     32
2005-06-15 18:02:53          1021     32
2005-06-15 21:08:46          1407     32
2005-06-16 15:18:57           726     32
2005-06-18 08:41:48           197     32
2005-06-18 13:33:59          3497     32
2005-06-21 06:24:45          4566     32
2005-07-08 03:17:05          1443     32

So, we’re displaying the first page with 10 rows and we need to provide navigational links for a total of 4 pages because we have a total of 32 rows.
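
If you’re using jOOQ, the same single-query approach might look like this (a sketch, assuming generated Sakila classes with RENTAL_DATE, INVENTORY_ID and CUSTOMER_ID fields, and the usual org.jooq imports):

DSL.using(configuration)
   .select(RENTAL.RENTAL_DATE,
           RENTAL.INVENTORY_ID,
           DSL.count().over())      // total row count, computed before the limit
   .from(RENTAL)
   .where(RENTAL.CUSTOMER_ID.eq(1))
   .orderBy(RENTAL.RENTAL_DATE)
   .limit(10)
   .fetch();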

What happens when we benchmark this query on PostgreSQL? The first run doesn’t calculate this COUNT(*) OVER() value, whereas the second one does:

DO $$
DECLARE
  v_ts TIMESTAMP;
  v_repeat CONSTANT INT := 10000;
  rec RECORD;
BEGIN
  v_ts := clock_timestamp();

  FOR i IN 1..v_repeat LOOP
    FOR rec IN (
      SELECT 
        rental_date, 
        inventory_id
      FROM rental
      WHERE customer_id = 1
      ORDER BY rental_date
      LIMIT 10
    ) LOOP
      NULL;
    END LOOP;
  END LOOP;

  RAISE INFO 'Statement 1: %', (clock_timestamp() - v_ts); 
  v_ts := clock_timestamp();

  FOR i IN 1..v_repeat LOOP
    FOR rec IN (
      SELECT 
        rental_date, 
        inventory_id,
        COUNT(*) OVER()
      FROM rental
      WHERE customer_id = 1
      ORDER BY rental_date
      LIMIT 10
    ) LOOP
      NULL;
    END LOOP;
  END LOOP;

  RAISE INFO 'Statement 2: %', (clock_timestamp() - v_ts); 
END$$;

The result clearly indicates that in PostgreSQL, there’s a significant overhead in calculating this value:

INFO:  Statement 1: 00:00:01.041823
INFO:  Statement 2: 00:00:03.57145

Oracle optimises things a bit better when you’re using ROWNUM to paginate:

SET SERVEROUTPUT ON
DECLARE
  v_ts TIMESTAMP;
  v_repeat CONSTANT NUMBER := 5000;
BEGIN
  v_ts := SYSTIMESTAMP;
     
  FOR i IN 1..v_repeat LOOP
    FOR rec IN (
      SELECT 
        rental_date, 
        inventory_id
      FROM (
        SELECT 
          rental.*, 
          ROWNUM rn
        FROM rental
        WHERE customer_id = 1
        ORDER BY rental_date
      ) rental
      WHERE rn < 5
      ORDER BY rn
    ) LOOP
      NULL;
    END LOOP;
  END LOOP;
     
  dbms_output.put_line('Statement 1: ' || (SYSTIMESTAMP - v_ts));
  v_ts := SYSTIMESTAMP;
     
  FOR i IN 1..v_repeat LOOP
    FOR rec IN (
      SELECT 
        rental_date, 
        inventory_id,
        COUNT(*) OVER()
      FROM (
        SELECT 
          rental.*,  
          ROWNUM rn
        FROM rental
        WHERE customer_id = 1
        ORDER BY rental_date
      ) rental
      WHERE rn < 5
      ORDER BY rn
    ) LOOP
      NULL;
    END LOOP;
  END LOOP;
     
  dbms_output.put_line('Statement 2: ' || (SYSTIMESTAMP - v_ts));
END;
/

Result:

Statement 1: X
Statement 2: X +/- 1%

So, the COUNT(*) seems to be calculated “for free”. Bonus question: Why is that?

Due to Oracle licensing restrictions, we cannot publish benchmark results comparing Oracle with PostgreSQL here, but you can run the above code yourself against the Sakila database:
https://www.jooq.org/sakila

Conclusion

TL;DR: OFFSET pagination bad. Keyset pagination good.

[Image: “no OFFSET” banner]

If you want to paginate in your application, please ask yourself whether you really, really, really need:

  • Exact page number
  • High page numbers
  • The last page number
  • The total number of rows

Because if you don’t (and in 98% of all UIs, you really don’t), then you can drastically speed up your queries while providing your users a much better experience. If that’s not a win-win situation worth thinking about…?

And don’t forget, jOOQ ships with native keyset pagination support!

Using Stored Procedures With JPA, JDBC… Meh, Just Use jOOQ

The current edition of Java Magazine has an article about Big Data Best Practices for JDBC and JPA by Josh Juneau:
http://www.javamagazine.mozaicreader.com/MayJune2016

The article shows how to use a stored procedure with JDBC (notice how resources aren’t closed, unfortunately; this is commonly forgotten, even in Java Magazine articles):

// Using JDBC to call upon a database stored
// procedure
CallableStatement cs = null;
try {
    cs = conn.prepareCall("{call DUMMY_PROC(?,?)}");
    cs.setString(1, "This is a test");
    cs.registerOutParameter(2, Types.VARCHAR);
    cs.executeQuery();

    // Do something with result
    String returnStr = cs.getString(2);
} catch (SQLException ex){
    ex.printStackTrace();
}

And with JPA:

// Utilize JPA to call a database stored procedure
// Add @NamedStoredProcedureQuery to entity class
@NamedStoredProcedureQuery(
    name="createEmp", procedureName="CREATE_EMP",
    parameters = {
        @StoredProcedureParameter(
            mode= ParameterMode.IN,
            type=String.class,
            name="first"),
        @StoredProcedureParameter(
            mode = ParameterMode.IN,
            type=String.class,
            name="last")
    })

// Calling upon stored procedure
StoredProcedureQuery qry =
    em.createStoredProcedureQuery("createEmp");
qry.setParameter("first", "JOSH");
qry.setParameter("last","JUNEAU");
qry.execute();

Specifically the latter was also recently discussed in blog posts by Vlad Mihalcea and Thorben Janssen.

Do you like verbosity and complexity?

No? Neither do we. This is why we give you a third option instead: Just use jOOQ. Here’s the equivalent jOOQ code:

// jOOQ equivalent of the JDBC example:
String returnStr = Routines.dummyProc(
    config, "This is a test");

// jOOQ equivalent of the JPA example
Routines.createEmp(config, "JOSH", "JUNEAU");

Yes! That’s it. Don’t waste time manually configuring your bind variables with JDBC API calls, or JPA annotations. No one likes writing annotations for stored procedures. With jOOQ and jOOQ’s code generator, procedure calls are:

  • A one-liner
  • A no-brainer
  • A way to bring back the fun to stored procedures

Learn more about using Oracle stored procedures with nested collections and object types here:
https://blog.jooq.org/2014/11/04/painless-access-from-java-to-plsql-procedures-with-jooq

Using jOOQ’s ExecuteListener to Prevent Write Operations on a Connection

Security is important, especially on the data access layer. Most commercial databases allow for fine-grained privilege control using database access grants. For instance, you would restrict a user’s access to a certain set of tables (or even better: views) via GRANT statements:

GRANT SELECT ON table TO user;

With this fine-grained access control, write operations on certain database objects can be prevented directly in the database.

What if that’s not possible?

Not all databases ship with sophisticated access privilege implementations, or perhaps your application cannot benefit from those features for operational reasons. In that case, you should at least be able to implement security on the client, e.g. by using jOOQ’s ExecuteListener (for coarse-grained access control) or jOOQ’s VisitListener (for fine-grained access control).

An example using an ExecuteListener might look like this:

class ReadOnlyListener extends DefaultExecuteListener {
    @Override
    public void executeStart(ExecuteContext ctx) {
        // ctx.type() is jOOQ's ExecuteType; READ is statically imported here
        if (ctx.type() != READ)
            throw new DataAccessException("No privilege to execute " + ctx.sql());
    }
}

If you hook this listener into your jOOQ Configuration, you will no longer be able to execute any write operations on that Configuration. It’s that easy!
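
For illustration, here’s a minimal sketch of how such a listener might be registered (the connection variable and the H2 dialect are assumptions, not part of the original example):

// Register the listener with a jOOQ Configuration
Configuration configuration = new DefaultConfiguration()
    .set(connection)
    .set(SQLDialect.H2)
    .set(new DefaultExecuteListenerProvider(new ReadOnlyListener()));

// All statements executed through this context now pass through
// ReadOnlyListener.executeStart(), so write operations are rejected
DSLContext readOnly = DSL.using(configuration);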

For more fine-grained control (e.g. a per-table ACL), a VisitListener will do the trick. A (very much simplified) example implementation that shows what can be done can be seen here:

static class ACLListener extends DefaultVisitListener {

    @Override
    public void visitStart(VisitContext context) {
        if (context.queryPart() instanceof Table
                && Arrays.asList(context.clauses()).contains(INSERT_INSERT_INTO)
                && ((Table<?>) context.queryPart()).getName().equals("AUTHOR"))
            throw new DataAccessException("No privilege to insert into AUTHOR");
    }
}

Essentially, this check prevents a client session from running insert statements into the AUTHOR table. A future version of jOOQ will ship with this kind of ACL VisitListener out of the box, when https://github.com/jOOQ/jOOQ/issues/5197 is implemented.

A Subtle AutoCloseable Contract Change Between Java 7 and Java 8

A nice feature of the Java 7 try-with-resources statement and the AutoCloseable type that was introduced to work with this statement is the fact that static code analysis tools can detect resource leaks. For instance, Eclipse:

[Image: Eclipse compiler settings for potential resource leak detection]

When you have the above configuration and you try running the following program, you’ll get three warnings:

public static void main(String[] args) 
throws Exception {
    Connection c = DriverManager.getConnection(
         "jdbc:h2:~/test", "sa", "");
    Statement s = c.createStatement();
    ResultSet r = s.executeQuery("SELECT 1 + 1");
    r.next();
    System.out.println(r.getInt(1));
}

The output is, trivially:

2

The warnings are issued on all of c, s, r. A quick fix (don’t do this!) is to suppress the warning using an Eclipse-specific SuppressWarnings parameter:

@SuppressWarnings("resource")
public static void main(String[] args) 
throws Exception {
    ...
}

After all, WeKnowWhatWeReDoing™ and this is just a simple example, right?

Wrong!

The right way to fix this, even for simple examples (at least since Java 7), is to use the effortless try-with-resources statement.

public static void main(String[] args) 
throws Exception {
    try (Connection c = DriverManager.getConnection(
             "jdbc:h2:~/test", "sa", "");
         Statement s = c.createStatement();
         ResultSet r = s.executeQuery("SELECT 1 + 1")) {

        r.next();
        System.out.println(r.getInt(1));
    }
}

In fact, it would be great if Eclipse could auto-fix this warning and wrap all the individual statements in a try-with-resources statement. Upvote this feature request, please!

Great, we know this. What’s the deal with Java 8?

In Java 8, the contract on AutoCloseable has changed very subtly (or bluntly, depending on your point of view).

Java 7 version

A resource that must be closed when it is no longer needed.

Note the word "must".

Java 8 version

An object that may hold resources (such as file or socket handles) until it is closed. The close() method of an AutoCloseable object is called automatically when exiting a try-with-resources block for which the object has been declared in the resource specification header. This construction ensures prompt release, avoiding resource exhaustion exceptions and errors that may otherwise occur.

API Note:

It is possible, and in fact common, for a base class to implement AutoCloseable even though not all of its subclasses or instances will hold releasable resources. For code that must operate in complete generality, or when it is known that the AutoCloseable instance requires resource release, it is recommended to use try-with-resources constructions. However, when using facilities such as Stream that support both I/O-based and non-I/O-based forms, try-with-resources blocks are in general unnecessary when using non-I/O-based forms.

In short, from Java 8 onwards, AutoCloseable is more of a hint saying that you might be using a resource that needs to be closed, but this isn’t necessarily the case.

This is similar to the Iterable contract, which doesn’t say whether you can iterate only once, or several times over the Iterable, but it imposes a contract that is required for the foreach loop.
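
A JDK example of such an Iterable is java.nio.file.DirectoryStream, which supports only a single iteration and also happens to be an actual resource. A minimal sketch (the directory path is arbitrary):

try (DirectoryStream<Path> dir =
        Files.newDirectoryStream(Paths.get("."))) {

    // The Iterable contract is enough for the foreach loop,
    // but this particular Iterable can be traversed only once
    for (Path path : dir)
        System.out.println(path);
}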

When do we have “optionally closeable” resources?

Take jOOQ for instance. Unlike in JDBC, a jOOQ Query (which was made AutoCloseable in jOOQ 3.7) may or may not represent a resource, depending on how you execute it. By default, it is not a resource:

try (Connection c = DriverManager.getConnection(
        "jdbc:h2:~/test", "sa", "")) {

    // No new resources created here:
    ResultQuery<Record> query =
        DSL.using(c).resultQuery("SELECT 1 + 1");

    // Resources created and closed immediately
    System.out.println(query.fetch());
}

The output is again:

+----+
|   2|
+----+
|   2|
+----+

But now, we again have an Eclipse warning on the query variable, saying that there is a resource that needs to be closed, even though, by using jOOQ this way, we know that this isn’t true. The only resource in the above code is the JDBC Connection, and it is properly handled. The jOOQ-internal PreparedStatement and ResultSet are completely handled and eagerly closed by jOOQ.

Then, why implement AutoCloseable in the first place?

jOOQ inverts JDBC’s default behaviour.

  • In JDBC, everything is done lazily by default, and resources have to be closed explicitly.
  • In jOOQ, everything is done eagerly by default, and optionally, resources can be kept alive explicitly.

For instance, the following code will keep an open PreparedStatement and ResultSet:

try (Connection c = DriverManager.getConnection(
        "jdbc:h2:~/test", "sa", "");

     // We "keep" the statement open in the ResultQuery
     ResultQuery<Record> query =
         DSL.using(c)
            .resultQuery("SELECT 1 + 1")
            .keepStatement(true)) {

    // We keep the ResultSet open in the Cursor
    try (Cursor<Record> cursor = query.fetchLazy()) {
        System.out.println(cursor.fetchOne());
    }
}

With this version, we no longer have any warnings in Eclipse, but the above version is really the exception when using the jOOQ API.

The same thing is true for Java 8’s Stream API. Interestingly, Eclipse doesn’t issue any warnings here:

Stream<Integer> stream = Arrays.asList(1, 2, 3).stream();
stream.forEach(System.out::println);
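
By contrast, an I/O-backed stream does hold a resource and should be closed, e.g. with try-with-resources. A minimal sketch (the file name is arbitrary):

try (Stream<String> lines = Files.lines(Paths.get("pom.xml"))) {
    lines.forEach(System.out::println);
}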

Conclusion

Resource leak detection seems to be a nice IDE / compiler feature at first. But avoiding false positives is hard. Specifically, because Java 8 changed contracts on AutoCloseable, implementors are allowed to implement the AutoCloseable contract for mere convenience, not as a clear indicator of a resource being present that MUST be closed.

This makes it very hard, if not impossible, for an IDE to detect resource leaks of third-party APIs (non-JDK APIs), where these contracts aren’t generally well known. The solution is, as so often with static code analysis tools, to simply turn off potential resource leak detection:

[Image: Eclipse settings with potential resource leak detection turned off]

For more insight, see also this Stack Overflow answer by Stuart Marks, linking to the EG’s discussions on lambda-dev.

Integrating jOOQ with Grails Featuring the UWS-jOOQ Plugin

Introduction

Grails is a web framework aimed at boosting development productivity. One of its main features is domain-centric database schema generation. Applications built with Grails can update the existing schema just before they start. To do this, Grails uses built-in domain mappers or, in more advanced cases, migrations. The goal of the UWS-jOOQ Grails-plugin is to integrate jOOQ into the existing Grails lifecycle in order to leverage the features of jOOQ without compromising the ones provided by Grails.

This article is part of a series brought to you by the Germany based jOOQ integration partner UWS Software Service (UWS). UWS is specialised in custom software development, application modernisation and outsourcing with a distinct focus on the Java Enterprise ecosystem.

Why should I use jOOQ with Grails?

In enterprise applications we often face issues with Hibernate performance, lack of support for some statements, or just too much hassle caused by the Hibernate model. Hibernate’s query language HQL is often not sophisticated enough to cope with some requirements. This forces us to use plain SQL, which is not bad in itself and helps solve specific business problems. However, in big projects where a larger group of people is involved and a product continuously evolves, type safety is very precious, but it is thrown away when using plain SQL. That’s the moment where the jOOQ framework excels and the UWS-jOOQ Grails-Plugin comes into the game.

How can I Integrate jOOQ into Grails?

We tried to provide a simple integration of jOOQ into Grails using Grails’ built-in dependency resolution. Just add the following line to the plugins section of your BuildConfig.groovy:

compile ':uws-jooq:0.1.1'

Add plugin configuration to your Config.groovy:

jooq.dataSource = ['dataSource']
jooq.xmlConfigDir = "src/resources/jooq/"
jooq.generatedClassOutputDirectory = 'src/java'
jooq.generatedClassPackageName.dataSource = 'ie.uws.example'

As this example shows, the plugin allows you to configure the datasources to be used and also some of the key paths. You may also want to take a look at a customized version of the Config.groovy in our sample integration project.

Next, jOOQ needs an XML configuration file, which can be generated by the plugin using the following command. The plugin will use your existing datasources and their configuration to generate the jOOQ configuration:

grails jooq-generate-config

Now all the configuration is ready and it is time to get to one of the main features of jOOQ: type-safe SQL. With the following command, jOOQ will generate the Java classes which you will use when writing SQL, as they give you compiler-based autocompletion:

grails jooq-init

Now that everything is in place, let’s say you would like to insert a new record into your database via jOOQ in one of your Controllers. It’s as simple as that:

class ExampleController {
  JooqService jooqService

  def insert() {
    DSLContext dsl = jooqService.dataSource	
    BookRecord record = dsl.newRecord(Tables.BOOK)
    record.author = "John"
    record.name = "Uws"
    record.version = 1
    record.store()
  }
}
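
Querying works just as directly. Here’s a hedged sketch inside the same controller, reusing the generated Tables.BOOK and BookRecord from above (the AUTHOR and NAME columns are assumptions based on the insert example):

  def listByAuthor() {
    DSLContext dsl = jooqService.dataSource
    // Type-safe select against the generated BOOK table
    Result<BookRecord> books = dsl
        .selectFrom(Tables.BOOK)
        .where(Tables.BOOK.AUTHOR.eq("John"))
        .orderBy(Tables.BOOK.NAME)
        .fetch()
    // books is now ready to render or return
  }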

How does the integration of jOOQ with Grails work under the hood?

In the example above you noticed the JooqService which is dependency-injected by Grails. The JooqService is your entry point when it comes to using jOOQ within Grails, as it is able to pick your datasource and provide the jOOQ DSL context for you. When you have multiple datasources, it also allows you to select a different datasource just by providing its name:

DSLContext dsl = jooqService.dataSource_custom

Note that autocompletion won’t tell you about the existence of a dataSource_custom field but JooqService will handle that for you.

In version 0.1 we added the JooqService, but the jOOQ classes still have to be generated from your database schema. So it is important to execute the jooq-init command every time you change your Grails domain model, since this command compiles your code and executes all migrations, so that the latest Java classes are generated from the latest database schema. Thanks to this approach it is possible to generate structures even from an in-memory H2 database which will no longer be available after the application has shut down.

Best-practices for using jOOQ with Grails

Integrate legacy databases

You might face the situation where you have to connect to some legacy database using the Grails framework. It’s doable, for sure, but you have to create the right Hibernate mapping first or – with some luck – let Hibernate generate the right one for you. With this plugin you can just let jOOQ generate its Java-classes and you’re ready to communicate with the database using a fully type-safe DSL.

Let a database schema change break your code

It’s one of the most precious gifts when you know that something will break early. With jOOQ in Grails, it will happen at compile time. When you execute jooq-init, your application is compiled and the Java classes are regenerated using the latest database schema. If the generated classes change, you will be notified that your code no longer compiles. You can fix your SQL statements and ensure that your application won’t break at runtime.

Keep generated classes in your version control system

We recommend checking jOOQ’s generated classes into your VCS along with the rest of your application’s source code. When you are using jOOQ classes to communicate with the database, those classes must already be present for compilation. Do you use a different approach? Please let us know!

Roadmap

We’re planning to simplify the integration even more and attach jooq-init to the regular Grails compilation process. We’d also like to make our plugin harder to misuse (see the JooqService section) and add the possibility to use jOOQ not only in services or controllers but also in plain Java classes.

Contribution to UWS-jOOQ Grails-Plugin

This software is distributed under the Apache License, Version 2.0. We want to keep this software free and provide services for people who integrate jOOQ and Grails. If you’re interested in this project, feel free to submit issues or pull requests to the project’s git repository.

Further reading

The following links provide additional information about the UWS-jOOQ Grails-Plugin:

General information about UWS or jOOQ can be found here:

Painless Access from Java to PL/SQL Procedures with jOOQ

A procedural language combined with SQL can do miracles in terms of productivity, performance and expressiveness.

In this article, we’ll see later on how we can achieve the same with SQL (and PL/SQL) in Java, using jOOQ.

But first, a little bit of history…

Accessing PL/SQL from Java

One of the biggest reasons why Java developers in particular refrain from writing their own PL/SQL code is that the interface between PL/SQL and Java – ojdbc – is a major pain. We’ll see in the following examples why that is.

Assume we’re working on an Oracle-port of the popular Sakila database (originally created for MySQL). This particular Sakila/Oracle port was implemented by DB Software Laboratory and published under the BSD license.

Here’s a partial view of that Sakila database.

[Image: Sakila film-actor-category ERD]

ERD created with vertabelo.com. Learn how to use Vertabelo with jOOQ.

Now, let’s assume that we have an API in the database that doesn’t expose the above schema, but exposes a PL/SQL API instead. The API might look something like this:

CREATE TYPE LANGUAGE_T AS OBJECT (
  language_id SMALLINT,
  name CHAR(20),
  last_update DATE
);
/

CREATE TYPE LANGUAGES_T AS TABLE OF LANGUAGE_T;
/

CREATE TYPE FILM_T AS OBJECT (
  film_id int,
  title VARCHAR(255),
  description CLOB,
  release_year VARCHAR(4),
  language LANGUAGE_T,
  original_language LANGUAGE_T,
  rental_duration SMALLINT,
  rental_rate DECIMAL(4,2),
  length SMALLINT,
  replacement_cost DECIMAL(5,2),
  rating VARCHAR(10),
  special_features VARCHAR(100),
  last_update DATE
);
/

CREATE TYPE FILMS_T AS TABLE OF FILM_T;
/

CREATE TYPE ACTOR_T AS OBJECT (
  actor_id numeric,
  first_name VARCHAR(45),
  last_name VARCHAR(45),
  last_update DATE
);
/

CREATE TYPE ACTORS_T AS TABLE OF ACTOR_T;
/

CREATE TYPE CATEGORY_T AS OBJECT (
  category_id SMALLINT,
  name VARCHAR(25),
  last_update DATE
);
/

CREATE TYPE CATEGORIES_T AS TABLE OF CATEGORY_T;
/

CREATE TYPE FILM_INFO_T AS OBJECT (
  film FILM_T,
  actors ACTORS_T,
  categories CATEGORIES_T
);
/

You’ll notice immediately that this is essentially just a 1:1 copy of the schema, in this case modelled as Oracle SQL OBJECT and TABLE types, apart from the FILM_INFO_T type, which acts as an aggregate.

Now, our DBA (or our database developer) has implemented the following API for us to access the above information:

CREATE OR REPLACE PACKAGE RENTALS AS
  FUNCTION GET_ACTOR(p_actor_id INT) RETURN ACTOR_T;
  FUNCTION GET_ACTORS RETURN ACTORS_T;
  FUNCTION GET_FILM(p_film_id INT) RETURN FILM_T;
  FUNCTION GET_FILMS RETURN FILMS_T;
  FUNCTION GET_FILM_INFO(p_film_id INT) RETURN FILM_INFO_T;
  FUNCTION GET_FILM_INFO(p_film FILM_T) RETURN FILM_INFO_T;
END RENTALS;
/

This, ladies and gentlemen, is how you can now…

… tediously access the PL/SQL API with JDBC

So, in order to avoid the awkward CallableStatement with its OUT parameter registration and JDBC escape syntax, we’re going to fetch a FILM_INFO_T record via a SQL statement like this:

try (PreparedStatement stmt = conn.prepareStatement(
        "SELECT rentals.get_film_info(1) FROM DUAL");
     ResultSet rs = stmt.executeQuery()) {

    // STRUCT unnesting here...
}

So far so good. Luckily, there is Java 7’s try-with-resources to help us clean up those myriad JDBC objects. Now how to proceed? What will we get back from this ResultSet? A java.sql.Struct:

while (rs.next()) {
    Struct film_info_t = (Struct) rs.getObject(1);

    // And so on...
}

Now, the brave ones among you would continue downcasting the java.sql.Struct to an even more obscure and arcane oracle.sql.STRUCT, which contains almost no Javadoc, but tons of deprecated additional, vendor-specific methods.

For now, let’s stick with the “standard API”, though.

Interlude:

Let’s take a moment to appreciate JDBC in times of Java 8.

When Java 5 was introduced, so were generics. We have rewritten our big code bases to remove all sorts of meaningless boilerplate type casts that are now no longer needed. With the exception of JDBC. When it comes to JDBC, guessing appropriate types is all a matter of luck. We’re accessing complex nested data structures provided by external systems by dereferencing elements by index, and then taking wild guesses at the resulting data types.

Lambdas have just been introduced, yet JDBC still talks to the mainframe.

[Image: Rhonda said, she put that STRUCT right between jack 73 and 75 on array F-B4. I wonder if I need my AC/DC converter to plug it. Image in public domain]

And then…

[Image: And here be dragons. And STRUCTS. Original image in public domain]

OK, enough of these rants.

Let’s continue navigating our STRUCT

while (rs.next()) {
    Struct film_info_t = (Struct) rs.getObject(1);

    Struct film_t = (Struct) film_info_t.getAttributes()[0];
    String title = (String) film_t.getAttributes()[1];
    Clob description_clob = (Clob) film_t.getAttributes()[2];
    String description = description_clob.getSubString(1, (int) description_clob.length());

    Struct language_t = (Struct) film_t.getAttributes()[4];
    String language = (String) language_t.getAttributes()[1];

    System.out.println("Film       : " + title);
    System.out.println("Description: " + description);
    System.out.println("Language   : " + language);
}

From the initial STRUCT that we received at position 1 from the ResultSet, we can continue dereferencing attributes by index. Unfortunately, we’ll constantly need to look up the SQL type in Oracle (or in some documentation) to remember the order of the attributes:

CREATE TYPE FILM_INFO_T AS OBJECT (
  film FILM_T,
  actors ACTORS_T,
  categories CATEGORIES_T
);
/

And that’s not it! The first attribute of type FILM_T is yet another, nested STRUCT. And then, those horrible CLOBs. The above code is not strictly complete. In some cases that only the maintainers of JDBC can fathom, java.sql.Clob.free() has to be called to be sure that resources are freed in time. Remember that CLOB, depending on your database and driver configuration, may live outside the scope of your transaction.

Unfortunately, the method is called free() instead of AutoCloseable.close(), such that try-with-resources cannot be used. So here we go:

List<Clob> clobs = new ArrayList<>();

while (rs.next()) {
    try {
        Struct film_info_t = (Struct) rs.getObject(1);
        Struct film_t = (Struct) film_info_t.getAttributes()[0];

        String title = (String) film_t.getAttributes()[1];
        Clob description_clob = (Clob) film_t.getAttributes()[2];

        // Remember the CLOB, so it can be freed in the finally block
        clobs.add(description_clob);

        String description = description_clob.getSubString(1, (int) description_clob.length());

        Struct language_t = (Struct) film_t.getAttributes()[4];
        String language = (String) language_t.getAttributes()[1];

        System.out.println("Film       : " + title);
        System.out.println("Description: " + description);
        System.out.println("Language   : " + language);
    }
    finally {
        // And don't think you can call this early, either
        // The internal specifics are mysterious!
        for (Clob clob : clobs)
            clob.free();

        clobs.clear();
    }
}

That’s about it. Now we have found ourselves with some nice little output on the console:

Film       : ACADEMY DINOSAUR
Description: A Epic Drama of a Feminist And a Mad 
             Scientist who must Battle a Teacher in
             The Canadian Rockies
Language   : English             

That’s about it – you may think! But…

The pain has only started

… because we’re not done yet. There are also two nested table types that we need to deserialise from the STRUCT. If you haven’t given up yet (bear with me, good news is nigh), you’ll enjoy reading about how to fetch and unwind a java.sql.Array. Let’s continue right after the printing of the film:

Array actors_t = (Array) film_info_t.getAttributes()[1];
Array categories_t = (Array) film_info_t.getAttributes()[2];

Again, we’re accessing attributes by indexes, which we have to remember, and which can easily break. The ACTORS_T array is nothing but yet another wrapped STRUCT:

System.out.println("Actors     : ");

Object[] actors = (Object[]) actors_t.getArray();
for (Object actor : actors) {
    Struct actor_t = (Struct) actor;

    System.out.println(
        "  " + actor_t.getAttributes()[1]
       + " " + actor_t.getAttributes()[2]);
}

You’ll notice a few things:

  • The Array.getArray() method returns an array. But it declares returning Object. We have to manually cast.
  • We can’t cast to Struct[] even if that would be a sensible type. But the type returned by ojdbc is Object[] (containing Struct elements).
  • The foreach loop also cannot dereference a Struct from the right hand side. There’s no way of coercing the type of actor into what we know it really is.
  • We could’ve used Java 8 and Streams and such, but unfortunately, all lambda expressions that can be passed to the Streams API disallow throwing of checked exceptions. And JDBC throws checked exceptions. That’ll be even uglier, as the sketch below shows.
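
For the curious, here’s a sketch (not from the original article) of what that Streams variant might look like; the checked SQLException has to be wrapped inside the lambda:

Arrays.stream((Object[]) actors_t.getArray())
      .map(Struct.class::cast)
      .forEach(actor -> {
          try {
              System.out.println(
                  "  " + actor.getAttributes()[1]
                 + " " + actor.getAttributes()[2]);
          }

          // Checked exceptions cannot escape the lambda, so they
          // must be wrapped in an unchecked exception
          catch (SQLException e) {
              throw new RuntimeException(e);
          }
      });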

Anyway. Now that we’ve finally achieved this, we can see the print output:

Film       : ACADEMY DINOSAUR
Description: A Epic Drama of a Feminist And a Mad 
             Scientist who must Battle a Teacher in
             The Canadian Rockies
Language   : English             
Actors     : 
  PENELOPE GUINESS
  CHRISTIAN GABLE
  LUCILLE TRACY
  SANDRA PECK
  JOHNNY CAGE
  MENA TEMPLE
  WARREN NOLTE
  OPRAH KILMER
  ROCK DUKAKIS
  MARY KEITEL

When will this madness stop?

It’ll stop right here!

So far, this article read like a tutorial (or rather: medieval torture) of how to deserialise nested user-defined types from Oracle SQL to Java (don’t get me started on serialising them again!).

In the next section, we’ll see how the exact same business logic (listing Film with ID=1 and its actors) can be implemented with no pain at all using jOOQ and its source code generator. Check this out:

// Simply call the packaged stored function from
// Java, and get a deserialised, type safe record
FilmInfoTRecord film_info_t = Rentals.getFilmInfo1(
    configuration, new BigInteger("1"));

// The generated record has getters (and setters)
// for type safe navigation of nested structures
FilmTRecord film_t = film_info_t.getFilm();

// In fact, all these types have generated getters:
System.out.println("Film       : " + film_t.getTitle());
System.out.println("Description: " + film_t.getDescription());
System.out.println("Language   : " + film_t.getLanguage().getName());

// Simply loop nested type safe array structures
System.out.println("Actors     : ");
for (ActorTRecord actor_t : film_info_t.getActors()) {
    System.out.println(
        "  " + actor_t.getFirstName()
       + " " + actor_t.getLastName());
}

System.out.println("Categories     : ");
for (CategoryTRecord category_t : film_info_t.getCategories()) {
    System.out.println(category_t.getName());
}

Is that it?

Yes!

Wow, I mean, this is just as though all those PL/SQL types and procedures / functions were actually part of Java. All the caveats that we’ve seen before are hidden behind those generated types and implemented in jOOQ, so you can concentrate on what you originally wanted to do. Access the data objects and do meaningful work with them. Not serialise / deserialise them!

Let’s take a moment and appreciate this consumer advertising:

jOOQ generates Java code from your database and lets you build type safe SQL queries through its fluent API.

Not convinced yet?

I told you not to get me started on serialising the types to JDBC. And I won’t, but here’s how to serialise the types to jOOQ, because that’s a piece of cake!

Let’s consider this other aggregate type, that returns a customer’s rental history:

CREATE TYPE CUSTOMER_RENTAL_HISTORY_T AS OBJECT (
  customer CUSTOMER_T,
  films FILMS_T
);
/

And the full PL/SQL package specs:

CREATE OR REPLACE PACKAGE RENTALS AS
  FUNCTION GET_ACTOR(p_actor_id INT) RETURN ACTOR_T;
  FUNCTION GET_ACTORS RETURN ACTORS_T;
  FUNCTION GET_CUSTOMER(p_customer_id INT) RETURN CUSTOMER_T;
  FUNCTION GET_CUSTOMERS RETURN CUSTOMERS_T;
  FUNCTION GET_FILM(p_film_id INT) RETURN FILM_T;
  FUNCTION GET_FILMS RETURN FILMS_T;
  FUNCTION GET_CUSTOMER_RENTAL_HISTORY(p_customer_id INT) RETURN CUSTOMER_RENTAL_HISTORY_T;
  FUNCTION GET_CUSTOMER_RENTAL_HISTORY(p_customer CUSTOMER_T) RETURN CUSTOMER_RENTAL_HISTORY_T;
  FUNCTION GET_FILM_INFO(p_film_id INT) RETURN FILM_INFO_T;
  FUNCTION GET_FILM_INFO(p_film FILM_T) RETURN FILM_INFO_T;
END RENTALS;
/

So, when calling RENTALS.GET_CUSTOMER_RENTAL_HISTORY we can find all the films that a customer has ever rented. Let’s do that for all customers whose FIRST_NAME is “JAMIE”, and this time, we’re using Java 8:

// We call the stored function directly inline in
// a SQL statement
dsl().select(Rentals.getCustomer(
          CUSTOMER.CUSTOMER_ID
      ))
     .from(CUSTOMER)
     .where(CUSTOMER.FIRST_NAME.eq("JAMIE"))

// This returns Result<Record1<CustomerTRecord>>
// We unwrap the CustomerTRecord and consume
// the result with a lambda expression
     .fetch()
     .map(Record1::value1)
     .forEach(customer -> {
         System.out.println("Customer  : ");
         System.out.println("- Name    : " 
           + customer.getFirstName() 
           + " " + customer.getLastName());
         System.out.println("- E-Mail  : " 
           + customer.getEmail());
         System.out.println("- Address : " 
           + customer.getAddress().getAddress());
         System.out.println("            " 
           + customer.getAddress().getPostalCode() 
           + " " + customer.getAddress().getCity().getCity());
         System.out.println("            " 
           + customer.getAddress().getCity().getCountry().getCountry());

// Now, let's send the customer over the wire again to
// call that other stored procedure, fetching his
// rental history:
         CustomerRentalHistoryTRecord history = 
           Rentals.getCustomerRentalHistory2(dsl().configuration(), customer);

         System.out.println("  Customer Rental History : ");
         System.out.println("    Films                 : ");

         history.getFilms().forEach(film -> {
             System.out.println("      Film                : " 
               + film.getTitle());
             System.out.println("        Language          : " 
               + film.getLanguage().getName());
             System.out.println("        Description       : " 
               + film.getDescription());

// And then, let's call again the first procedure
// in order to get a film's actors and categories
             FilmInfoTRecord info = 
               Rentals.getFilmInfo2(dsl().configuration(), film);

             info.getActors().forEach(actor -> {
                 System.out.println("          Actor           : " 
                   + actor.getFirstName() + " " + actor.getLastName());
             });

             info.getCategories().forEach(category -> {
                 System.out.println("          Category        : " 
                   + category.getName());
             });
         });
     });

… and a short extract of the output produced by the above:

Customer  : 
- Name    : JAMIE RICE
- E-Mail  : JAMIE.RICE@sakilacustomer.org
- Address : 879 Newcastle Way
            90732 Sterling Heights
            United States
  Customer Rental History : 
    Films                 : 
      Film                : ALASKA PHANTOM
        Language          : English             
        Description       : A Fanciful Saga of a Hunter
                            And a Pastry Chef who must
                            Vanquish a Boy in Australia
          Actor           : VAL BOLGER
          Actor           : BURT POSEY
          Actor           : SIDNEY CROWE
          Actor           : SYLVESTER DERN
          Actor           : ALBERT JOHANSSON
          Actor           : GENE MCKELLEN
          Actor           : JEFF SILVERSTONE
          Category        : Music
      Film                : ALONE TRIP
        Language          : English             
        Description       : A Fast-Paced Character
                            Study of a Composer And a
                            Dog who must Outgun a Boat
                            in An Abandoned Fun House
          Actor           : ED CHASE
          Actor           : KARL BERRY
          Actor           : UMA WOOD
          Actor           : WOODY JOLIE
          Actor           : SPENCER DEPP
          Actor           : CHRIS DEPP
          Actor           : LAURENCE BULLOCK
          Actor           : RENEE BALL
          Category        : Music

If you’re using Java and PL/SQL…

… then you should click on the below banner and download the free trial right now to experiment with jOOQ and Oracle:

jOOQ generates Java code from your database and lets you build type safe SQL queries through its fluent API.

The Oracle port of the Sakila database is available from this URL for free, under the terms of the BSD license:

https://www.jooq.org/sakila

Finally, it is time to enjoy writing PL/SQL again!

Further reading:

See also this article about how jOOQ 3.9 provides access to PL/SQL RECORD types.