Asynchronous SQL Execution with jOOQ and Java 8’s CompletableFuture

Reactive programming is the new buzzword, which essentially just means asynchronous programming or messaging. Fact is that functional syntax greatly helps with structuring asynchronous execution chains, and today, we’ll see how we can do this in Java 8 using jOOQ and the new CompletableFuture API. In fact, things are quite simple:

// Initiate an asynchronous call chain
CompletableFuture

    // This lambda will supply an int value
    // indicating the number of inserted rows
    .supplyAsync(() -> DSL
        .using(configuration)
        .insertInto(AUTHOR, AUTHOR.ID, AUTHOR.LAST_NAME)
        .values(3, "Hitchcock")
        .execute()
    )
                          
    // This will supply an AuthorRecord value
    // for the newly inserted author
    .handleAsync((rows, throwable) -> DSL
        .using(configuration)
        .fetchOne(AUTHOR, AUTHOR.ID.eq(3))
    )

    // This should supply an int value indicating
    // the number of rows, but in fact it'll throw
    // a constraint violation exception
    .handleAsync((record, throwable) -> {
        record.changed(true);
        return record.insert();
    })
    
    // This will supply an int value indicating
    // the number of deleted rows
    .handleAsync((rows, throwable) -> DSL
        .using(configuration)
        .delete(AUTHOR)
        .where(AUTHOR.ID.eq(3))
        .execute()
    )

    // This tells the calling thread to wait for all
    // chained execution units to be executed
    .join();

What did really happen here? Nothing out of the ordinary. There are 4 execution blocks:
  1. One that inserts a new AUTHOR
  2. One that fetches that same AUTHOR again
  3. One that re-inserts the newly fetched AUTHOR (throwing an exception)
  4. One that ignores the thrown exception and delets the AUTHOR again
Finally, when the execution chain is established, the calling thread will join the whole chain using the CompletableFuture.join() method, which is essentially the same as the Future.get() method, except that it doesn’t throw any checked exception.

Comparing this to other APIs

Other APIs like Scala’s Slick have implemented similar things via “standard API”, such as calls to flatMap(). We’re currently not going to mimick such APIs as we believe that the new Java 8 APIs will become much more idiomatic to native Java speakers. Specifically, when executing SQL, getting connection pooling and transactions right is of the essence. The semantics of asynchronously chained execution blocks and how they relate to transactions is very subtle. If you want a transaction to span more than one such block, you will have to encode this yourself via jOOQ’s Configuration and its contained ConnectionProvider.

Blocking JDBC

Obviously, there will always be one blocking barrier to such solutions, and that is JDBC itself – which is very hard to turn into an asynchronous API. In fact, few databases really support asynchronous query executions and cursors, as most often, a single database session can only be used by a single thread for a single query at a time. We’d be very interested to learn about your asynchronous SQL querying requirements, so feel free to comment on this post!

13 thoughts on “Asynchronous SQL Execution with jOOQ and Java 8’s CompletableFuture

  1. We use jOOq (and I sing its praises) on a project I am on — it’s a wonderful tool and I’m glad that we purchased licensing for it and are supporting it. We’re using jOOq on a massive website for the government and have found that it trivially plugs into c3p0, allowing us to use connection pooling to cause tremendous throughput. Best of all, for those cases where I need to shoot something off, jOOq’s use of Futures made this trivial in 3.2.1, which is when we first purchased a license, meant that we could kick off our queries and do other things while waiting for the responses. This has been particularly useful when performing a series of complex queries where Oracle would choke if we returned several thousand results, so we broke it up to grab them in chunks of 1000, running all queries at once and then just combining it in Java (we even ran the queries directly off the DB server using PL/SQL, but to no avail). I’d be interested in using these new Java8 features, but we’re Java7 and we’d only move on the next major revision… and we’d only convert what we have if there were some overwhelming benefit that this has over Java7’s Futures.

    1. John, this sounds incredibly interesting! And in fact, I’m starting to think if we can factor out a parallel query feature out of your technique… How about an API extension where it would be possible to define “query partitions” (e.g. using unique ranges that generate BETWEEN predicates or something else), and then do precisely as you’ve mentioned – possibly via going through the Java 7 ForkJoinPool? I’m just thinking out loud, but there’s definitely a killer-feature in your explanations.

      Another option would be to generate synthetic UNION ALL clauses, which enables parallelism directly in Oracle database, possibly combining with parallel hints (have you tried that?)

      As far as Java 7 and 8 are concerned, I believe that the benefits are much more subtle than that. It is in fact much easier to parallelise stuff using Streams, and to express things using lambda expressions. But I’m not sure if these improvements pull their weight for an upgrade just now in your business.

      1. We tried hard to get parallel hints to get us there — trying to get Oracle to run in parallel using hints — but it would never run in parallel (even when we flat out told it “Use 4 and no less”). Actually, synthetic UNION ALLs, possibly with hints, might be what would allow that //-ization. The API extension, too, would be really spiffy :-) Either option is much better than what I’m doing now :-D I’m going to have to try those union alls some time soon and benchmark it.

        1. The API extension, too, would be really spiffy

          That is really something that we’ll be looking into for jOOQ 3.6. A lot of users implementing sharding might benefit from that. I had also just now talked to Morgan Tocker and Jess Balint from Oracle / MySQL about a possible collaboration in that area. Surely, this will then also be benefitial for the Oracle integration.

          I’d be curious about benchmark results with UNION ALL!

          1. Hi! I’m learning JOOQ right now for a project at work that we’re using MySQL.

            For that MySQL database we’ve implemented some data sharding, as the data is generated and spread across sites, so when we need to do some global queries, we implemented an in-house library that uses fork/join to execute the query in parallel in all MySQL servers and get the results.

            Did you guys came around to implement your idea for JOOQ 3.6+? Or how can I do that in version 3.6.1?

            1. Ooh, interesting. That fork/join sharding could be an entire blog topic on its own! Did you blog about that publicly? Or, do you have any public display of parts of that code? Would be extremely interesting to see!

              I’m not sure what you mean by “our idea”. What you can see in this blog post is just Java 8 API (java.util.concurrent.CompletableFuture)… In other words, what bits do you think are still missing for you?

              1. Sorry, I couldn’t post any code because it’s for the company I work for and they’re very “sensitive” about open sourcing code (they suck, they use a lot but never contribute back).

                But I can say that the hardest part was that we had to do certain tasks ourselves like sorting, paging, and math operations because the queries were from different JDBC connections (which meant loading the results into memory first), so it’s not as trivial as it sounds, and also you depend on having a good database pooling library, we use DBPool, it’s the one that has worked best for us so far.

                And well, your “idea” was aimed at the first reply you did to John on this post, were you mentioned a couple of possible ways of running queries in parallel in JOOQ, partitioning queries, etc. and have JOOQ do it instead of the user, that would have been a lot of help for us.

                1. they suck, they use a lot but never contribute back

                  That’s alright. Open Source is first and foremost free as in beer. It’s the vendor’s fault that they just give it away. Here’s my take on this:
                  https://blog.jooq.org/2015/01/21/open-source-doesnt-need-more-support-it-needs-better-business-models

                  You may disagree, of course ;-)

                  we had to do certain tasks ourselves like sorting, paging, and math operations because the queries were from different JDBC connections

                  Yes, I see… Some calculations are not really “de-composable” in a way that they can produce the same results when split across different queries…

                  And well, your “idea” was aimed at the first reply you did to John on this post, were you mentioned a couple of possible ways of running queries in parallel in JOOQ, partitioning queries, etc. and have JOOQ do it instead of the user, that would have been a lot of help for us.

                  Aha, I’m sorry – I didn’t realise this was not a top-level comment.

                  We actually didn’t continue looking into this, precisely because of the calculations that you’ve mentioned also. In fact, transforming predicates across different subqueries, or even across different shards turned out to be a rather complex problem that is hard to implement in jOOQ – although, less hard to implement in your own code – as you don’t have to find a very generic solution.

                  I suspect that a generic SQL transformation operation could be achieved using a VisitListener…
                  https://www.jooq.org/doc/latest/manual/sql-building/queryparts/custom-sql-transformation/

  2. This thread is a little old but I have a related (I think) question. I want to send an sql statement to the server and not wait for the response at all, ever, at least within the original context. This call would be after a lot of database work done within a transaction, since closed. The transaction will have fill a work table via COPY. I want to send sql separately that will work on that (rather large) work table. Do I need to create a separate thread with it’s own db context which would do the follow-on work and have it wait for and ignore the response?

    1. You cannot do that with JDBC, JDBC is a blocking API. jOOQ works on top of JDBC, so while “asynchronous” execution gives you the impression and the “feel” of actual asynchronicity, it is just your client thread calling the database action that can continue to do other work. Behind the scenes, there is another client thread (by default on the ForkJoinPool as per CompletableFuture’s implementation), which will still block on JDBC.

      There is a new ADBC API in the works by Oracle (see some recent discussions on the mailing list: https://mail.openjdk.java.net/pipermail/jdbc-spec-discuss/), and there are inofficial asynchronous drivers that do not implement JDBC, at least for PostgreSQL. You could use jOOQ to generate the SQL string and then send that SQL over the wire

      1. Thanks. I understand the JDBC constraint. I’m thinking I can let my “app” do it’s business (send a large number of records to a temp table (copy in seconds), call it good and fork a separate Runable which makes its own connection and issues a single sql statement (insert from temp) and waits for that to finish (and then do nothing but die) while the main loop is back in action. Is that sane? I’m trying to keep the queue for my Selector as empty as possible.

        1. Of course that’s sane. It’s a staging table. I’ve maintained large scripts that used staging tables for inter-system data exchange. Of course, maintaining data integrity (from a user point of view) is harder, because the user will have forgotten that their data was sent to some place. But that’s usually not too high a price to pay.

Leave a Reply