How I Incorrectly Fetched JDBC ResultSets. Again.

You know JDBC, right? It’s that really easy, concise API that we love to use to work with virtually any database, relational or not. It has essentially three types that you need to care about:

  • Connection
  • Statement (and its subtypes)
  • ResultSet

All the other types are some sort of utilities.

Now, with the above three, we can do really nice and lean Java/SQL coding as follows:

try (Connection c = datasource.getConnection();
     Statement s = c.createStatement();
     ResultSet r = s.executeQuery("SELECT 'hello'")
) {
    while (r.next())
        System.out.println(r.getString(1));
}

Output:

hello

OK? Super easy.

Unless…

Unless you want to write generic JDBC code, because you don’t know what the SQL string is. It could be a SELECT statement. It could be an UPDATE. It could be DDL. It could be a statement batch (several statements). It could call triggers and stored procedures, which again produce nice things like warnings, exceptions, update counts, and additional result sets.

You know, the sort of thing that might come flying in to a generic utility method like jOOQ’s ResultQuery.fetchMany().

(Don’t think this couldn’t happen to you as well. SQL Server triggers are really mean things!)

For this, let’s consider the correct way to execute the following simple statement batch that works wonderfully in SQL Server:

raiserror('e0', 0, 2, 3);
create table t(i int);
raiserror('e1', 5, 2, 3);
insert into t values (1);
raiserror('e2', 10, 2, 3);
insert into t values (2);
raiserror('e3', 15, 2, 3);
select * from t;
drop table t;
raiserror('e4', 16, 2, 3);

For your convenience, I have pre-formatted the above String into a Java String variable, which is already the first problem, because Java STILL doesn’t have multi-line strings (gaah):

String sql =
    "\n raiserror('e0', 0, 2, 3);"
  + "\n create table t(i int);"
  + "\n raiserror('e1', 5, 2, 3);"
  + "\n insert into t values (1);"
  + "\n raiserror('e2', 10, 2, 3);"
  + "\n insert into t values (2);"
  + "\n raiserror('e3', 15, 2, 3);"
  + "\n select * from t;"
  + "\n drop table t;"
  + "\n raiserror('e4', 16, 2, 3);";
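Since this post was written, Java has gained text blocks (a standard feature from Java 15 on), which take most of the pain out of this. A minimal sketch, assuming a Java 15+ compiler; the class name BatchSql is made up for illustration:

```java
// Assumes Java 15 or later (text blocks). BatchSql is a hypothetical holder class.
class BatchSql {

    // The same batch as above, written as a text block. Indentation that is
    // common to all lines (including the closing delimiter) is stripped.
    static final String SQL = """
        raiserror('e0', 0, 2, 3);
        create table t(i int);
        raiserror('e1', 5, 2, 3);
        insert into t values (1);
        raiserror('e2', 10, 2, 3);
        insert into t values (2);
        raiserror('e3', 15, 2, 3);
        select * from t;
        drop table t;
        raiserror('e4', 16, 2, 3);
        """;

    public static void main(String[] args) {
        System.out.print(SQL);
    }
}
```

The rest of this post keeps using the concatenated String variable, which behaves the same way for our purposes.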

Now see, we might be inclined to just copy paste some JDBC snippet off some website (e.g. this blog, and take its first snippet) and execute it as such:

try (
    Statement s = c.createStatement();
    ResultSet r = s.executeQuery(sql)
) {
    while (r.next())
        System.out.println(r.getString(1));
}

Yeah. What’ll happen if we do this?

Rats:

com.microsoft.sqlserver.jdbc.SQLServerException: e3
	at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:258)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1547)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:857)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute(SQLServerStatement.java:757)
	at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7151)
	at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:2689)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:224)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:204)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeQuery(SQLServerStatement.java:659)
	at SQLServer.main(SQLServer.java:80)

e3? What on earth? So what happened with my statement batch? Did it execute? Only until the middle? Or did I get to the end as well?

OK, quite obviously, we have to do this more carefully. We cannot use Statement.executeQuery() here, because we don’t know whether we’ll get a result set. In fact, we got an exception, but not the first one.

Let’s try something else. Let’s try this:

try (Statement s = c.createStatement()) {
    System.out.println(s.execute(sql));
}

That just yields:

false

Okaaay, did anything execute in the database at all? No more exceptions… Let me have a look at the SQL Server Profiler…

Nope, the entire batch got executed. (Could’ve just removed the DROP TABLE statement and checked the contents of table T in SQL Server Management Studio, of course).

Huh, quite a different result, depending on what method we’re calling. Does that scare you? Does your ORM get this right? (jOOQ didn’t but this is now fixed).

OK, let’s read the Javadoc on Statement.execute()

It says:

Executes the given SQL statement, which may return multiple results. In some (uncommon) situations, a single SQL statement may return multiple result sets and/or update counts. Normally you can ignore this unless you are (1) executing a stored procedure that you know may return multiple results or (2) you are dynamically executing an unknown SQL string.
The execute method executes an SQL statement and indicates the form of the first result. You must then use the methods getResultSet or getUpdateCount to retrieve the result, and getMoreResults to move to any subsequent result(s).

Huh, OK. Statement.getResultSet() and getUpdateCount() must be used, and then getMoreResults()

The getMoreResults() method also has this interesting bit of information:

There are no more results when the following is true:

// stmt is a Statement object
((stmt.getMoreResults() == false) && (stmt.getUpdateCount() == -1))

Interesting. -1. I guess we can be very happy that at least it’s not returning null or a punch in your face.

So, let’s try this again:

  • We first have to call execute()
  • If it’s true, we fetch getResultSet()
  • If it’s false, we check getUpdateCount()
  • If that was -1, we can stop

Or, in code:

fetchLoop:
for (int i = 0, updateCount = 0; i < 256; i++) {
    boolean result = (i == 0)
        ? s.execute(sql)
        : s.getMoreResults();

    if (result)
        try (ResultSet rs = s.getResultSet()) {
            System.out.println("Result      :");

            while (rs.next())
                System.out.println("  " + rs.getString(1));
        }
    else if ((updateCount = s.getUpdateCount()) != -1)
        System.out.println("Update Count: " + updateCount);
    else
        break fetchLoop;
}

Beautiful! Some remarks:

  • Note how the loop stops after 256 iterations. Never trust these infinite streaming APIs, there’s always a bug somewhere, trust me
  • The boolean value returned by Statement.execute() and Statement.getMoreResults() has the same meaning. We can assign it to a variable inside the loop and call execute() only on the first iteration
  • If true, fetch the result set
  • If false, check the update count
  • If that was -1, stop

Run time!

Update Count: 0
Update Count: 1
Update Count: 1
com.microsoft.sqlserver.jdbc.SQLServerException: e3
	at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:258)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1547)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.getMoreResults(SQLServerStatement.java:1270)
	at SQLServer.main(SQLServer.java:83)

Crap. But did it execute completely? Yes it did, but we didn’t get that sweet result set after e3, because of that exception. But at least, we now got 3 update counts. But wait a second, why didn’t we get e0, e1, and e2?

AHA, they’re warnings, not exceptions. Funky SQL Server decided that everything below some severity level is a warning. Whatever.

Anyway, let’s fetch those warnings as well!

fetchLoop:
for (int i = 0, updateCount = 0; i < 256; i++) {
    boolean result = (i == 0)
        ? s.execute(sql)
        : s.getMoreResults();

    // Warnings here
    SQLWarning w = s.getWarnings();
    for (int j = 0; j < 255 && w != null; j++) {
        System.out.println("Warning     : " + w.getMessage());
        w = w.getNextWarning();
    }

    // Don't forget this
    s.clearWarnings();

    if (result)
        try (ResultSet rs = s.getResultSet()) {
            System.out.println("Result      :");

            while (rs.next())
                System.out.println("  " + rs.getString(1));
        }
    else if ((updateCount = s.getUpdateCount()) != -1)
        System.out.println("Update Count: " + updateCount);
    else
        break fetchLoop;
}

Great, so now we get all the warnings e0, e1, e2, and the exception e3, along with the update counts:

Warning     : e0
Update Count: 0
Warning     : e1
Update Count: 1
Warning     : e2
Update Count: 1
com.microsoft.sqlserver.jdbc.SQLServerException: e3
	at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:258)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1547)
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.getMoreResults(SQLServerStatement.java:1270)
	at SQLServer.main(SQLServer.java:82)

That’s more like our batch. But we’re still aborting after e3. How can we get the result set? Easy! Just ignore the exception, right? 🙂

And while we’re at it, let’s use ResultSetMetaData to read the unknown result set type.

fetchLoop:
for (int i = 0, updateCount = 0; i < 256; i++) {
    try {
        boolean result = (i == 0)
            ? s.execute(sql)
            : s.getMoreResults();

        SQLWarning w = s.getWarnings();
        for (int j = 0; j < 255 && w != null; j++) {
            System.out.println("Warning     : " + w.getMessage());
            w = w.getNextWarning();
        }

        s.clearWarnings();

        if (result)
            try (ResultSet rs = s.getResultSet()) {
                System.out.println("Result      :");
                ResultSetMetaData m = rs.getMetaData();

                while (rs.next())
                    for (int c = 1; c <= m.getColumnCount(); c++)
                        System.out.println(
                            "  " + m.getColumnName(c) +
                            ": " + rs.getInt(c));
            }
        else if ((updateCount = s.getUpdateCount()) != -1)
            System.out.println("Update Count: " + updateCount);
        else
            break fetchLoop;
    }
    catch (SQLException e) {
        System.out.println("Exception   : " + e.getMessage());
    }
}

There, that’s more like it:

Warning     : e0
Update Count: 0
Warning     : e1
Update Count: 1
Warning     : e2
Update Count: 1
Exception   : e3
Result      :
  i: 1
  i: 2
Update Count: 0
Exception   : e4

Now we’ve executed the entire batch in a super generic way with JDBC.
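If you want to reuse that loop, it can be wrapped in a small helper. The following is only a sketch under a few assumptions: GenericFetch, fetchAll() and MAX_RESULTS are hypothetical names (this is not how jOOQ implements fetchMany()), everything is collected as plain text lines, and only standard java.sql calls are used:

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.SQLWarning;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JDBC or jOOQ.
class GenericFetch {

    // Arbitrary safety limit, mirroring the 256 used in the loops above.
    static final int MAX_RESULTS = 256;

    // Executes sql and returns one line of text per warning, update count,
    // result set row, or exception, in the order in which they were seen.
    static List<String> fetchAll(Statement s, String sql) {
        List<String> out = new ArrayList<>();

        for (int i = 0; i < MAX_RESULTS; i++) {
            try {
                boolean isResultSet = (i == 0)
                    ? s.execute(sql)
                    : s.getMoreResults();

                // Warnings are chained, just like in the loop above
                SQLWarning w = s.getWarnings();
                for (int j = 0; j < 255 && w != null; j++) {
                    out.add("Warning: " + w.getMessage());
                    w = w.getNextWarning();
                }
                s.clearWarnings();

                if (isResultSet) {
                    try (ResultSet rs = s.getResultSet()) {
                        ResultSetMetaData m = rs.getMetaData();
                        while (rs.next())
                            for (int c = 1; c <= m.getColumnCount(); c++)
                                out.add("Row: " + m.getColumnLabel(c)
                                    + "=" + rs.getObject(c));
                    }
                }
                else {
                    int updateCount = s.getUpdateCount();
                    if (updateCount == -1)
                        break; // No more results
                    out.add("Update count: " + updateCount);
                }
            }
            catch (SQLException e) {
                // Collect the exception and move on to the next result
                out.add("Exception: " + e.getMessage());
            }
        }
        return out;
    }
}
```

Calling GenericFetch.fetchAll(s, sql) with the Statement and batch from above should then give the same information as the printed output, just as a List. Using getObject() instead of getInt() keeps the helper independent of the column types.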

Gah, I want this to be easier

Of course you do, which is why there is jOOQ. jOOQ has the really nice fetchMany() methods, which can execute arbitrary SQL strings to get a mixture of:

  • Update counts
  • Result sets
  • Exceptions / Warnings (jOOQ 3.10+ only)

For example, we can write:

// Use this new setting to indicate that we don't want to throw
//  exceptions, but collect them, as we've seen above
DSLContext ctx = DSL.using(c, 
  new Settings().withThrowExceptions(THROW_NONE));

// And then, simply:
System.out.println(ctx.fetchMany(sql));

The result is of the form:

Warning: SQL [...]; e0
Update count: 0
Warning: SQL [...]; e1
Update count: 1
Warning: SQL [...]; e2
Update count: 1
Exception: SQL [...]; e3
Result set:
+----+
|   i|
+----+
|   1|
|   2|
+----+
Update count: 0
Exception: SQL [...]; e4

Excellent!

What we didn’t cover

Oh, tons of things, but I need material for future blog posts, too, right?

  • We only discussed SQL Server so far
  • We didn’t discuss the fact that SQLException.getNextException() doesn’t work here
  • We didn’t discuss how we can combine this with OUT parameters (eegh, at what moment do we fetch those)
  • We didn’t discuss the fact that some JDBC drivers don’t implement this correctly (looking at you, Oracle)
  • We didn’t go into the depths of how JDBC drivers don’t implement ResultSetMetaData correctly
  • We didn’t cover the performance overhead of fetching warnings, e.g. in MySQL
  • … and much more
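For reference, this is roughly how a chain of exceptions is walked when a driver does link them via SQLException.getNextException(). It is a sketch with hand-built exceptions and a made-up class name (ExceptionChain). As seen above, e3 and e4 did not arrive this way: they surfaced one at a time from execute() and getMoreResults().

```java
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
class ExceptionChain {

    // Collects the messages of an exception and all exceptions chained to it.
    static List<String> messages(SQLException first) {
        List<String> result = new ArrayList<>();
        for (SQLException e = first; e != null; e = e.getNextException())
            result.add(e.getMessage());
        return result;
    }

    public static void main(String[] args) {
        // Build a chain by hand, no database needed
        SQLException first = new SQLException("e3");
        first.setNextException(new SQLException("e4"));
        System.out.println(messages(first)); // [e3, e4]
    }
}
```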

So, are you still writing JDBC code yourself? 🙂

jOOQ Tuesdays: Brett Wooldridge Shows What it Takes to Write the Fastest Java Connection Pool

Welcome to the jOOQ Tuesdays series. In this series, we’ll publish an article on the third Tuesday every other month where we interview someone we find exciting in our industry from a jOOQ perspective. This includes people who work with SQL, Java, Open Source, and a variety of other related topics.

I’m very excited to feature today Brett Wooldridge, creator of HikariCP, the fastest connection pool available for Java.

Brett, you’ve created one of the most popular connection pools for Java: HikariCP. What made your library so popular?

I’ll provide some backstory on HikariCP before I answer that, but I’ll tease the answer by saying “marketing”.

A few years ago I was creating a product prototype for the company I work for, and I needed a connection pool. Like most developers I just wanted to drop in a pool and move on, so I took to the web to find the most popular and actively maintained library. Unfortunately, while load testing the prototype we started encountering deadlocks, and exceptions indicating connection state bleed over between threads.

Because the pool was open source, I thought I’d just pull down the code, find and fix the problems, and contribute back. But when I opened the code, I found thousands of lines more code than I was expecting.  Added to the mix were many locks, nested, sometimes acquired in one method and released in some distant place. There was simply no way to reason about where potential deadlocks lurked, even if we found and fixed the ones we encountered.

I picked up another pool and inspected its code. The lock semantics were clearer, but the volume of code was still more than 2x what I expected, especially given that it was delegating the core pooling logic to a separate library.

In addition, all of the pools I studied violated JDBC contracts in multiple ways. In as much as it is possible, a pool should return a Connection that is indistinguishable from one received in the absence of the pool. But these pools didn’t automatically close Statements when a connection was “closed” (returned), or clear warnings, or rollback uncommitted transactions, and they didn’t reset properties altered by the user such as auto-commit or transaction isolation level, and more; resulting in the next consumer getting a “dirty” connection.

I thought, “Really? This is the state of connection pools in the ecosystem after 20 years of Java?” Out of necessity and frustration, I created HikariCP.

To be fair, since I started HikariCP some pools have made some of these “correctness” behaviors configurable, but none of them do so by default and I suspect most users are running with the safety off.  At least two popular pools fail to complete our benchmark with OutOfMemory exceptions when they are enabled.  Conversely, HikariCP doesn’t support an unsafe mode of operation.

Returning to your question, as noted above there were many established pools available, so how did HikariCP become popular?  “Correctness” and reliability are a tough sell, so I focused on promoting performance, and started with a simple tweet. One follower led to another.  Some users tweeted about big performance gains, and improved reliability, and at some point in 2015 the Wix engineering team wrote a blog about switching to HikariCP.

In essence, simple word of mouth has led to HikariCP’s rising popularity, with an initial “marketing” push based on performance.  I do hope that over time more users will give equal weight to correctness and reliability, without which performance is meaningless, and for my part I plan to write more about those aspects of HikariCP.  

You quoted Edsger Dijkstra: “Simplicity is prerequisite for reliability.” – That reminds me of Antoine de Saint-Exupery’s “Perfection is Achieved Not When There Is Nothing More to Add, But When There Is Nothing Left to Take Away”. How do you manage to keep things simple when this world only ever gets more complicated?

Resisting complexity through feature-creep can be challenging.  I get a lot of requests for this or that feature, and while each may be simple in and of itself, if taken in totality would significantly increase complexity and code size.  Of course, that is not to say that I don’t add features.

For example, initial versions of HikariCP only supported a fixed size pool.  HikariCP was designed for systems with fairly constant load, and in that environment pools tend to stay at their maximum size, so I saw little need to complicate the code to support dynamic sizing.  Can you imagine a server at Google falling idle for several minutes?  Additionally, I feel like the more axes of configuration there are, the more difficult it is for users to optimally configure a pool.  However, eventually there were enough users who needed dynamic sizing, and its absence was a barrier to adoption, so support was added.  Principally, I did not want lack of dynamic sizing support to deprive users of the reliability and correctness benefits of HikariCP.

Still, I probably reject the vast majority of feature requests. As the custodian of HikariCP keeping it simple and true to that core philosophy is in the best interest of the community.  I always try to minimize the “surface area”, both in terms of code and configuration.  The larger the surface area of an API, the more difficult it is to comprehend.  Our brains have a limit for the amount of contextual information that can be held “in view” at one time; this is true in a lot of contexts.  For example, when reading code, methods larger than a certain size, or conditionals of more than a certain number of terms, are difficult to follow or reason about.  Generally, for users of HikariCP, the “surface area” is manifest in the number of configuration parameters.  While I can hardly say that “Perfection [has been] achieved”, I do feel like there is not much left to take away without cutting into functionality.

Few libraries go to the byte code level to optimise their code. While this helps in benchmarks, did it also help your users in production? What were the biggest caveats you found while micro-optimising?

Definitely.  Maybe some developers are dismissive of the potential gains, because in their minds they think, “What does it matter if connection acquisition takes 100ns or 100μs, the query is going to take 10ms anyway?”  However, pools intercept dozens of methods, and the “close()” path is typically slower than acquisition, so it’s not that simple.  I often get reports from users providing confirmation of real world performance improvements.  It’s anecdotal but one user initially commented in a bug report, “We’re testing HikariCP at the client and have had great initial success – an application loading 1 million records over multiple HTTP threads and putting them in the DB had its run time cut by 70% after moving from Tomcat CP to HikariCP!”  The follow-up comment on the bug was, “This was a bug in our side, using some unrelated non-threadsafe code.  No issue.  After fixing the bug, the code runs about 2x faster using HikariCP than Tomcat CP.”  That’s pretty good; and yet some reports surprise even me.

Regarding optimisation, and as long as we’re quoting famous thinkers, I would be remiss if I didn’t cite Knuth: “We should forget about small efficiencies, say about 97% of the time: premature optimisation is the root of all evil.”  I think the key word here is “premature”.  It is definitely better to write the code as it naturally comes and then, based on detailed profiling and benchmarking, perform “peephole optimisations” (to hijack a word from compiler theory).  At the same time, I would estimate that half of the performance gains in HikariCP have come as the result of algorithmic changes, rather than low-level optimisations.

Regarding caveats to micro-optimising, it would be hard to convey how much I have learned, and am still learning.  I’d like to give a shout-out to Aleksey Shipilëv for his excellent JMH micro-benchmark framework.  Aleksey has become somewhat of a JVM performance oracle (no pun intended, he used to work for Oracle).  The JVM performs an amazing array of optimisations, and if one is not careful then what appears to be a clever optimisation in the code simply confuses the JIT’s pattern-based optimiser and the result is slower rather than faster.

In order to effectively optimise on the JVM you sometimes end up reading the JIT source code, and you must become familiar with concepts such as dead code elimination, loop invariant hoisting, constant propagation, virtual call inlining, and many more.  Even with a good grip on these concepts I am sometimes surprised by the JVM in my attempts at optimisation.  In addition to the JIT, you really must understand the Java Memory Model (JMM) and how it maps onto CPU architectures like x86.

Lastly, after the design of algorithms, contention for shared state is the source of most bottlenecks (see the aforementioned JMM), so recently the biggest gains (for example, in v2.6.0) have come from tricks that simply avoid it; the fastest code is code that is never executed.

If there is a main takeaway, it is “trust the benchmarks”: your assumptions and intuitions are wrong more often than you imagine.

Your fellow jOOQ Tuesdays interviewee Vlad Mihalcea talked to us about queueing theory. How does this compare to what you wrote about connection pool sizing?

I have great respect for Vlad, I think we’re both members of the Mutual Admiration Society.  His FlexyPool is trying to solve a difficult problem; that being how to automatically tune optimal pool settings for varying loads.  Ultimately, the upper-bound is constrained by the database’s optimal concurrent query capacity, which is where my write-up on pool sizing comes into play.  However, there is a large amount of configuration space in-between a minimally sized pool and that upper-bound, which is where FlexyPool is trying to add value, by ensuring that the pool is “right sized”, dynamically, for the load it is servicing.

I say it is a difficult problem, because connection pools on modern multi-core servers likely present as a M/G/k queue in queueing theory; arrivals have a Markovian distribution, service times have a General distribution, and there are k servers (where “server” is defined as an abstract single-threaded processor).  Quoting wikipedia, “Most performance metrics for this queueing system are not known and remain an open problem.”  Modeling connection pools as a M/M/c queue might provide a decent approximation for the purposes of predicting queue lengths, but service times are not likely to have a Markovian distribution.  Of course, there are also non-Markovian stochastic models in queueing theory that could be applied.  Complicating everything is the fact that queued waiters (threads) can abandon the queue before service, for example when a timeout is reached.  That adds an additional twist when trying to predict queue lengths and wait times.  Hats off to Vlad for taking on this problem!

Anyway, what I wrote about setting the upper-bound on pool sizing translates to pinning the k (or c) value in those respective Markovian queueing theory models.
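To make the M/M/c approximation mentioned above a bit more concrete, here is a rough sketch of the textbook Erlang C formulas for a pool with c connections. It is only that approximation: it assumes Markovian arrivals and service times and no waiters abandoning the queue, and it is not something HikariCP or FlexyPool is known to compute. The class and method names are illustrative.

```java
// Textbook M/M/c (Erlang C) formulas; names are hypothetical.
class MmcQueue {

    // Probability that an arriving request has to wait for a free server.
    // lambda: arrival rate, mu: service rate per server, c: number of servers.
    static double probabilityOfWaiting(double lambda, double mu, int c) {
        double a = lambda / mu;   // offered load
        double rho = a / c;       // utilisation per server
        if (rho >= 1.0)
            throw new IllegalArgumentException("Unstable queue: utilisation >= 1");

        double term = 1.0;        // a^k / k!, starting at k = 0
        double sum = 0.0;
        for (int k = 0; k < c; k++) {
            sum += term;
            term *= a / (k + 1);
        }

        // term is now a^c / c!
        double tail = term / (1.0 - rho);
        return tail / (sum + tail);
    }

    // Mean number of requests waiting in the queue (not yet being served).
    static double meanQueueLength(double lambda, double mu, int c) {
        double rho = lambda / (mu * c);
        return probabilityOfWaiting(lambda, mu, c) * rho / (1.0 - rho);
    }
}
```

Pinning c corresponds to fixing the maximum pool size; lambda and mu would have to be measured for a real workload.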

You chose a Japanese word in your product: 光 (Hikari, “Light”). What’s your connection to Japan?

I’ve lived and worked in Tokyo since 2008, though I think my Japanese is far behind where it should be given my time here.  I chalk that up to preferring time at the keyboard to language study.

As you mentioned, Hikari (pronounced Hi-ka-lee) translates to “Light” (as in sunlight).  In English, it is a double entendre in the context of HikariCP; though in Japanese it would not be.  “Light” in the sense of “the speed of…”, and “light” in the sense of being light in terms of code weight.

How to Fetch Oracle 12c Implicit Cursors with JDBC and jOOQ

Earlier this week, I blogged about how to execute SQL batches with JDBC and jOOQ. This was useful for the MySQL, SQL Server, and Sybase users among you.

Today, we’ll discuss a slightly more difficult task, how to fetch Oracle 12c implicit cursors – which are essentially the same thing.

What’s an implicit cursor?

Oracle 12c added new procedures to their dynamic SQL API DBMS_SQL. Just run the following query in SQL Developer to see the results:

DECLARE
  c1 sys_refcursor;
  c2 sys_refcursor;
BEGIN
  OPEN c1 FOR SELECT 1 AS a FROM dual;
  dbms_sql.return_result(c1);
  OPEN c2 FOR SELECT 2 AS a FROM dual;
  dbms_sql.return_result(c2);
END;

The anonymous PL/SQL block contains two cursors that are opened and returned to whoever calls this block using DBMS_SQL.RETURN_RESULT. This is kind of magic, as we’re calling a procedure, passing a cursor to it, and somehow, this has a side effect on the client of this program after the program ends.

Not only can you do this in anonymous PL/SQL blocks, you can nest these calls in any procedure, of course. So, in other words, from Oracle 12c onwards, when you call a procedure, you can’t know for sure whether there will be more results than what you can see. For instance:

BEGIN
  any_procedure();
END;

The above call might just as well yield some implicit cursors. You can’t know for sure.

How to discover implicit cursors with JDBC

With JDBC, if you don’t know for sure what your query will yield as a result, you use the Statement.execute(String), or the PreparedStatement.execute() method to find out. As mentioned in the previous post, this is what you would do:

try (PreparedStatement s = c.prepareStatement(sql)) {
    fetchLoop:
    for (int i = 0, updateCount = 0;; i++) {
        boolean result = (i == 0)
            ? s.execute()
            : s.getMoreResults();
 
        if (result)
            try (ResultSet rs = s.getResultSet()) {
                System.out.println("\nResult:");
 
                while (rs.next())
                    System.out.println("  " + rs.getInt(1));
            }
        else if ((updateCount = s.getUpdateCount()) != -1)
            System.out.println("\nUpdate Count: " + updateCount);
        else
            break fetchLoop;
    }
}

Unfortunately, that won’t work on Oracle as Oracle’s JDBC driver doesn’t implement the JDBC spec correctly. I’ve documented this flaw at length in a Stack Overflow question.

Using ojdbc, the following “improved” loop needs to be written:

/* Alternatively, use this for non-PreparedStatements:
try (Statement s = cn.createStatement()) {
    Boolean result = s.execute(sql); */
try (PreparedStatement s = cn.prepareStatement(sql)) {
    // Use good old three-valued boolean logic
    Boolean result = s.execute();

    fetchLoop:
    for (int i = 0;; i++) {

        // Check for more results if not already done in 
        // this iteration
        if (i > 0 && result == null)
            result = s.getMoreResults();
        System.out.println(result);

        if (result) {
            result = null;

            try (ResultSet rs = s.getResultSet()) {
                System.out.println("Fetching result " + i);
            }
            catch (SQLException e) {
                // Ignore ORA-17283: No resultset available (1)
                if (e.getErrorCode() == 17283)
                    continue fetchLoop;
                else
                    throw e;
            }
        }
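        else if (s.getUpdateCount() == -1) {
            // Ignore -1 value if there is one more result! (2)
            if (result = s.getMoreResults())
                continue fetchLoop;
            else
                break fetchLoop;
        }
        else
            // A real update count: reset the flag so that the next
            // iteration calls getMoreResults() instead of spinning
            result = null;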
    }
}

Two elements of the above logic need more explanation:

  1. There’s a possibility of an ORA-17283: No resultset available error being raised when accessing the Statement.getResultSet() despite the previous call to Statement.execute() yielding true. If that happens, we’ll just ignore the error and try fetching another result set
  2. In case we’re using PreparedStatement, the original call to PreparedStatement.execute() will yield false (!) and the Statement.getUpdateCount() value is -1, which would normally mean that we should stop. Not in this case. Let’s just try one more time to get a result set, and tah-dah, here are our implicit result sets.

Note that the algorithm now works with both static Statement and PreparedStatement, which (very unfortunately) behave differently when calling execute().

The above will now work with any SQL statement. In case you’re using the previous SQL statement returning implicit cursors:

String sql =
    "\nDECLARE"
  + "\n  c1 sys_refcursor;"
  + "\n  c2 sys_refcursor;"
  + "\nBEGIN"
  + "\n  OPEN c1 FOR SELECT 1 AS a FROM dual;"
  + "\n  dbms_sql.return_result(c1);"
  + "\n  OPEN c2 FOR SELECT 2 AS a FROM dual;"
  + "\n  dbms_sql.return_result(c2);"
  + "\nEND;";

… you will now be able to fetch all the results:

true
true
Fetching result 1
true
Fetching result 2
false

How to get those cursors with jOOQ?

With jOOQ 3.10 (as always), you don’t need to worry about those low level JDBC details. Just call the following code:

System.out.println(
    DSL.using(cn).fetchMany(sql)
);

And you’ll get a convenient, object oriented representation of your multiple result sets in the form of an org.jooq.Results:

Result set:
+----+
|   A|
+----+
|   1|
+----+
Result set:
+----+
|   A|
+----+
|   2|
+----+

Even better, when you use a code generator to return multiple implicit cursors like this in a stored procedure, just call the generated stored procedure object like this, to get all the cursors automatically:

MyProcedure p = new MyProcedure();
p.setParameter1(x);
p.setParameter2(y);
p.execute(configuration);
Results results = p.getResults();

for (Result<?> result : results)
  for (Record record : result)
    System.out.println(record);

Done!

How to Execute SQL Batches With JDBC and jOOQ

Some databases (in particular MySQL and T-SQL databases like SQL Server and Sybase) support a very nice feature: They allow for running a “batch” of statements in a single statement. For instance, in SQL Server, you can do something like this:

-- Statement #1
DECLARE @table AS TABLE (id INT);

-- Statement #2
SELECT * FROM @table;

-- Statement #3
INSERT INTO @table VALUES (1),(2),(3);

-- Statement #4
SELECT * FROM @table;

This is a batch of 4 statements, and it can be executed as a single statement both with JDBC and with jOOQ. Let’s see how:

Executing a batch with JDBC

Unfortunately, the term “batch” has several meanings, and in this case, I don’t mean the JDBC Statement.addBatch() method, which is actually a bit clumsy as it doesn’t allow for fetching mixed update counts and result sets.

Instead, what I’ll be doing is this:

String sql =
    "\n  -- Statement #1                              "
  + "\n  DECLARE @table AS TABLE (id INT);            "
  + "\n                                               "
  + "\n  -- Statement #2                              "
  + "\n  SELECT * FROM @table;                        "
  + "\n                                               "
  + "\n  -- Statement #3                              "
  + "\n  INSERT INTO @table VALUES (1),(2),(3);       "
  + "\n                                               "
  + "\n  -- Statement #4                              "
  + "\n  SELECT * FROM @table;                        ";

try (PreparedStatement s = c.prepareStatement(sql)) {
    fetchLoop:
    for (int i = 0, updateCount = 0;; i++) {
        boolean result = (i == 0)
            ? s.execute()
            : s.getMoreResults();

        if (result)
            try (ResultSet rs = s.getResultSet()) {
                System.out.println("\nResult:");

                while (rs.next())
                    System.out.println("  " + rs.getInt(1));
            }
        else if ((updateCount = s.getUpdateCount()) != -1)
            System.out.println("\nUpdate Count: " + updateCount);
        else
            break fetchLoop;
    }
}

The output of the above program being:

Result:

Update Count: 3

Result:
  1
  2
  3

The above is a somewhat “hidden”, or at least not everyday, usage of the JDBC API. Mostly, you’ll be using Statement.executeQuery() when you’re expecting a ResultSet, or Statement.executeUpdate() otherwise.

But in our case, we don’t really know what’s happening. We’re going to discover the result types on the fly, when executing the statement. Here are the main JDBC API features that we’re using, along with an explanation:

  • Statement.execute(): This method should be used if we don’t know the result type. The method returns a boolean, which is true when the first statement in the batch produced a ResultSet and false otherwise.
  • Statement.getMoreResults(): This method returns the same kind of boolean value as the previous Statement.execute() method, but it does so for the next statement in the batch (i.e. for every statement except the first).
  • If the current result is a ResultSet (the boolean was true), then we’ll obtain that ResultSet through Statement.getResultSet() (we can obviously no longer call the usual Statement.executeQuery() to obtain the ResultSet).
  • If the current result is not a ResultSet (the boolean was false), then we’ll check the update count value through Statement.getUpdateCount().
  • If the update count is -1, then we’ve reached the end of the batch.

What a nice state machine!

The nice thing about this is that a batch may be completely nondeterministic. E.g. there may be triggers, T-SQL blocks (e.g. an IF statement), stored procedures, and many other things that contribute result sets and/or update counts. In some cases, we simply don’t know what we’ll get.
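Because the shape of the outcome is only discovered at run time, it can be handy to package the loop above as a small reusable helper. The following is a minimal sketch, not part of JDBC or jOOQ: the class name MultiResults, the method fetchAll(), and the choice to render each outcome as a String are assumptions made purely for illustration.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: names and the String-based output are illustrative only
final class MultiResults {

    // Executes the SQL and collects every outcome of the batch, in order,
    // as either "rows: [...]" (first column only) or "update count: n"
    static List<String> fetchAll(Statement s, String sql) throws SQLException {
        List<String> outcomes = new ArrayList<>();

        // true: the first outcome is a ResultSet, false: anything else
        boolean isResultSet = s.execute(sql);

        while (true) {
            if (isResultSet) {
                try (ResultSet rs = s.getResultSet()) {
                    List<Object> firstColumn = new ArrayList<>();

                    while (rs.next())
                        firstColumn.add(rs.getObject(1));

                    outcomes.add("rows: " + firstColumn);
                }
            }
            else {
                int updateCount = s.getUpdateCount();

                // Neither a ResultSet nor an update count: end of the batch
                if (updateCount == -1)
                    break;

                outcomes.add("update count: " + updateCount);
            }

            // Move on to the next outcome of the batch
            isResultSet = s.getMoreResults();
        }

        return outcomes;
    }
}
```

Like the loop it wraps, this sketch does not deal with warnings or errors raised in the middle of a batch.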

Executing a batch with jOOQ

It’s great that the JDBC API designers have thought of this exotic API usage on a rather low level. This makes JDBC extremely powerful. But who remembers the exact algorithm all the time? After all, the above minimalistic version required around 20 lines of code for something as simple as that.

Compare this to the following jOOQ API usage:

System.out.println(
    DSL.using(c).fetchMany(sql)
);

The result being:

Result set:
+----+
|  id|
+----+
Update count: 3
Result set:
+----+
|  id|
+----+
|   1|
|   2|
|   3|
+----+

Huh! Couldn’t get much simpler than that! Let’s walk through what happens:

The DSLContext.fetchMany() method is intended for use when users know there will be many result sets and/or update counts. Unlike JDBC, which reuses its ordinary API for this, jOOQ has a different API here to clearly distinguish between behaviours. The method then eagerly fetches all the results and update counts in one go (lazy fetching is on the roadmap with issue #4503).

The resulting type is org.jooq.Results, a type that extends List<Result>, which allows for iterating over the results only, for convenience. If a mix of results and update counts needs to be consumed, the Results.resultsOrRows() method can be used.

A note on warnings / errors

Note that if your batch raises errors, then the above JDBC algorithm is incomplete. Read more about this in this follow-up post.

How to Prevent JDBC Resource Leaks with JDBC and with jOOQ

In a recent consulting gig, I was analysing a client’s connection pool issue in a production system, where during some peak loads, all the Java processes involving database interactions just started queueing up until nothing really worked anymore. No exceptions, though, and when the peak load was gone in the evening, everything returned to normal. The database load looked pretty healthy at the time, so no actual database problem was involved – the problem had to be a client side problem.

WebLogic operations teams quickly identified the connection pool to be the bottleneck. All the connections were constantly allocated to some client process. The immediate thought was: A resource leak is happening, and it didn’t show before because this was an exceptional situation: Around the beginning of the new year when everyone wanted to download their electronic documents from the bank (and some new features introduced many more document related database calls).

The obvious problem

That particular system still runs a lot of legacy code in Java 6 style, which means, there are tons of code elements of the following kind:

Connection connection = null;
try {

  // Get the connection from the pool through JNDI
  connection = JDBCHelper.getConnection();
}
finally {

  // Release the connection
  JDBCHelper.close(connection);  
}

While the above code is perfectly fine, and 99% of all database interactions were of the above type, there was an occasional instance of someone badly copy-pasting some code and doing something like this:

Connection connection = JDBCHelper.getConnection();
PreparedStatement stmt = null;

try {
  stmt = connection.prepareStatement("SELECT ...");
}
finally {

  // Release the statement
  JDBCHelper.close(stmt);
}

// But the connection is never released

Sometimes, things were even more subtle, as a utility method expected a connection like this:

// Utility method doesn't have to close the connection:
public void databaseCalls(Connection connection) {
  PreparedStatement stmt = null;

  try {
    stmt = connection.prepareStatement("SELECT ...");
  }
  finally {

    // Release the statement
    JDBCHelper.close(stmt);
  }
}

public void businessLogic() {
  // Oops, subtle connection leak
  databaseCalls(JDBCHelper.getConnection());
}

Thoroughly fixing these things

There’s a quick fix to all these problems. The easiest fix is to just continue rigorously using the JDBCHelper.close() method (or just call connection.close() with appropriate error handling) every time. But apparently, that’s not easy enough as there will always be a non-vigilant developer (or a junior developer who doesn’t know these things), who will get it wrong, who will simply forget things.

I mean, even the official JDBC tutorial gets it “wrong” on their first page:
https://docs.oracle.com/javase/tutorial/jdbc/overview/index.html

The bad example being:

public void connectToAndQueryDatabase(
    String username, String password) {

    Connection con = DriverManager.getConnection(
                         "jdbc:myDriver:myDatabase",
                         username,
                         password);

    Statement stmt = con.createStatement();
    ResultSet rs = stmt.executeQuery(
        "SELECT a, b, c FROM Table1");

    while (rs.next()) {
        int x = rs.getInt("a");
        String s = rs.getString("b");
        float f = rs.getFloat("c");
    }
}

All resources leak in this example!

Of course, it’s just an example, and of course, it’s not a terrible situation, because resources can usually clean themselves up when they go out of scope, i.e. when the GC kicks in. But as software engineers we shouldn’t rely on that, and as the production issues have shown, there are always edge cases, where precisely this lack of vigilance will cause great harm. After all,

It works on my machine

… is simply not an excuse. We should design our software for production use.

Fix #1: Use try-with-resources. Always

If you want to stay on the safe side, always follow this rule:

The scope that acquires the resource, closes the resource

As long as you’re working with JDBC, save yourself the trouble of writing those JDBCUtilities classes that close non-null resources and safely catch exceptions that may arise. Just use try-with-resources, all the time. For instance, take the example from the Oracle JDBC tutorial, which should read:

public void connectToAndQueryDatabase(
     String username, String password) {

    // All of these resources are allocated in this method. Thus,
    // this method's responsibility is to also close / free all
    // these resources.
    try (Connection con = DriverManager.getConnection(
            "jdbc:myDriver:myDatabase", username, password);
         Statement stmt = con.createStatement();
         ResultSet rs = stmt.executeQuery(
            "SELECT a, b, c FROM Table1")) {

        while (rs.next()) {
            int x = rs.getInt("a");
            String s = rs.getString("b");
            float f = rs.getFloat("c");
        }
    }
}

This already feels that much better and cleaner, doesn’t it? All the resources are acquired in the above method, and the try-with-resources block will close all of them when they go out of scope. It’s just syntax sugar for something we’ve been doing manually all the time. But now, we will (hopefully) never again forget!
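The same rule also settles the “subtle” utility method case from earlier: the utility method closes only what it creates, and the caller that acquired the connection closes it. Here is a minimal sketch of that split. The class name OwnedResources, the use of a DataSource instead of the JDBCHelper from the legacy code, and the "SELECT 1" query are assumptions made for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical example class: names and the query are illustrative only
final class OwnedResources {

    // This method did not acquire the connection, so it does not close it.
    // It closes only the statement and result set that it created itself.
    static int databaseCalls(Connection connection) throws SQLException {
        try (PreparedStatement stmt = connection.prepareStatement("SELECT 1");
             ResultSet rs = stmt.executeQuery()) {

            int rows = 0;
            while (rs.next())
                rows++;

            return rows;
        }
    }

    // This method acquires the connection, so it is also the one closing it
    static int businessLogic(DataSource ds) throws SQLException {
        try (Connection connection = ds.getConnection()) {
            return databaseCalls(connection);
        }
    }
}
```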

Of course, you could introduce automatic leak detection in your integration tests, because it’s rather easy to proxy the JDBC DataSource and count all connection acquisitions and closings. An example can be seen in this post:
The best way to detect database connection leaks
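To give an idea of what such a counting proxy could look like, here is a minimal sketch based on java.lang.reflect.Proxy. It is not the implementation from the linked post. The class name LeakCountingDataSource and its two accessor methods are hypothetical, and a real test setup would probably also want to record where each connection was acquired.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import javax.sql.DataSource;

// Hypothetical test utility: class and method names are illustrative only
final class LeakCountingDataSource {
    private final AtomicInteger open = new AtomicInteger();
    private final DataSource proxy;

    LeakCountingDataSource(DataSource delegate) {
        ClassLoader cl = LeakCountingDataSource.class.getClassLoader();

        proxy = (DataSource) Proxy.newProxyInstance(
            cl, new Class<?>[] { DataSource.class }, (p, method, args) -> {
                Object result = invoke(delegate, method, args);

                // Only connection acquisitions are counted and wrapped
                if (!method.getName().equals("getConnection"))
                    return result;

                open.incrementAndGet();
                AtomicBoolean closed = new AtomicBoolean();

                return Proxy.newProxyInstance(
                    cl, new Class<?>[] { java.sql.Connection.class }, (p2, m2, a2) -> {

                        // Count only the first close() of each connection
                        if (m2.getName().equals("close")
                                && closed.compareAndSet(false, true))
                            open.decrementAndGet();

                        return invoke(result, m2, a2);
                    });
            });
    }

    // The DataSource to hand to the code under test
    DataSource dataSource() {
        return proxy;
    }

    // Connections acquired but not yet closed; should be 0 after each test
    int openConnections() {
        return open.get();
    }

    // Calls the real object and rethrows its original exception, if any
    private static Object invoke(Object target, Method m, Object[] args)
            throws Throwable {
        try {
            return m.invoke(target, args);
        }
        catch (InvocationTargetException e) {
            throw e.getCause();
        }
    }
}
```

An integration test would wrap its real DataSource with this class, run the code under test against dataSource(), and then assert that openConnections() is back to 0.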

Fix #2: Use jOOQ, which manages resources for you

Historically, JDBC works on lazy resources that are kept around for a while. The assumption in 1997 (when JDBC was introduced) was that database interactions were relatively slow and it made sense to fetch and process one record at a time, even for moderately sized result sets.

In fact, it was even common to abort fetching records from a cursor when we’ve had enough results and close it eagerly before consuming all the rows.

Today, these assumptions are (mostly) no longer true, and jOOQ (like other, more modern database APIs) inverts the lazy/eager API default behaviour. In jOOQ, the JDBC types have the following corresponding counterparts:

  • JDBC DataSource / Connection => jOOQ ConnectionProvider:
    jOOQ doesn’t know the concept of an “open connection” like JDBC. jOOQ only has this ConnectionProvider which works in a similar way to JDBC’s / JavaEE’s DataSource. The semantics here is that the connection / session is “managed” and jOOQ will acquire / release it once per statement. This happens automatically, so users don’t have to worry about any connection resource.
  • JDBC Statement (and subtypes) => jOOQ Query:
    While the JDBC statement (especially the PreparedStatement) is a resource that binds some server-side objects, such as an execution plan, for instance, jOOQ again doesn’t have such a resourceful thing. The Query just wraps the SQL string (or AST) and bind variables. All resources are created lazily only when the query is actually executed – and released immediately after execution. Again, users don’t have to worry about any statement resource.
  • JDBC ResultSet => jOOQ Result:
    The JDBC ResultSet corresponds to a server-side cursor, another object that possibly binds quite a few resources, depending on your fetch mode. Again, in jOOQ no resources are bound / exposed, because jOOQ by default eagerly fetches your entire result set – the assumption being that a low-level optimisation here doesn’t add much value for moderately sized result sets.

With the above inverted defaults (from lazy to eager resource allocation / freeing), the jOOQ-ified Oracle JDBC tutorial code would look like this:

Working with a standalone Connection

public void connectToAndQueryDatabase(
    String username, String password) {

    // If you're using a standalone connection, you can pass that
    // one to jOOQ, but you're still responsible for closing it
    // again:
    try (Connection con = DriverManager.getConnection(
            "jdbc:myDriver:myDatabase", username, password)) {

        // There is no statement resource anymore, and the result
        // is fetched eagerly from the database, so you don't have
        // to worry about it
        for (Record record : DSL.using(con).fetch(
                "SELECT a, b, c FROM Table1")) {
            int x = record.get("a", int.class);
            String s = record.get("b", String.class);
            float f = record.get("c", float.class);
        }
    }
}

Working with a connection pool / DataSource

// You probably have some means of injecting / discovering
// a JDBC DataSource, e.g. from Spring, or from your JavaEE
// container, etc.
@Inject
DataSource ds;

public void connectToAndQueryDatabase(
    String username, String password) {

    // With a DataSource, jOOQ will automatically acquire and
    // close the JDBC Connection for you, so the last remaining
    // resource has also disappeared from your client code.
    for (Record record : DSL
           .using(ds, SQLDialect.ORACLE)
           .fetch("SELECT a, b, c FROM Table1")) {
        int x = record.get("a", int.class);
        String s = record.get("b", String.class);
        float f = record.get("c", float.class);
    }
}

With jOOQ, all resource management is automatic, by default, because by default, you don’t want to worry about this low level stuff. It’s not 1997 anymore. The JDBC API really is too low level for most use-cases.

If you do want to optimise resource management and not fetch everything eagerly, you can, of course. jOOQ will allow you to fetch your results lazily, in two ways:

Using a Cursor

@Inject
DataSource ds;

public void connectToAndQueryDatabase(
    String username, String password) {

    // jOOQ's Cursor type is a resource, just like JDBC's
    // ResultSet. It actually keeps a reference to an open
    // ResultSet, internally. This is an opt-in
    // feature, though, only to be used if desired.
    try (Cursor<Record> cursor = DSL
            .using(ds, SQLDialect.ORACLE)
            .fetchLazy("SELECT a, b, c FROM Table1")) {

        for (Record record : cursor) {
            int x = record.get("a", int.class);
            String s = record.get("b", String.class);
            float f = record.get("c", float.class);
        }
    }
}

Using a Java 8 Stream (lazy, resourceful version)

@Inject
DataSource ds;

public void connectToAndQueryDatabase(
    String username, String password) {

    // This can also work with a stream
    try (Stream<Record> stream = DSL
        .using(ds, SQLDialect.ORACLE)
        .fetchStream("SELECT a, b, c FROM Table1")) {

        stream.forEach(record -> {
            int x = record.get("a", int.class);
            String s = record.get("b", String.class);
            float f = record.get("c", float.class);
        });
    }
}

Unfortunately, there are no auto-closing streams in Java, which is why we have to resort to using the try-with-resources statement, breaking the fluency of jOOQ’s API.
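This is plain java.util.stream behaviour and can be reproduced without any database. The following sketch uses only JDK classes, and the class name StreamCloseDemo is made up for this example. It registers an onClose() handler and shows that a terminal operation alone does not run it, while try-with-resources does.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;

// Hypothetical demo class, using JDK streams only
final class StreamCloseDemo {

    // Returns { closed after terminal operation only, closed after try-with-resources }
    static boolean[] run() {
        AtomicBoolean first = new AtomicBoolean();
        AtomicBoolean second = new AtomicBoolean();

        // A terminal operation consumes the stream, but does not close it
        Stream<String> unmanaged = Stream.of("a", "b", "c")
            .onClose(() -> first.set(true));
        unmanaged.forEach(s -> {});
        boolean closedAfterTerminalOp = first.get();

        // try-with-resources calls close(), which runs the onClose() handler
        try (Stream<String> managed = Stream.of("a", "b", "c")
                .onClose(() -> second.set(true))) {
            managed.forEach(s -> {});
        }

        return new boolean[] { closedAfterTerminalOp, second.get() };
    }
}
```

Calling StreamCloseDemo.run() yields false for the first entry and true for the second, which is why a resourceful stream needs the explicit try block.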

Do note though, that you can use the Stream API in an eager fashion:

Using a Java 8 Stream (eager version)

@Inject
DataSource ds;

public void connectToAndQueryDatabase(
    String username, String password) {

    // Fetch the jOOQ Result eagerly into memory, then stream it
    // Again, no resource management
    DSL.using(ds, SQLDialect.ORACLE)
       .fetch("SELECT a, b, c FROM Table1")
       .stream()
       .forEach(record -> {
            int x = record.get("a", int.class);
            String s = record.get("b", String.class);
            float f = record.get("c", float.class);
        });
}

Conclusion

Developers, unfortunately, often suffer from

Works on my machine

This leads to problems that can be discovered only in production, under load. When it comes to resources, it is important to constantly remind ourselves that …

The scope that acquires the resource, closes the resource

JDBC (and the JDK’s IO APIs), “unfortunately”, deal with resources on a very low level. This way, their default behaviour is very resource-efficient. For instance, when you only need to read a file header, you don’t load the entire file into memory through the InputStream. You can explicitly, manually, only load the first few lines.
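To illustrate that file header case, here is a small sketch using only JDK classes. The class name HeaderReader and the assumption that the header is the first line of a UTF-8 text file are mine. The reader pulls in a buffer's worth of data at most, not the whole file, and the method that opens the file is also the one that closes it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: assumes the "header" is the first line of a text file
final class HeaderReader {

    // Opens the file, reads only its first line, and closes the file again.
    // Returns null if the file is empty.
    static String firstLine(Path file) throws IOException {
        try (BufferedReader reader =
                 Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            return reader.readLine();
        }
    }
}
```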

But in many applications, this default and its low level nature get in the way of correctness (accidental resource leaks are easy to create), and convenience (a lot of boilerplate code needs to be written).

With database interactions, it’s usually best to migrate your JDBC code towards a more modern API like jOOQ, which abstracts resource handling away in its API and inverts the lazy/eager semantics: Eager by default, lazy on demand.

More information about the differences between jOOQ and JDBC can be seen here, in the manual.

Using Stored Procedures With JPA, JDBC… Meh, Just Use jOOQ

The current edition of the Java magazine has an article about Big Data Best Practices for JDBC and JPA by Josh Juneau:
http://www.javamagazine.mozaicreader.com/MayJune2016

The article shows how to use a stored procedure with JDBC (notice how resources aren’t closed, unfortunately. This is commonly forgotten, even in Java Magazine articles):

// Using JDBC to call upon a database stored
// procedure
CallableStatement cs = null;
try {
    cs = conn.prepareCall("{call DUMMY_PROC(?,?)}");
    cs.setString(1, "This is a test");
    cs.registerOutParameter(2, Types.VARCHAR);
    cs.executeQuery();

    // Do something with result
    String returnStr = cs.getString(2);
} catch (SQLException ex){
    ex.printStackTrace();
}

And with JPA:

// Utilize JPA to call a database stored procedure
// Add @NamedStoredProcedureQuery to entity class
@NamedStoredProcedureQuery(
    name="createEmp", procedureName="CREATE_EMP",
    parameters = {
        @StoredProcedureParameter(
            mode= ParameterMode.IN,
            type=String.class,
            name="first"),
        @StoredProcedureParameter(
            mode = ParameterMode.IN,
            type=String.class,
            name="last")
    })

// Calling upon stored procedure
StoredProcedureQuery qry =
    em.createStoredProcedureQuery("createEmp");
qry.setParameter("first", "JOSH");
qry.setParameter("last","JUNEAU");
qry.execute();

Specifically the latter was also recently discussed in blog posts by Vlad Mihalcea and Thorben Janssen.
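As for the unclosed resources in the JDBC example, a version that closes the CallableStatement could look like the following sketch. The procedure DUMMY_PROC and its two parameters are taken from the article's snippet. The wrapper class DummyProcCall is hypothetical, and the sketch uses execute() instead of executeQuery(), assuming the procedure only returns its value through the OUT parameter.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

// Hypothetical wrapper around the article's DUMMY_PROC call
final class DummyProcCall {

    // The connection is owned by the caller; only the statement is closed here
    static String call(Connection conn, String input) throws SQLException {
        try (CallableStatement cs = conn.prepareCall("{call DUMMY_PROC(?,?)}")) {
            cs.setString(1, input);
            cs.registerOutParameter(2, Types.VARCHAR);
            cs.execute();

            // The OUT parameter is read before the statement is closed
            return cs.getString(2);
        }
    }
}
```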

Do you like verbosity and complexity?

No? Neither do we. This is why we give you a third option instead: Just use jOOQ. Here’s the equivalent jOOQ code:

// JDBC example:
String returnStr = Routines.dummyProc(
    config, "This is a test");

// JPA example
Routines.createEmp(config, "JOSH", "JUNEAU");

Yes! That’s it. Don’t waste time manually configuring your bind variables with JDBC API calls, or JPA annotations. No one likes writing annotations for stored procedures. With jOOQ and jOOQ’s code generator, procedure calls are:

  • A one-liner
  • A no-brainer
  • A way to bring back the fun to stored procedures

Learn more about using Oracle stored procedures with nested collections and object types here:
https://blog.jooq.org/2014/11/04/painless-access-from-java-to-plsql-procedures-with-jooq

It is all about the JDBC Basics

We’re very happy to announce a guest post by Marco Behler, who has been blogging about jOOQ in the past.

Marco started out in programming (reverse-engineering, actually) and now mainly programmes on the JVM in his day-to-day work. He also always had a sweet tooth for strategy and marketing. Marco Behler GmbH is the result of that hybrid role.

It is all about the JDBC Basics

It is one of those days.

You are reading the Spring documentation’s @Transactional section and still don’t understand the difference between logical and physical transaction scopes. Simultaneously your app throws a LazyInitializationException and you have no idea why. To top it off you see spontaneous database deadlocks in production and you suspect your connection pool is leaking connections... somehow.

Know what most likely would have helped instead of banging your head against the wall? Spending a couple (literally) of hours on learning the JDBC basics. Let’s find out why:

What are the JDBC basics?

The basics are opening up/closing database connections and then working with transactions. Also understanding how deadlocks, pessimistic and optimistic locking work on a plain JDBC level. A bit of isolation levels and savepoints and then directly on to connection pools and JDBC driver logging. That’s it. Seriously.

Why are the basics so important?

Everything you will encounter in frameworks like Spring, Hibernate, jOOQ etc. builds on these basics. For example, there are a gazillion topics on the internet regarding Hibernate’s LazyInitializationException and I was scared of that particular exception myself many years ago. But what else would you expect trying to query the database without having a connection to the database open (which is basically all that this exception is)?

The same with Spring’s “transaction framework”. There is so much content, or shall we say (F)ear/(U)ncertainty/(D)oubt, out there on how to open up transactions with Spring, be it programmatically, with annotations or XML. But what if you knew that under the hood, there is only one way (and actually one line of code) to open up transactions in the JDBC world?
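That one line is presumably Connection.setAutoCommit(false), which is how plain JDBC starts a transaction. Everything else is commit() or rollback(). A minimal sketch of what a framework has to do under the hood might look like this, where the class PlainJdbcTransaction, the Work interface, and the inTransaction() method are names invented for this example:

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical helper: names are illustrative, not taken from any framework
final class PlainJdbcTransaction {

    // The work to run inside the transaction
    interface Work {
        void run(Connection c) throws SQLException;
    }

    static void inTransaction(Connection c, Work work) throws SQLException {
        boolean previous = c.getAutoCommit();

        // This is the one line that "opens" a transaction in JDBC
        c.setAutoCommit(false);

        try {
            work.run(c);
            c.commit();
        }
        catch (SQLException | RuntimeException e) {
            c.rollback();
            throw e;
        }
        finally {
            // Hand the connection back in the state in which we received it
            c.setAutoCommit(previous);
        }
    }
}
```

The sketch leaves out savepoints, isolation levels, and the case where rollback() itself fails.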

Let me not even get started on the various (mis)configurations of connection pools you see in production in the wild. Or the unawareness of JDBC (driver) logging, which usually leads to debugging in the wild. All basics, which you can master in a couple of hours and which will help you for a lifetime!

Why do people not just learn the basics?

In every middle-sized project there is a ton of technologies involved and there usually is no clear-cut path on how to learn all of them or how they all work together. It simply takes a lot of time and effort to dig through everything.

There’s JPA sessions and JDBC connections and then Spring somehow provides those transactional proxies in 5 different ways and then some other colleague just put jOOQ into the mix, but then somehow my session doesn’t flush and my objects don’t get persisted and the HibernateTransactionManager is not working as expected.

With all of this, I would also hope for my database transactions just to commit – god forbid what happens on rollback 🙂

But in the end, every technology mentioned is just a layer on top of JDBC. If you understand transactions or deadlocks or savepoints on the basic level, then Spring or Hibernate or jOOQ will not throw you off.

So what do you recommend?

If you want to get miles ahead in your day-to-day database programming, you have to start with the basics. Step-by-Step. And then you will see most of your problems automatically evaporate.

Out of my extensive database programming experience, I created an ebook with a ton of ready-to-run exercises, which will take you from Java database novice to expert. At your own pace. You can literally copy the source code of every chapter into your IDE, run it and (hopefully) learn from it. It covers plain JDBC, Spring, Hibernate, jOOQ (soon) and also distributed transactions.

You can read the whole book for free online here, and I would love to get your feedback! I would really like to let the community feedback flow back into future editions of the book. In addition, if you like what you see and the exercises help you, you can also show your support by getting a paid digital version (pdf, epub, mobi).

In any case…

…learn your JDBC basics – and you will profit from them for a lifetime!