Correlated Subqueries are Evil and Slow. Or are They?

A common myth in SQL is the idea that correlated subqueries are evil and slow. For example, this query here:

SELECT 
  first_name, last_name,
  (SELECT count(*) 
   FROM film_actor fa 
   WHERE fa.actor_id = a.actor_id)
FROM actor a

It “forces” the database engine to run a nested loop of the form (in pseudo code):

for (Actor a : actor) {
  output(
    a.first_name,
    a.last_name,
    film_actor.where(fa -> fa.actor_id == a.actor_id)
              .size()
  );
}

So, for every actor, collect all the corresponding film_actors and count them. This will produce the number of films each actor has played in.

It appears that it would be much better to run this query in “bulk”, i.e. to run:

SELECT 
  first_name, last_name, count(fa.actor_id)
FROM actor a
LEFT JOIN film_actor fa USING (actor_id)
GROUP BY actor_id, first_name, last_name

But is it really faster? And if so, why would you expect that?

Bulk aggregation vs nested loops

Bulk aggregation in this case really just means that we’re collecting all actors and all film_actors and we then merge them in memory for the group by operation. The execution plan (in Oracle) looks like this:

-------------------------------------------------------------------
| Id  | Operation              | Name                    | A-Rows |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                         |    200 |
|   1 |  HASH GROUP BY         |                         |    200 |
|*  2 |   HASH JOIN (OUTER)    |                         |   5462 |
|   3 |    TABLE ACCESS FULL   | ACTOR                   |    200 |
|   4 |    INDEX FAST FULL SCAN| IDX_FK_FILM_ACTOR_ACTOR |   5462 |
-------------------------------------------------------------------

There are 5462 rows in our film_actor table. We join them all to their actors, then group and aggregate to get to the result. Let’s compare this to the nested loop’s plan:

-----------------------------------------------------------------------
| Id  | Operation         | Name                    | Starts | A-Rows |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                         |      1 |    200 |
|   1 |  SORT AGGREGATE   |                         |    200 |    200 |
|*  2 |   INDEX RANGE SCAN| IDX_FK_FILM_ACTOR_ACTOR |    200 |   5462 |
|   3 |  TABLE ACCESS FULL| ACTOR                   |      1 |    200 |
-----------------------------------------------------------------------

I’ve now included the “Starts” column to illustrate that, after having collected all 200 actors, we start the subquery 200 times to get the count for each actor. But this time, with an index range scan.

From the plan only, can we tell which one is faster? Bulk aggregation will need more memory (to load all the data), but has a lower algorithmic complexity (linear). Nested looping will need less memory (all the required info is available from the index directly) but seems to have a higher algorithmic complexity (quadratic).

Fact is, this isn’t exactly true.

The bulk aggregation is indeed linear, but according to O(M + N), where M = number of actors and N = number of film_actors. Nested looping, on the other hand, isn’t quadratic: it’s O(M log N), because each of the M index range scans costs only O(log N). We don’t need to traverse the entire index to get each count.

At some point, the higher complexity is worse, but with this small amount of data, it’s not:

On the x-axis is the size of N and on the y-axis is the “complexity value”, i.e. how much time (or memory) is used by an algorithm.

[Chart: Effects of algorithmic complexity for large N]

[Chart: Effects of algorithmic complexity for “small” N]

Here’s a disclaimer about the above:

Algorithmic complexity for “small N” = “works on my machine” 😉
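Still, a quick back-of-envelope calculation for our data set (M = 200 actors, N = 5462 film_actors) hints at why the nested loop can win here, if we naively read the complexity terms as step counts and ignore all constant factors:

M · log2(N) ≈ 200 · 12.4 ≈ 2480   (nested loop with index range scans)
M + N       = 200 + 5462 = 5662   (bulk hash join and aggregation)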

There’s nothing better than proving things with measurements. Let’s run the following PL/SQL program:

SET SERVEROUTPUT ON
DECLARE
  v_ts TIMESTAMP;
  v_repeat CONSTANT NUMBER := 1000;
BEGIN
  v_ts := SYSTIMESTAMP;
  
  FOR i IN 1..v_repeat LOOP
    FOR rec IN (
      SELECT 
        first_name, last_name,
        (SELECT count(*) 
         FROM film_actor fa 
         WHERE fa.actor_id = a.actor_id)
      FROM actor a
    ) LOOP
      NULL;
    END LOOP;
  END LOOP;
  
  dbms_output.put_line('Nested select: ' || (SYSTIMESTAMP - v_ts));
  v_ts := SYSTIMESTAMP;
  
  FOR i IN 1..v_repeat LOOP
    FOR rec IN (
      SELECT 
        first_name, last_name, count(film_id)
      FROM actor a
      LEFT JOIN film_actor fa USING (actor_id)
      GROUP BY actor_id, first_name, last_name
    ) LOOP
      NULL;
    END LOOP;
  END LOOP;
  
  dbms_output.put_line('Group by     : ' || (SYSTIMESTAMP - v_ts));
END;
/

After three runs, and against our standard Sakila database (get it here: https://www.jooq.org/sakila) with 200 actors and 5462 film_actors, we can see that the nested select consistently outperforms the bulk group by:

Nested select: +000000000 00:00:01.122000000
Group by     : +000000000 00:00:03.191000000

Nested select: +000000000 00:00:01.116000000
Group by     : +000000000 00:00:03.104000000

Nested select: +000000000 00:00:01.122000000
Group by     : +000000000 00:00:03.228000000

Helping the optimiser

Some interesting feedback by Markus Winand (author of http://sql-performance-explained.com) was given on Twitter:

There is a third option: Nesting the GROUP BY operation in a derived table:

SELECT
  first_name, last_name, c
FROM actor a
JOIN (
  SELECT actor_id, count(*) c
  FROM film_actor
  GROUP BY actor_id
) USING (actor_id)

which produces a slightly better plan than the “ordinary” group by query:

--------------------------------------------------------------------
| Id  | Operation               | Name                    | A-Rows |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                         |    200 |
|*  1 |  HASH JOIN              |                         |    200 |
|   2 |   TABLE ACCESS FULL     | ACTOR                   |    200 |
|   3 |   VIEW                  |                         |    200 |
|   4 |    HASH GROUP BY        |                         |    200 |
|   5 |     INDEX FAST FULL SCAN| IDX_FK_FILM_ACTOR_ACTOR |   5462 |
--------------------------------------------------------------------

Adding it to the benchmark as such:

SET SERVEROUTPUT ON
DECLARE
  v_ts TIMESTAMP;
  v_repeat CONSTANT NUMBER := 1000;
BEGIN
  v_ts := SYSTIMESTAMP;
   
  FOR i IN 1..v_repeat LOOP
    FOR rec IN (
      SELECT
        first_name, last_name,
        (SELECT count(*) 
         FROM film_actor fa 
         WHERE fa.actor_id = a.actor_id)
      FROM actor a
    ) LOOP
      NULL;
    END LOOP;
  END LOOP;
   
  dbms_output.put_line('Nested select            : ' || (SYSTIMESTAMP - v_ts));
  v_ts := SYSTIMESTAMP;
   
  FOR i IN 1..v_repeat LOOP
    FOR rec IN (
      SELECT
        first_name, last_name, count(film_id)
      FROM actor a
      LEFT JOIN film_actor fa USING (actor_id)
      GROUP BY actor_id, first_name, last_name
    ) LOOP
      NULL;
    END LOOP;
  END LOOP;
   
  dbms_output.put_line('Group by                 : ' || (SYSTIMESTAMP - v_ts));
  v_ts := SYSTIMESTAMP;
  
  FOR i IN 1..v_repeat LOOP
    FOR rec IN (
      SELECT 
        first_name, last_name, c
      FROM actor a
      JOIN (
        SELECT actor_id, count(*) c
        FROM film_actor
        GROUP BY actor_id
      ) USING (actor_id)
    ) LOOP
      NULL;
    END LOOP;
  END LOOP;

  dbms_output.put_line('Group by in derived table: ' || (SYSTIMESTAMP - v_ts));
END;
/

… shows that it’s already very close to the nested select option:

Nested select            : +000000000 00:00:01.371000000
Group by                 : +000000000 00:00:03.417000000
Group by in derived table: +000000000 00:00:01.592000000

Nested select            : +000000000 00:00:01.236000000
Group by                 : +000000000 00:00:03.329000000
Group by in derived table: +000000000 00:00:01.595000000

Conclusion

We have shown that under some circumstances, correlated subqueries can be better than bulk aggregation. In Oracle. With small to medium sized data sets. In other cases, that’s not true: as M and N, our two algorithmic complexity variables, increase, O(M log N) will be much worse than O(M + N).

The conclusion here is: Don’t trust any initial judgement. Measure. When you run such a query a lot of times, 3x slower can make a big difference. But don’t go replace all your bulk aggregations either. 🙂

Liked this article? The content is part of our Data Geekery SQL performance training

Cyclops-react Organises the Cambrian Explosion of Java 8 Libraries

We’re excited to announce another very interesting guest post on the jOOQ Blog by John McClean from AOL.

AOL is a global digital media and technology company. Founded in 1985 and once known as America Online, AOL is now part of the Verizon Group. AOL focuses on four areas – video, mobile, ad technology and platforms, and open ecosystems. AOL connects publishers with advertisers across its global, programmatic platforms, tapping into Microsoft inventory and original content brands like TechCrunch, The Huffington Post and MAKERS.

John is an Architect at AOL. He works in the ad tech and platforms group, where he leads the advertising demand side forecasting team, a team that builds and runs a system that processes billions of RTB, impression and viewability records in real time to generate price volume curves and other forecasts for advertising campaigns in milliseconds. John is also the lead developer for the AOL open source projects cyclops-react and Microserver. Extracted from AOL’s forecasting system, these projects allow AOL to rapidly deploy new features that work at scale, by guiding Java developers along the path of functional, reactive microservices.

What is Cyclops-react?

The arrival of Lambda expressions and default methods in Java 8 heralded the biggest structural changes to the Java language in a decade. Building on top of this were some cool new APIs, such as Stream, Optional and CompletableFuture – finally, Java developers could code in a more functional style. While this was very welcome, for many the enhancements did not quite go far enough.

Stream, Optional, CompletableFuture all share the same abstract structure and obey the same rules. Yet the APIs don’t agree on common method names, never mind provide a common interface. For example, Stream#map / Optional#map becomes CompletableFuture#thenApply. Also, the functionality added to Stream & Optional is missing from collections generally. Where is List#map?
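To see the naming mismatch concretely, here is the same trivial transformation spelled three different ways with plain JDK APIs (no third party library involved):

import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Stream;

// The same structural "map" operation under three different names:
Stream<Integer>            s = Stream.of(1).map(i -> i + 1);
Optional<Integer>          o = Optional.of(1).map(i -> i + 1);
CompletableFuture<Integer> f = CompletableFuture
    .completedFuture(1)
    .thenApply(i -> i + 1);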

The JDK Stream implementation performs well, is totally lazy, and is well designed for extension, but provides only a limited subset of potential operators (constrained, perhaps, by a focus on data parallelism). Into the void stepped libraries such as jOOλ with its sequential Stream extension (called Seq). Seq adds many additional streaming operators. jOOλ also adds many missing functional features, such as Tuples.
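For instance, a small sketch of what Seq and Tuples add on top of Stream, using jOOλ’s org.jooq.lambda.Seq and its Tuple2 type:

import java.util.List;
import org.jooq.lambda.Seq;
import org.jooq.lambda.tuple.Tuple2;

// zipWithIndex() pairs each element with its position as a Tuple2 –
// one of many operators that the JDK Stream API doesn't offer
List<Tuple2<String, Long>> indexed =
    Seq.of("a", "b", "c")
       .zipWithIndex()
       .toList();
// [(a, 0), (b, 1), (c, 2)]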

A core goal of cyclops-react, as well as adding original features such as FutureStreams, is to provide a mechanism for joining up both the JDK APIs and the third party functional libraries. There was a Cambrian explosion of cool libraries that emerged after the launch of Java 8, libraries like Javaslang & Project Reactor. cyclops-react does this in the first instance by extending the JDK, and by leveraging other libraries such as jOOλ, pCollections & Agrona. These libraries in turn also extend JDK interfaces where possible to add features such as persistent collections and wait-free many-producer single-consumer queues.

Beyond reusing and extending JDK interfaces, our aims were to make it easy for developers to integrate with external libraries by making use of third party standards such as the reactive-streams API, and by building our own abstractions where no set standard existed. The libraries we currently focus on integrating with are Google’s Guava, RxJava, Functional Java, Project Reactor and Javaslang. We’ve created abstractions for wrapping types like Stream, Optional & CompletableFuture – where no interface existed or was possible before. We chose these goals because we are using cyclops-react in production across a Microservices architecture, and being able to leverage the right technology for a problem and have it integrate smoothly with the rest of our code base is critical.

cyclops-react is quite a large, feature-rich project, and in addition has a number of integration modules. In the article below I’ll cover some of the available features, with a particular goal of showing how cyclops-react helps join up the dots across the JDK and into the brave new world of the pace-setting Java 8 open source community.

Extending the JDK

cyclops-react extends JDK APIs where possible. For example, ReactiveSeq, which adds functionality for error handling, asynchronous processing and much more, extends both JDK Stream and jOOλ’s Seq. The cyclops-react collection extensions, rather than creating new collection implementations, implement and extend the appropriate JDK interfaces. cyclops-react’s LazyFutureStream in turn extends ReactiveSeq, and allows aggregate operations over Streams of Futures as if they were simple Streams (this proves to be very useful for handling a large number of typical Java I/O operations asynchronously and performantly).
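As a minimal sketch of that extra error handling (assuming the cyclops-react 1.x package layout and its recover() operator):

import java.util.List;
import com.aol.cyclops.control.ReactiveSeq;

// Parse a stream of strings, recovering from bad input instead of
// letting the exception abort the whole pipeline
List<Integer> parsed = ReactiveSeq.of("1", "2", "oops")
                                  .map(Integer::parseInt)
                                  .recover(e -> -1)
                                  .toList();
// [1, 2, -1]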

ListX extends List, but adds operators that execute eagerly:

ListX<Integer> tenTimes = ListX.of(1,2,3,4)
                               .map(i -> i * 10);

cyclops-react adds lots of operators for users to explore. We can, for example, apply functions across multiple collections at the same time.
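Here is a minimal sketch of what that could look like (assuming ListX inherits a jOOλ-style zip(Iterable, BiFunction) overload; exact operator names may vary between versions):

// Combine two collections pairwise: 1+10, 2+20, ...
ListX<Integer> sums = ListX.of(1, 2, 3, 4)
                           .zip(ListX.of(10, 20, 30, 40),
                                (a, b) -> a + b);
// ListX[11, 22, 33, 44]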

The reactive-streams API acts as a natural bridge between producers (publishers) of data and consumers (subscribers). All cyclops-react data types implement the Publisher interface from reactive-streams, and Subscriber implementations that can convert to any cyclops-react type are also provided. This makes direct integration with other reactive-streams based libraries, such as Project Reactor, straightforward.

For example, we can lazily populate a Reactor Flux from any cyclops-react publisher, such as SortedSetX, or populate a cyclops-react type from a Reactor type.

Flux<Integer> stream = Flux.from(
  SortedSetX.of(1,2,3,4,5,6,7,8));
//Flux[1,2,3,4,5,6,7,8]

ListX<Character> list = ListX.fromPublisher(
  Flux.just("a","b","c"));

Reactor Flux and Mono types can work directly with cyclops-react For comprehensions (each supported library also has its own set of native For comprehension classes in its integration module).

// import static com.aol.cyclops.control.For.*;

Publishers.each2(
  Flux.just(1,2,3), 
  i -> ReactiveSeq.range(i,5), Tuple::tuple).printOut();
/*
(1, 1)
(1, 2)
(1, 3)
(1, 4)
(2, 2)
(2, 3)
(2, 4)
(3, 3)
(3, 4)
*/

A For comprehension is a way of managing nested iteration over types with flatMap and map methods, by cascading calls to the appropriate methods. In cyclops-react, nested statements can access the elements of the previous statements, so For comprehensions can be a very useful way of managing the behavior of existing APIs. For example, take the existing methods findId and loadData, which may return null values and will throw NPEs if provided with a null parameter. We can make use of a For comprehension that safely executes loadData only when an Optional with a value is returned from findId():

List<Data> data = 
For.optional(findId())
   .optional(this::loadData);
//loadData is only called if findId() returns a value

Similarly, a type such as Try could be used to handle exceptional results from either findId or loadData; Futures can be used to execute chained methods asynchronously, and so on.

Building cross-library abstractions

Java 8 introduced Monads to Java (Stream, Optional, CompletableFuture), but didn’t provide a common interface that would help reuse; in fact, the method names used in CompletableFuture differ significantly from those used in Optional & Stream for the same function. So map became thenApply and flatMap became thenCompose. Across the Java 8 world, monads are becoming an increasingly common pattern, but there is often no way to abstract across them. In cyclops-react, rather than attempt to define an interface to represent monads, we built a set of wrapper interfaces and a number of custom adapters to adapt different instances from across the main functional-style libraries for Java 8 to those wrappers. The wrappers extend AnyM (short for Any Monad), and there are two sub-interfaces: AnyMValue, which represents any monadic type that resolves to a single value (like Optional or CompletableFuture), and AnyMSeq, which ultimately resolves to a sequence of values (like a Stream or List). The cyclops extension wrappers provide a mechanism to wrap the types from RxJava, Guava, Reactor, FunctionalJava and Javaslang.

//We can wrap any type from Reactor, RxJava,
//FunctionalJava, Javaslang, Guava
AnyMSeq<Integer> wrapped = 
  FJ.list(List.list(1,2,3,4,5));

//And manipulate it
AnyMSeq<Integer> timesTen = wrapped.map(i -> i * 10);

cyclops-react provides a common set of interfaces that these wrappers (and other cyclops-react types) inherit from, allowing developers to write more generic reusable code. AnyM extends reactive-streams publishers, meaning you can make any Javaslang, Guava, FunctionalJava or RxJava type a reactive-streams publisher with cyclops-react.

AnyMSeq<Integer> wrapped = 
  Javaslang.traversable(List.of(1,2,3,4,5));

//The wrapped type is a reactive-streams publisher
Flux<Integer> fromJavaslang = Flux.from(wrapped);

wrapped.forEachWithError(
  System.out::println,
  System.out::err);

Furthermore, the reactive functionality from cyclops-react is provided directly on the AnyM types. This means we can, for example, schedule data emission from a Javaslang or FunctionalJava Stream – or execute a reduce operation lazily, or asynchronously.


AnyMSeq<Integer> wrapped = 
  Javaslang.traversable(Stream.of(1,2,3,4,5));

CompletableFuture<Integer> asyncResult = 
  wrapped.futureOperations(Executors.newFixedThreadPool(1))
         .reduce(50, (acc, next) -> acc + next);
//CompletableFuture[1550]

AnyMSeq<Integer> wrapped = 
  FJ.list(List.list(1,2,3,4,5));

Eval<Integer> lazyResult = 
  wrapped.map(i -> i * 10)
         .lazyOperations()
         .reduce(50, (acc, next) -> acc + next);
//Eval[15500]

HotStream<Integer> emitting = wrapped.schedule(
  "0 * * * * ?", 
  Executors.newScheduledThreadPool(1));

emitting.connect()
        .debounce(1, TimeUnit.DAYS)
        .forEachWithError(
           this::logSuccess,
           this::logFailure);

There’s a lot to explore, both in cyclops-react and in the new, broader Java 8 ecosystem. Hopefully you’ll have a fun adventure playing with, learning from and extending the Java 8 boundaries yourself!

JSR-308 and the Checker Framework Add Even More Typesafety to jOOQ 3.9

Java 8 introduced JSR-308, which added new annotation capabilities to the Java language. Most importantly: Type annotations. It is now possible to design monsters like the one below:

The code displayed in that tweet really compiles. Every type can be annotated now, in order to enhance the type system in any custom way. Why, you may ask? One of the main driving use-cases for this language enhancement is the Checker Framework, an Open Source library that allows you to easily implement arbitrary compiler plugins for sophisticated type checking. The most boring and trivial example would be nullability. Consider the following code:

import org.checkerframework.checker.nullness.qual.Nullable;

class YourClassNameHere {
    void foo(Object nn, @Nullable Object nbl) {
        nn.toString(); // OK
        nbl.toString(); // Fail
        if (nbl != null)
            nbl.toString(); // OK again
    }
}

The above example can be run directly in the Checker Framework live demo console. Compiling the above code with the following annotation processor:

javac -processor org.checkerframework.checker.nullness.NullnessChecker afile.java

Yields:

Error: [dereference.of.nullable] dereference of possibly-null reference nbl:5:9

That’s pretty awesome! It works in quite a similar way to the flow-sensitive typing that is implemented in Ceylon or Kotlin, for instance, except that it is much more verbose. But it is also much, much more powerful, because the rules that implement your enhanced and annotated Java type system can be implemented directly in Java using annotation processors! Which makes annotations Turing complete, in a way 😉


How does this help jOOQ?

jOOQ has shipped with two types of API documentation annotations for quite a while. Those annotations are:

  • @PlainSQL – To indicate that a DSL method accepts a “plain SQL” string which may introduce SQL injection risks
  • @Support – To indicate that a DSL method works either natively with, or can be emulated for a given set of SQLDialect

An example of such a method is the CONNECT BY clause, which is supported by Cubrid, Informix, and Oracle, and which is overloaded to also accept a “plain SQL” predicate, for convenience:

@Support({ CUBRID, INFORMIX, ORACLE })
@PlainSQL
SelectConnectByConditionStep<R> connectBy(String sql);

Thus far, these annotations were there only for documentation purposes. With jOOQ 3.9, not anymore. We’re now introducing two new annotations to the jOOQ API:

  • org.jooq.Allow – to allow for a set of dialects (or for the @PlainSQL annotation) to be used within a given scope
  • org.jooq.Require – to require for a set of dialects to be supported via the @Support annotation within a given scope

This is best explained by example. Let’s look at @PlainSQL first.

Restricting access to @PlainSQL

One of the biggest advantages of using the jOOQ API is that SQL injection is pretty much a thing of the past. With jOOQ being an internal domain-specific language, users really define the SQL expression tree directly in their Java code, rather than a stringified version of the statement as with JDBC. Because the expression tree is compiled in Java, there’s no possibility of injecting any unwanted or unforeseen expressions via user input.
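For example, with the regular jOOQ API, user input always ends up as a bind variable. (ACTOR below stands for any jOOQ generated table; the table and column names are illustrative.)

DSL.using(configuration)
   .select(ACTOR.FIRST_NAME, ACTOR.LAST_NAME)
   .from(ACTOR)
   .where(ACTOR.LAST_NAME.eq(userInput)) // rendered as LAST_NAME = ?
   .fetch();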

There is one exception though. jOOQ doesn’t support every SQL feature in every database. This is why jOOQ ships with a rich “plain SQL” API where custom SQL strings can be embedded anywhere in the SQL expression tree. For instance, the above CONNECT BY clause:

DSL.using(configuration)
   .select(level())
   .connectBy("level < ?", bindValue)
   .fetch();

The above jOOQ query translates to the following SQL query:

SELECT level
FROM dual
CONNECT BY level < ?

As you can see, it is perfectly possible to “do it wrong” and create a SQL injection risk, just like in JDBC:

DSL.using(configuration)
   .select(level())
   .connectBy("level < " + bindValue)
   .fetch();

The difference is very subtle. With jOOQ 3.9 and the Checker Framework, it is now possible to specify the following Maven compiler configuration:

<plugin>
    <artifactId>maven-compiler-plugin</artifactId>
    <version>3.3</version>
    <configuration>
        <source>1.8</source>
        <target>1.8</target>
        <fork>true</fork>
        <annotationProcessors>
            <annotationProcessor>org.jooq.checker.PlainSQLChecker</annotationProcessor>
        </annotationProcessors>
        <compilerArgs>
            <arg>-Xbootclasspath/p:1.8</arg>
        </compilerArgs>
    </configuration>
</plugin>

The org.jooq.checker.PlainSQLChecker will ensure that no client code using API annotated with @PlainSQL will compile. The error message we’re getting is something like:

C:\Users\lukas\workspace\jOOQ\jOOQ-examples\jOOQ-checker-framework-example\src\main\java\org\jooq\example\checker\PlainSQLCheckerTests.java:[17,17] error: [Plain SQL usage not allowed at current scope. Use @Allow.PlainSQL.]

If you know-what-you’re-doing™ and you absolutely must use jOOQ’s @PlainSQL API at a very specific location (scope), you can annotate that location (scope) with @Allow.PlainSQL and the code compiles just fine again:

// Scope: Single method.
@Allow.PlainSQL
public List<Integer> iKnowWhatImDoing() {
    return DSL.using(configuration)
              .select(level())
              .connectBy("level < ?", bindValue)
              .fetch(0, int.class);
}

Or even:

// Scope: Entire class.
@Allow.PlainSQL
public class IKnowWhatImDoing {
    public List<Integer> iKnowWhatImDoing() {
        return DSL.using(configuration)
                  .select(level())
                  .connectBy("level < ?", bindValue)
                  .fetch(0, int.class);
    }
}

Or even (but then you might just turn off the checker):

// Scope: entire package (put in package-info.java)
@Allow.PlainSQL
package org.jooq.example.checker;

The benefits are clear, though. If security is very important to you (and it should be), then just enable the org.jooq.checker.PlainSQLChecker on each developer build, or at least in CI builds, and get compilation errors whenever “accidental” @PlainSQL API usage is encountered.

Restricting access to SQLDialect

Now, much more interesting for most users is the ability to check whether jOOQ API that is used in client code really supports your database. For instance, the above CONNECT BY clause is supported only in Oracle (if we ignore the not so popular Cubrid and Informix databases). Let’s assume you do work with Oracle only. You want to make sure that all jOOQ API that you’re using is Oracle-compatible. You can now put the following annotation to all packages that use the jOOQ API:

// Scope: entire package (put in package-info.java)
@Allow(ORACLE)
package org.jooq.example.checker;

Now, simply activate the org.jooq.checker.SQLDialectChecker to type check your code for @Allow compliance and you’re done:

<plugin>
    <artifactId>maven-compiler-plugin</artifactId>
    <version>3.3</version>
    <configuration>
        <source>1.8</source>
        <target>1.8</target>
        <fork>true</fork>
        <annotationProcessors>
            <annotationProcessor>org.jooq.checker.SQLDialectChecker</annotationProcessor>
        </annotationProcessors>
        <compilerArgs>
            <arg>-Xbootclasspath/p:1.8</arg>
        </compilerArgs>
    </configuration>
</plugin>

From now on, whenever you use any jOOQ API, the above checker will verify that any of the following three yields true:

  • The jOOQ API being used is not annotated with @Support
  • The jOOQ API being used is annotated with @Support, but without any explicit SQLDialect (i.e. “works on all databases”), such as DSLContext.select()
  • The jOOQ API being used is annotated with @Support and with at least one of the SQLDialects referenced from @Allow

Thus, within a package annotated as such…

// Scope: entire package (put in package-info.java)
@Allow(ORACLE)
package org.jooq.example.checker;

… using a method annotated as such is fine:

@Support({ CUBRID, INFORMIX, ORACLE })
@PlainSQL
SelectConnectByConditionStep<R> connectBy(String sql);

… but using a method annotated as such is not:

@Support({ MARIADB, MYSQL, POSTGRES })
SelectOptionStep<R> forShare();

In order to allow for this method to be used, client code could, for instance, allow the MYSQL dialect in addition to the ORACLE dialect:

// Scope: entire package (put in package-info.java)
@Allow({ MYSQL, ORACLE })
package org.jooq.example.checker;

From now on, all code in this package may refer to methods supporting MySQL and/or Oracle.

The @Allow annotation helps give access to API on a global level. Multiple @Allow annotations (of potentially different scope) create a disjunction of allowed dialects, as illustrated here:

// Scope: class
@Allow(MYSQL)
class MySQLAllowed {

    @Allow(ORACLE)
    void mySQLAndOracleAllowed() {
        DSL.using(configuration)
           .select()

           // Works, because Oracle is allowed
           .connectBy("...")

           // Works, because MySQL is allowed
           .forShare();
    }
}

As can be seen above, allowing for two dialects disjunctively won’t ensure that a given statement will work on either of the databases. So…

What if I want both databases to be supported?

In this case, we’ll resort to using the new @Require annotation. Multiple @Require annotations (of potentially different scope) create a conjunction of required dialects as illustrated here:

// Scope: class
@Allow
@Require({ MYSQL, ORACLE })
class MySQLAndOracleRequired {

    @Require(ORACLE)
    void onlyOracleRequired() {
        DSL.using(configuration)
           .select()

           // Works, because only Oracle is required
           .connectBy("...")

           // Doesn't work because Oracle is required
           .forShare();
    }
}

How to put this in use

Let’s assume your application only needs to work with Oracle. You can now put the following annotation on your package, and you will be prevented from using any MySQL-only API, for instance, because MySQL is not allowed as a dialect in your code:

@Allow(ORACLE)
package org.jooq.example.checker;

Now, as requirements change, you want to start supporting MySQL as well from your application. Just change the package specification to the following and start fixing all compilation errors in your jOOQ usage.

// Both dialects are allowed, no others are
@Allow({ MYSQL, ORACLE })

// Both dialects are also required on each clause
@Require({ MYSQL, ORACLE })
package org.jooq.example.checker;

Defaults

By default, for any scope, the following rules are assumed by the org.jooq.checker.SQLDialectChecker (see the sketch after this list):

  • Nothing is allowed. Each @Allow annotation adds to the set of allowed dialects.
  • Everything is required. Each @Require annotation removes from the set of required dialects.
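As a sketch of the first default: in a scope without any @Allow annotation at all, the allowed set is empty, so even an Oracle-only call such as the one below should be rejected by the checker (configuration and bindValue as in the earlier examples):

// No @Allow anywhere in scope: nothing is allowed
public class NothingAllowedHere {
    List<Integer> doesNotCompile() {
        return DSL.using(configuration)
                  .select(level())
                  // Error: ORACLE is not in the (empty) set of allowed dialects
                  .connectBy("level < ?", bindValue)
                  .fetch(0, int.class);
    }
}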

See it in action

These features will be an integral part of jOOQ 3.9. They’re available simply by adding the following dependency:

<dependency>
    <!-- Use org.jooq            for the Open Source edition
             org.jooq.pro        for commercial editions, 
             org.jooq.pro-java-6 for commercial editions with Java 6 support,
             org.jooq.trial      for the free trial edition -->
	
    <groupId>org.jooq</groupId>
    <artifactId>jooq-checker</artifactId>
    <version>${org.jooq.version}</version>
</dependency>

… and then adding the appropriate annotation processors to your compiler plugin.

Cannot wait until jOOQ 3.9? You don’t have to. Just check out the 3.9.0-SNAPSHOT version from GitHub and follow the example project given here:

https://github.com/jOOQ/jOOQ/tree/master/jOOQ-examples/jOOQ-checker-framework-example

Done! From now on, when using jOOQ, you can be sure that whatever code you write will work on all the databases that you plan to support!

I think that this year’s Annotatiomaniac Champion title should go to the makers of the Checker Framework.


Further reading about the Checker Framework: