jOOQ 3.12 Released With a new Procedural Language API

jOOQ 3.12 has been released with a new procedural language API, new data types, MemSQL support, formal Java 11+ support, a much better parser, and reactive stream API support.

In this release, we’ve focused on a lot of minor infrastructure tasks, greatly
improving the overall quality of jOOQ. We’ve reworked some of our automated
integration tests, which has helped us fix a large number of previously
undiscovered issues and has given us much better coverage of our 26 supported
RDBMS dialects.

We’re excited about these internal changes, as they will help us implement a lot
of features that have been requested by many for a long time, including an
immutable query object model, with all the secondary benefits like caching of
generated SQL, and much more powerful dynamic SQL construction and
transformation in the future.

Major new features include the new procedural language API shipped with our
commercial distributions, new data types including native JSON support, MemSQL
support, formal Java 11+ support, a much better parser, and reactive stream API
support.

Procedural languages

Following up on jOOQ 3.11’s support for anonymous blocks, the jOOQ 3.12
Professional and Enterprise Editions now include support for a variety of
procedural language features, including

  • Variable declarations
  • Variable assignments
  • Loops (WHILE, REPEAT, FOR, LOOP)
  • If Then Else
  • Labels
  • Exit, Continue, Goto
  • Execute

This feature set is part of our ongoing efforts to continue supporting more
advanced vendor specific functionality, including our planned definition and
translation of stored procedures, triggers, and other, ad-hoc procedural logic
that helps move data processing logic into the database server.

New Databases Supported

The jOOQ Professional Edition now supports the MemSQL dialect. MemSQL is derived
from MySQL, although our integration tests have shown that there are numerous
differences, such that supporting MemSQL formally will add a lot of value to our
customers by providing much increased syntactic correctness.

Reactive streams

The reactive programming model is gaining traction in some environments as new,
useful streaming APIs such as Reactor emerge. These APIs have agreed to work
with a common SPI: reactive streams and, since JDK 9, the new
java.util.concurrent.Flow SPI. jOOQ 3.12 now implements these paradigms on an
API level, such that an integration with APIs like Reactor becomes much
easier. The implementation still binds to JDBC, and is thus blocking. Future
versions of jOOQ will abstract over JDBC to allow for running queries against
ADBA (by Oracle) or R2DBC (by Spring).
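
To illustrate what "implementing the SPI while still blocking underneath" can look like, here is a minimal sketch that uses only the JDK 9+ java.util.concurrent.Flow types. The class IterablePublisher and its collect() helper are hypothetical names made up for this example and are not part of the jOOQ API. An in-memory list stands in for a blocking JDBC result.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Flow;

// Hypothetical illustration, not jOOQ API: a synchronous Flow.Publisher
// over an Iterable, which stands in for a blocking JDBC result.
class IterablePublisher<T> implements Flow.Publisher<T> {
    private final Iterable<T> source;

    IterablePublisher(Iterable<T> source) {
        this.source = source;
    }

    @Override
    public void subscribe(Flow.Subscriber<? super T> subscriber) {
        Iterator<T> it = source.iterator();
        subscriber.onSubscribe(new Flow.Subscription() {
            private boolean done;
            private boolean emitting;
            private long demand;

            @Override
            public void request(long n) {
                if (done) return;
                if (n <= 0) {
                    done = true;
                    subscriber.onError(new IllegalArgumentException("n must be positive"));
                    return;
                }
                demand += n;
                // A re-entrant call from onNext() only adds demand;
                // the loop that is already running picks it up.
                if (emitting) return;
                emitting = true;
                try {
                    while (!done && demand > 0 && it.hasNext()) {
                        demand--;
                        subscriber.onNext(it.next());
                    }
                    if (!done && !it.hasNext()) {
                        done = true;
                        subscriber.onComplete();
                    }
                } finally {
                    emitting = false;
                }
            }

            @Override
            public void cancel() {
                done = true;
            }
        });
    }

    // Subscribes with a demand of one item at a time and collects all items.
    static <T> List<T> collect(Flow.Publisher<T> publisher) {
        List<T> result = new ArrayList<>();
        publisher.subscribe(new Flow.Subscriber<T>() {
            private Flow.Subscription subscription;
            @Override public void onSubscribe(Flow.Subscription s) { subscription = s; s.request(1); }
            @Override public void onNext(T item) { result.add(item); subscription.request(1); }
            @Override public void onError(Throwable t) { throw new IllegalStateException(t); }
            @Override public void onComplete() { }
        });
        return result;
    }

    public static void main(String[] args) {
        System.out.println(collect(new IterablePublisher<String>(List.of("a", "b", "c")))); // prints [a, b, c]
    }
}
```

Everything here runs on the subscribing thread, which is what blocking means in this context: the subscriber controls demand through request(), but no work is moved off the calling thread.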

New data types

We’ve introduced native support for a few new data types, which are often very
useful in specific situations. These include:

  • JSON / JSONB: A native string wrapper for textual and binary JSON data. While
    users will still want to bind more specific JSON to maps and lists using
    custom data type Bindings, in a lot of cases, being able to just serialise and
    deserialise JSON content as strings is sufficient. jOOQ now provides out of
    the box support for this approach for various SQL dialects.
  • INSTANT: RDBMS do not agree on the meaning of the SQL standard TIMESTAMP WITH
    TIME ZONE. PostgreSQL, for example, interprets it as a unix timestamp, just
    like java.time.Instant. For an optimal PostgreSQL experience, this new INSTANT
    type will be much more useful than the standard JDBC java.time.OffsetDateTime
    binding.
  • ROWID: Most RDBMS have a native ROWID / OID / CTID / physloc identity
    value that physically identifies a row on the underlying storage system,
    irrespective of any logical primary key. These ROWIDs can be leveraged to run
    more performant, vendor specific queries. Supporting this type allows for
    easily using this feature in arbitrary queries.
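
The difference between the two Java types mentioned for INSTANT can be shown with plain java.time code. This is a small sketch under the assumption that the database keeps only the point in time and not the offset it was written with. The class and method names are made up for the example.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

class InstantVersusOffsetDateTime {
    // Returns true if both values denote the same point on the time line,
    // regardless of the offset each one carries.
    static boolean sameInstant(OffsetDateTime a, OffsetDateTime b) {
        return a.toInstant().equals(b.toInstant());
    }

    public static void main(String[] args) {
        OffsetDateTime utc = OffsetDateTime.of(2019, 8, 29, 10, 0, 0, 0, ZoneOffset.UTC);
        OffsetDateTime plus2 = OffsetDateTime.of(2019, 8, 29, 12, 0, 0, 0, ZoneOffset.ofHours(2));

        System.out.println(utc.equals(plus2));       // false: the offsets differ
        System.out.println(sameInstant(utc, plus2)); // true: same point in time
        System.out.println(utc.toInstant());         // 2019-08-29T10:00:00Z
    }
}
```

An OffsetDateTime carries an offset that such a database column does not store, so two values that are equal in the database may not be equal in Java. An Instant has no such extra component.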

Parser

Our parser is seeing a lot of continued improvements over the releases as we
gather feedback from our users. Our main drivers for feedback are:

  • The DDLDatabase which allows for generating code from DDL scripts rather than
    live JDBC connections to your database
  • The https://www.jooq.org/translate website, which translates any kind of SQL
    between database dialects.

SQL dialect translation will evolve into an independent product in the future.
DDL parsing is already very powerful, and a lot of customers rely on it for
their production systems.

In the next versions, we will be able to simulate DDL on our own, without H2,
which will open up a variety of possible use cases, including better schema
management.

Specific jOOQ 3.12 parser improvements include:

  • Being able to access schema meta information (column types, constraints) to
    better emulate SQL features / translate SQL syntax between dialects
  • A parse search path, similar to PostgreSQL’s search_path, or other dialects’
    current_schema, allowing support for unqualified object references.
  • The DDL simulation from the DDLDatabase is now moved into the core library,
    supporting it also out of the box as a DDL script based meta data source
  • A new special comment syntax that helps ignoring SQL fragments in the jOOQ
    parser only, while executing it in your ordinary SQL execution.
  • A new interactive mode in the ParserCLI
  • Support for nested block comments
  • And much more
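
The idea behind a parse search path can be sketched in a few lines of plain Java. This is only an illustration of the lookup rule (the first schema on the path that contains the object wins), not the jOOQ parser's implementation. The class name, method name, and sample schemas are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

class SearchPathSketch {
    // Resolves an unqualified table name to the first schema on the search
    // path that contains a table of that name.
    static Optional<String> resolve(
        String table,
        List<String> searchPath,
        Map<String, Set<String>> tablesBySchema
    ) {
        return searchPath.stream()
            .filter(schema -> tablesBySchema.getOrDefault(schema, Set.of()).contains(table))
            .findFirst()
            .map(schema -> schema + "." + table);
    }

    public static void main(String[] args) {
        Map<String, Set<String>> meta = Map.of(
            "public", Set.of("author"),
            "library", Set.of("author", "book")
        );
        List<String> path = List.of("public", "library");

        System.out.println(resolve("author", path, meta));  // Optional[public.author]
        System.out.println(resolve("book", path, meta));    // Optional[library.book]
        System.out.println(resolve("missing", path, meta)); // Optional.empty
    }
}
```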

Formal Java 11 Support

While we have been supporting Java 11 for a while through our integration tests,
jOOQ 3.12 now fully supports Java 11 to help improve the experience around the
transitive JAXB dependency, which we have now removed entirely from jOOQ.

The commercial editions ship with a Java 11+ supporting distribution, which
includes more optimal API usage, depending on new Java 9-11 APIs. All editions,
including the jOOQ Open Source Edition, have a Java 8+ distribution that
supports any Java version starting from Java 8.

Commercial Editions

Dual licensing is at the core of our business, helping us to provide continued
value to our customers.

In the past, the main distinction between the different jOOQ editions was the
number of database products each edition supported. In the future, we want to
provide even more value to our customers with commercial subscriptions. This is
why, starting from jOOQ 3.12, we are now offering some new, advanced features
only in our commercial distributions. Such features include:

  • The procedural language API, which is available with the jOOQ Professional
    and Enterprise Editions
  • While the jOOQ 3.12 Open Source Edition supports Java 8+, the jOOQ 3.12
    Professional Edition also ships with a Java 11+ distribution, leveraging some
    newer JDK APIs, and the jOOQ 3.12 Enterprise Edition continues supporting
    Java 6 and 7.
  • Since Java 8 still sees very substantial market adoption, compared to Java 11,
    we still support Java 8 in the jOOQ 3.12 Open Source Edition.
  • Starting from jOOQ 3.12, formal support for older RDBMS dialect versions in
    the runtime libraries is reserved to the jOOQ Professional and Enterprise
    Editions. The jOOQ Open Source Edition will ship with support for the latest
    version of an RDBMS dialect, only. The code generator is not affected by this
    change.

By offering more value to our paying customers, we believe that we can continue
our successful business model, which in turn allows us to keep offering the
jOOQ Open Source Edition for free. Our strategy is:

  • To implement new, advanced, commercial only features.
  • To offer legacy support (legacy Java versions, legacy database versions) to
    paying customers only.
  • To continue supporting a rich set of features to Open Source Edition users.

H2 and SQLite integration

Over the past year, both H2 and SQLite have seen a lot of improvements, which
jOOQ now supports as well. Specifically, H2 is moving at a very fast
pace, and our traditional close cooperation got even better as we’re helping
the H2 team with our insights into the SQL standards, while the H2 team is
helping us with our own implementations.

Other improvements

The complete list of changes can be found on our website:
https://www.jooq.org/notes

A few improvements are worth summarising here explicitly:

  • We’ve added support for a few new SQL predicates, such as the standard
    UNIQUE and SIMILAR TO predicates, as well as the synthetic, but very useful
    LIKE ANY predicate.
  • The JAXB implementation dependency has been removed and replaced by our own
    simplified implementation for a better Java 9+ experience.
  • The historic log4j (1.x) dependency has been removed. We’re now logging only
    via the optional slf4j dependency (which supports log4j bridges), or
    java.util.logging, if slf4j cannot be found on the classpath.
  • The shaded jOOR dependency has been upgraded to 0.9.12.
  • We’ve greatly improved our @Support annotation usage for better use with
    jOOQ-checker.
  • jOOQ-checker can now run with ErrorProne as well as with the checker framework
    as the latter still does not support Java 9+.
  • We’ve added support for a lot of new DDL statements and clauses.
  • There is now a synthetic PRODUCT() aggregate and window function.
  • We added support for the very useful GROUPS mode of window functions.
  • Formatting CSV, JSON, XML now supports nested formatting.
  • UPDATE / DELETE statements now support (and emulate) ORDER BY and LIMIT.
  • When constructing advanced code generation configuration, users had to resort
    to using programmatic configuration. It is now possible to use SQL statements
    to dynamically construct regular expression matching tables, columns, etc.
  • Configuration has a new UnwrapperProvider SPI.
  • MockFileDatabase can now handle regular expressions and update statements.
  • Settings can cleanly separate the configuration of name case and quotation.
  • MySQL DDL character sets are now supported, just like collations.
  • A new Table.where() API simplifies the construction of simple derived tables.
    This feature will be very useful in the future, for improved row level
    security support.
  • A nice BigQuery and H2 feature is the “* EXCEPT (…)” syntax, which allows
    for removing columns from an asterisked expression. We now have
    Asterisk.except() and QualifiedAsterisk.except().
  • A lot of improvements in date time arithmetic were added, including support
    for vendor specific DateParts, like WEEK.

Full release notes here: https://www.jooq.org/notes

How to Fetch All Current Identity Values in Oracle

Oracle 12c has introduced the useful SQL standard IDENTITY feature, which is essentially just syntax sugar for binding a sequence to a column default. We can use it like this:

create table t1 (col1 number generated always as identity);
create table t2 (col2 number generated always as identity);

insert into t1 values (default);
insert into t1 values (default);
insert into t1 values (default);
insert into t2 values (default);

select * from t1;
select * from t2;

Which produces

COL1
----
  1
  2
  3

COL2
----
  1

For unit testing against our database, we might want to know what “state” our identities are in. For each table, we would like to know the next value such an identity would produce. If we knew all the backing sequence names, we could query their seq.currval, but we don’t know those sequence names as they are generated.

However, we can query the dictionary views to get this information as follows:

select data_default
from user_tab_cols
where data_default is not null
and identity_column = 'YES'
and table_name in ('T1', 'T2');

An alternative is to query user_tab_identity_cols.

The user_tab_cols query above would produce:

"TEST"."ISEQ$$_116601".nextval
"TEST"."ISEQ$$_116603".nextval

Now, if we’re lazy, we could just run EXECUTE IMMEDIATE on each of those expressions and we’re done:

set serveroutput on
declare
  v_current number;
begin
  for rec in (
    select table_name, data_default
    from user_tab_cols
    where data_default is not null
    and identity_column = 'YES'
    and table_name in ('T1', 'T2')
  ) loop
    execute immediate replace(
      'select ' || rec.data_default || ' from dual', 
      '.nextval', 
      '.currval'
    ) into v_current;
    dbms_output.put_line(
      'Table : ' || rec.table_name || 
      ', currval : ' || v_current
    );
  end loop;
end;
/

This would produce:

Table : T1, currval : 3
Table : T2, currval : 1
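
If you build these statements from a Java client rather than in PL/SQL, the same string transformation can be sketched as follows. The class and method names are hypothetical, and the sketch only builds the SQL text without executing anything.

```java
class IdentityCurrval {
    // Turns a DATA_DEFAULT expression such as "TEST"."ISEQ$$_116601".nextval
    // into a query that reads the sequence's current value instead.
    // String.replace() matches literally, so the dots and the "$$" in the
    // generated sequence name need no escaping.
    static String currvalQuery(String dataDefault) {
        return ("select " + dataDefault.trim() + " from dual")
            .replace(".nextval", ".currval");
    }

    public static void main(String[] args) {
        System.out.println(currvalQuery("\"TEST\".\"ISEQ$$_116601\".nextval"));
        // select "TEST"."ISEQ$$_116601".currval from dual
    }
}
```

Note that Oracle only defines currval in a session after nextval has been called in that session, so either approach assumes the identity has already been used by the session running the query.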

Alternatively, if you want this result to be a SQL result instead of DBMS_OUTPUT content, you could run this:

with
  function current_value(p_table_name varchar2) return number is
    v_current number;
  begin
    for rec in (
      select data_default
      from user_tab_cols
      where table_name = p_table_name
      and data_default is not null
      and identity_column = 'YES'
    )
    loop
      execute immediate replace(
        'select ' || rec.data_default || ' from dual', 
        '.nextval', 
        '.currval'
      ) into v_current;
      return v_current;
    end loop;
    
    return null;
  end;
select *
from (
  select table_name, current_value(table_name) current_value
  from user_tables
  where table_name in ('T1', 'T2')
)
where current_value is not null
order by table_name;
/

The alternative using user_tab_identity_cols would look like this:

with
  function current_value(p_table_name varchar2) return number is
    v_current number;
  begin
    for rec in (
      select sequence_name
      from user_tab_identity_cols
      where table_name = p_table_name
    )
    loop
      execute immediate 
        'select ' || rec.sequence_name || '.currval from dual'
      into v_current;
      return v_current;
    end loop;
     
    return null;
  end;
select *
from (
  select table_name, current_value(table_name) current_value
  from user_tables
)
where current_value is not null
order by table_name;
/

The result is now a nice SQL result set:

TABLE_NAME   CURRENT_VALUE
--------------------------
T1           3
T2           1

How to Use jOOQ’s Commercial Distributions with Spring Boot

Spring Boot is great to get started very quickly with what the Spring Boot authors have evaluated to be useful defaults. This can be a lot of help when you’re doing things for the first time, and have no way to copy paste working Maven pom.xml files from existing projects, for example.

When working with the jOOQ Open Source Edition, just go to https://start.spring.io, add the jOOQ dependency, and start working!

It is a bit different when you want to work with the commercial distributions of jOOQ, for two reasons:

  1. They are not on Maven Central, but in your own repository or artifactory, after you’ve installed the latest version from our website: https://www.jooq.org/download/versions
  2. They use a different Maven groupId, to make sure the different distributions can be easily distinguished.

The different groupIds for jOOQ distributions are:

  • org.jooq: the jOOQ Open Source Edition
  • org.jooq.trial: the jOOQ Trial Edition
  • org.jooq.pro: the jOOQ Express, Professional and Enterprise Editions (supporting the latest JDK versions)
  • org.jooq.pro-java-6: the jOOQ Express, Professional and Enterprise Editions (supporting Java 6+)
  • org.jooq.pro-java-8: the jOOQ Express, Professional and Enterprise Editions (supporting Java 8+, starting from jOOQ 3.12)

Spring Boot doesn’t know this, and doesn’t have to. All of these distributions are largely source and binary compatible, so you can switch editions in your application simply by replacing dependencies. A vanilla https://start.spring.io pom.xml configuration might look like this.

Notice: I’m leaving out spring-boot-starter-test, spring-boot-maven-plugin, and other things not essential for this blog post, please use https://start.spring.io to generate a more complete pom.xml stub!

<project>
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.1.6.RELEASE</version>
  </parent>
  <groupId>com.example</groupId>
  <artifactId>demo</artifactId>
  <version>0.0.1-SNAPSHOT</version>

  <dependencies>
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-jooq</artifactId>
    </dependency>
  </dependencies>
</project>

What dependencies are we getting from this?

mvn dependency:tree

We’re getting:

[INFO] --- maven-dependency-plugin:3.1.1:tree (default-cli) @ demo ---
[INFO] com.example:demo:jar:0.0.1-SNAPSHOT
[INFO] \- org.springframework.boot:spring-boot-starter-jooq:jar:2.1.6.RELEASE:compile
[INFO]    +- org.springframework.boot:spring-boot-starter-jdbc:jar:2.1.6.RELEASE:compile
[INFO]    |  +- org.springframework.boot:spring-boot-starter:jar:2.1.6.RELEASE:compile
[INFO]    |  |  +- org.springframework.boot:spring-boot:jar:2.1.6.RELEASE:compile
[INFO]    |  |  |  \- org.springframework:spring-context:jar:5.1.8.RELEASE:compile
[INFO]    |  |  |     +- org.springframework:spring-aop:jar:5.1.8.RELEASE:compile
[INFO]    |  |  |     \- org.springframework:spring-expression:jar:5.1.8.RELEASE:compile
[INFO]    |  |  +- org.springframework.boot:spring-boot-autoconfigure:jar:2.1.6.RELEASE:compile
[INFO]    |  |  +- org.springframework.boot:spring-boot-starter-logging:jar:2.1.6.RELEASE:compile
[INFO]    |  |  |  +- ch.qos.logback:logback-classic:jar:1.2.3:compile
[INFO]    |  |  |  |  \- ch.qos.logback:logback-core:jar:1.2.3:compile
[INFO]    |  |  |  +- org.apache.logging.log4j:log4j-to-slf4j:jar:2.11.2:compile
[INFO]    |  |  |  |  \- org.apache.logging.log4j:log4j-api:jar:2.11.2:compile
[INFO]    |  |  |  \- org.slf4j:jul-to-slf4j:jar:1.7.26:compile
[INFO]    |  |  +- javax.annotation:javax.annotation-api:jar:1.3.2:compile
[INFO]    |  |  \- org.yaml:snakeyaml:jar:1.23:runtime
[INFO]    |  +- com.zaxxer:HikariCP:jar:3.2.0:compile
[INFO]    |  |  \- org.slf4j:slf4j-api:jar:1.7.26:compile
[INFO]    |  \- org.springframework:spring-jdbc:jar:5.1.8.RELEASE:compile
[INFO]    +- org.springframework:spring-tx:jar:5.1.8.RELEASE:compile
[INFO]    |  +- org.springframework:spring-beans:jar:5.1.8.RELEASE:compile
[INFO]    |  \- org.springframework:spring-core:jar:5.1.8.RELEASE:compile
[INFO]    |     \- org.springframework:spring-jcl:jar:5.1.8.RELEASE:compile
[INFO]    \- org.jooq:jooq:jar:3.11.11:compile
[INFO]       \- javax.xml.bind:jaxb-api:jar:2.3.1:compile
[INFO]          \- javax.activation:javax.activation-api:jar:1.2.0:compile

When this blog post was written, 3.11.11 was the latest jOOQ Open Source Edition version. But perhaps, you want a newer version or an older version. You can override this easily by specifying the ${jooq.version} property in Maven:

<project>
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.1.6.RELEASE</version>
  </parent>
  <groupId>com.example</groupId>
  <artifactId>demo</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  
  <properties>
    <jooq.version>3.11.0</jooq.version>
  </properties>

  <dependencies>
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-jooq</artifactId>
    </dependency>
  </dependencies>
</project>

The dependency tree is now:

[INFO] --- maven-dependency-plugin:3.1.1:tree (default-cli) @ demo ---
[INFO] com.example:demo:jar:0.0.1-SNAPSHOT
[INFO] \- org.springframework.boot:spring-boot-starter-jooq:jar:2.1.6.RELEASE:compile
[INFO]    +- org.springframework.boot:spring-boot-starter-jdbc:jar:2.1.6.RELEASE:compile
[INFO]    |  +- org.springframework.boot:spring-boot-starter:jar:2.1.6.RELEASE:compile
[INFO]    |  |  +- org.springframework.boot:spring-boot:jar:2.1.6.RELEASE:compile
[INFO]    |  |  |  \- org.springframework:spring-context:jar:5.1.8.RELEASE:compile
[INFO]    |  |  |     +- org.springframework:spring-aop:jar:5.1.8.RELEASE:compile
[INFO]    |  |  |     \- org.springframework:spring-expression:jar:5.1.8.RELEASE:compile
[INFO]    |  |  +- org.springframework.boot:spring-boot-autoconfigure:jar:2.1.6.RELEASE:compile
[INFO]    |  |  +- org.springframework.boot:spring-boot-starter-logging:jar:2.1.6.RELEASE:compile
[INFO]    |  |  |  +- ch.qos.logback:logback-classic:jar:1.2.3:compile
[INFO]    |  |  |  |  \- ch.qos.logback:logback-core:jar:1.2.3:compile
[INFO]    |  |  |  +- org.apache.logging.log4j:log4j-to-slf4j:jar:2.11.2:compile
[INFO]    |  |  |  |  \- org.apache.logging.log4j:log4j-api:jar:2.11.2:compile
[INFO]    |  |  |  \- org.slf4j:jul-to-slf4j:jar:1.7.26:compile
[INFO]    |  |  +- javax.annotation:javax.annotation-api:jar:1.3.2:compile
[INFO]    |  |  \- org.yaml:snakeyaml:jar:1.23:runtime
[INFO]    |  +- com.zaxxer:HikariCP:jar:3.2.0:compile
[INFO]    |  |  \- org.slf4j:slf4j-api:jar:1.7.26:compile
[INFO]    |  \- org.springframework:spring-jdbc:jar:5.1.8.RELEASE:compile
[INFO]    +- org.springframework:spring-tx:jar:5.1.8.RELEASE:compile
[INFO]    |  +- org.springframework:spring-beans:jar:5.1.8.RELEASE:compile
[INFO]    |  \- org.springframework:spring-core:jar:5.1.8.RELEASE:compile
[INFO]    |     \- org.springframework:spring-jcl:jar:5.1.8.RELEASE:compile
[INFO]    \- org.jooq:jooq:jar:3.11.0:compile
[INFO]       \- javax.xml.bind:jaxb-api:jar:2.3.1:compile
[INFO]          \- javax.activation:javax.activation-api:jar:1.2.0:compile

But it’s still the jOOQ Open Source Edition. What if you want a commercial distribution, e.g. to try out jOOQ? One way is to explicitly exclude Spring Boot’s transitive jOOQ Open Source Edition dependency, and introduce your own explicit dependency. For example:

<project>
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.1.6.RELEASE</version>
  </parent>
  <groupId>com.example</groupId>
  <artifactId>demo</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  
  <properties>
    <jooq.version>3.11.11</jooq.version>
  </properties>

  <dependencies>
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-jooq</artifactId>
    
      <!-- Exclude the jOOQ Open Source Edition -->
      <exclusions>
        <exclusion>
          <groupId>org.jooq</groupId>
          <artifactId>jooq</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
  
    <!-- Include a commercial jOOQ distribution -->
    <dependency>
      <groupId>org.jooq.trial</groupId>
      <artifactId>jooq</artifactId>
      <version>${jooq.version}</version>
    </dependency>
  </dependencies>
</project>

The new dependency tree is now:

[INFO] --- maven-dependency-plugin:3.1.1:tree (default-cli) @ demo ---
[INFO] com.example:demo:jar:0.0.1-SNAPSHOT
[INFO] +- org.springframework.boot:spring-boot-starter-jooq:jar:2.1.6.RELEASE:compile
[INFO] |  +- org.springframework.boot:spring-boot-starter-jdbc:jar:2.1.6.RELEASE:compile
[INFO] |  |  +- org.springframework.boot:spring-boot-starter:jar:2.1.6.RELEASE:compile
[INFO] |  |  |  +- org.springframework.boot:spring-boot:jar:2.1.6.RELEASE:compile
[INFO] |  |  |  |  \- org.springframework:spring-context:jar:5.1.8.RELEASE:compile
[INFO] |  |  |  |     +- org.springframework:spring-aop:jar:5.1.8.RELEASE:compile
[INFO] |  |  |  |     \- org.springframework:spring-expression:jar:5.1.8.RELEASE:compile
[INFO] |  |  |  +- org.springframework.boot:spring-boot-autoconfigure:jar:2.1.6.RELEASE:compile
[INFO] |  |  |  +- org.springframework.boot:spring-boot-starter-logging:jar:2.1.6.RELEASE:compile
[INFO] |  |  |  |  +- ch.qos.logback:logback-classic:jar:1.2.3:compile
[INFO] |  |  |  |  |  \- ch.qos.logback:logback-core:jar:1.2.3:compile
[INFO] |  |  |  |  +- org.apache.logging.log4j:log4j-to-slf4j:jar:2.11.2:compile
[INFO] |  |  |  |  |  \- org.apache.logging.log4j:log4j-api:jar:2.11.2:compile
[INFO] |  |  |  |  \- org.slf4j:jul-to-slf4j:jar:1.7.26:compile
[INFO] |  |  |  +- javax.annotation:javax.annotation-api:jar:1.3.2:compile
[INFO] |  |  |  \- org.yaml:snakeyaml:jar:1.23:runtime
[INFO] |  |  +- com.zaxxer:HikariCP:jar:3.2.0:compile
[INFO] |  |  |  \- org.slf4j:slf4j-api:jar:1.7.26:compile
[INFO] |  |  \- org.springframework:spring-jdbc:jar:5.1.8.RELEASE:compile
[INFO] |  \- org.springframework:spring-tx:jar:5.1.8.RELEASE:compile
[INFO] |     +- org.springframework:spring-beans:jar:5.1.8.RELEASE:compile
[INFO] |     \- org.springframework:spring-core:jar:5.1.8.RELEASE:compile
[INFO] |        \- org.springframework:spring-jcl:jar:5.1.8.RELEASE:compile
[INFO] \- org.jooq.trial:jooq:jar:3.11.11:compile
[INFO]    \- javax.xml.bind:jaxb-api:jar:2.3.1:compile
[INFO]       \- javax.activation:javax.activation-api:jar:1.2.0:compile

And you’re all set!

How to Write a Simple, yet Extensible API

How to write a simple API is already an art on its own.

I didn’t have time to write a short letter, so I wrote a long one instead.

― Mark Twain

But keeping an API simple for beginners and most users, and making it extensible for power users seems even more of a challenge. But is it?

What does “extensible” mean?

Imagine an API like, oh say, jOOQ. In jOOQ, you can write SQL predicates like this:

ctx.select(T.A, T.B)
   .from(T)
   .where(T.C.eq(1)) // Predicate with bind value here
   .fetch();

By default (as this should always be the default), jOOQ will generate and execute this SQL statement on your JDBC driver, using a bind variable:

SELECT t.a, t.b
FROM t
WHERE t.c = ?

The API made the most common use case simple. Just pass your bind variable as if the statement was written in e.g. PL/SQL, and let the language / API do the rest. So we passed that test.

The use case for power users is to occasionally not use bind variables, for whatever reasons (e.g. skew in data and bad statistics, see also this post about bind variables). Will we pass that test as well?

jOOQ mainly offers two ways to fix this:

On a per-query basis

You can turn your variable into an inline value explicitly for this single occasion:

ctx.select(T.A, T.B)
   .from(T)
   .where(T.C.eq(inline(1))) // Predicate without bind value here
   .fetch();

This is using the static imported DSL.inline() method. Works, but not very convenient, if you have to do this for several queries, for several bind values, or worse, depending on some context.

This is a necessary API enhancement, but it does not make the API extensible.

On a global basis

Notice that ctx object there? It is the DSLContext object, the “contextual DSL”, i.e. the DSL API that is in the context of a jOOQ Configuration. You can thus set:

DSLContext ctx2 = DSL.using(ctx
    .configuration()
    .derive()
    .set(new Settings()
        .withStatementType(StatementType.STATIC_STATEMENT)));

// And now use this new DSLContext instead of the old one
ctx2.select(T.A, T.B)
    .from(T)
    .where(T.C.eq(1)) // No longer a bind variable
    .fetch();

Different approaches to offering such extensibility

We have our clean and simple API. Now some user wants to extend it. So often, we’re tempted to resort to a hack, e.g. by using thread locals, because they work easily under the assumption of a thread-bound execution model, such as classic Java EE Servlets.

The price we’re paying for such a hack is high.

  1. It’s a hack, and as such it will break easily. If we offer this as functionality to a user, they will start depending on it, and we will have to support and maintain it
  2. It’s a hack, and it is based on assumptions, such as thread-boundness. It will not work in an async / reactive / parallel stream context, where our logic may jump back and forth between threads
  3. It’s a hack, and deep inside, we know it’s wrong. Obligatory XKCD: https://xkcd.com/292

This might obviously work, just like global (static) variables. You can set this variable globally (or “globally” for your own thread), and then the API’s internals will be able to read it. No need to pass around parameters, so no need to compromise on the API’s simplicity by adding optional and often ugly, distracting parameters.
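
The thread-boundness problem is easy to reproduce with the JDK alone. In this minimal sketch, STATEMENT_TYPE is a hypothetical thread-local "setting" (not anything jOOQ offers), and a single-thread executor stands in for whatever async or reactive machinery moves the work to another thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalPitfall {
    // Hypothetical "global" setting, visible only to the thread that set it.
    static final ThreadLocal<String> STATEMENT_TYPE = new ThreadLocal<>();

    // Reads the setting from a different thread, as an async or reactive
    // pipeline might do.
    static String readOnOtherThread() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Callable<String> read = STATEMENT_TYPE::get;
            return executor.submit(read).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        STATEMENT_TYPE.set("STATIC_STATEMENT");
        System.out.println(STATEMENT_TYPE.get());  // STATIC_STATEMENT
        System.out.println(readOnOtherThread());   // null
    }
}
```

The value is silently missing on the second thread. Nothing fails loudly, which is what makes this kind of hack so hard to support.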

What are better approaches to offering such extensibility?

Dependency Injection

One way is to use explicit Dependency Injection (DI). If you have a container like Spring, you can rely on Spring injecting arbitrary objects into your method call / whatever, where you need access to it.

This way, if you maintain several contextual objects of different lifecycle scopes, you can let the DI framework make appropriate decisions to figure out where to get that contextual information from. For example, when using JAX-RS, you can do this using an annotation based approach:


// These annotations bind the method to some HTTP address
@GET
@Produces("text/plain")
@Path("/api")
public String method(

    // This annotation fetches a request-scoped object
    // from the method call's context
    @Context HttpServletRequest request,

    // This annotation produces an argument from the
    // URL's query parameters
    @QueryParam("arg") String arg
) {
    ...
}

This approach works quite nicely for static environments (annotations being static), where you do not want to react to dynamic URLs or endpoints. It is declarative, and a bit magic, but well designed, so once you know all the options, you can choose the right one for your use case very easily.

While @QueryParam is mere convenience (you could have gotten the argument also from the HttpServletRequest), the @Context is powerful. It can help inject values of arbitrary lifecycle scope into your method / class / etc.

I personally favour explicit programming over annotation-based magic (e.g. using Guice for DI), but that’s probably a matter of taste. Both are a great way for implementors of APIs (e.g. HTTP APIs) to help get access to framework objects.

However, if you’re an API vendor, and want to give users of your API a way to extend the API, I personally favour jOOQ’s SPI approach.

SPIs

One of jOOQ’s strengths, IMO, is precisely this single, central place to register all SPI implementations that can be used for all sorts of purposes: The Configuration.

For example, on such a Configuration you can specify a JSR-310 java.time.Clock. This clock will be used by jOOQ’s internals to produce client side timestamps, instead of e.g. using System.currentTimeMillis(). Definitely a use case for power users only, but once you have this use case, you really only want to tweak a single place in jOOQ’s API: The Configuration.
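
The pattern can be sketched without jOOQ. ClockConfiguration below is a hypothetical stand-in for a configuration object and is not jOOQ's Configuration. It shows a simple default for most users and a single place to override for power users, e.g. in tests.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;

// Hypothetical stand-in for a configuration object; not jOOQ's Configuration.
class ClockConfiguration {
    private final Clock clock;

    // Simple default: most users never need to think about the clock.
    ClockConfiguration() {
        this(Clock.systemUTC());
    }

    // Power users can plug in their own clock at this single place.
    ClockConfiguration(Clock clock) {
        this.clock = clock;
    }

    // Internals ask the configuration for the time instead of calling
    // System.currentTimeMillis() directly.
    Instant now() {
        return clock.instant();
    }

    public static void main(String[] args) {
        Clock fixed = Clock.fixed(Instant.parse("2019-08-29T00:00:00Z"), ZoneOffset.UTC);
        System.out.println(new ClockConfiguration(fixed).now()); // 2019-08-29T00:00:00Z
    }
}
```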

All of jOOQ’s internals will always have a Configuration reference available. And it’s up to the user to decide what the scope of this object is, jOOQ doesn’t care. E.g.

  • per query
  • per thread
  • per request
  • per session
  • per application

In other words, to jOOQ, it doesn’t matter at all if you’re implementing a thread-bound, blocking, classic servlet model, or if you’re running your code reactively, or in parallel, or whatever. Just manage your own Configuration lifecycle, jOOQ doesn’t care.

In fact, you can have a global, singleton Configuration and implement thread bound components of it, e.g. the ConnectionProvider SPI, which takes care of managing the JDBC Connection lifecycle for jOOQ. Typically, users will use e.g. a Spring DataSource, which manages JDBC Connection (and transactions) using a thread-bound model, internally using ThreadLocal. jOOQ does not care. The SPI specifies that jOOQ will acquire a Connection before running a query, and release it again once the query has run.

Again, it does not matter to jOOQ what the specific ConnectionProvider implementation does. You can implement it in any way you want if you’re a power user. By default, you’ll just pass jOOQ a DataSource, and it will wrap it in a default implementation called DataSourceConnectionProvider for you.
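
jOOQ's ConnectionProvider exposes an acquire() / release() method pair. The following is a simplified, JDK-only analogue of that contract, not the real SPI: ResourceProvider, run() and RecordingProvider are hypothetical names, and a String stands in for a JDBC Connection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class ProviderSketch {
    // Hypothetical, simplified analogue of an acquire/release SPI.
    // R stands in for a JDBC Connection.
    interface ResourceProvider<R> {
        R acquire();
        void release(R resource);
    }

    // Library side: acquire before the work, release afterwards,
    // whatever happens in between.
    static <R, T> T run(ResourceProvider<R> provider, Function<R, T> work) {
        R resource = provider.acquire();
        try {
            return work.apply(resource);
        } finally {
            provider.release(resource);
        }
    }

    // User side: one possible implementation, which records the lifecycle.
    // A thread-bound or pooled implementation would fit the same contract.
    static class RecordingProvider implements ResourceProvider<String> {
        final List<String> events = new ArrayList<>();

        @Override
        public String acquire() {
            events.add("acquire");
            return "connection-1";
        }

        @Override
        public void release(String resource) {
            events.add("release " + resource);
        }
    }

    public static void main(String[] args) {
        RecordingProvider provider = new RecordingProvider();
        System.out.println(run(provider, c -> "ran query on " + c)); // ran query on connection-1
        System.out.println(provider.events); // [acquire, release connection-1]
    }
}
```

The library code only depends on the two-method contract, so it does not need to know whether the implementation is thread-bound, pooled, or something else entirely.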

The key here is again:

  • The API is simple by default, i.e. by default, you don’t have to know about this functionality, just pass jOOQ a DataSource as always when working with Java and SQL, and you’re ready to go
  • The SPI allows for easily extending the API without compromising on its simplicity, by providing a single, central access point to this kind of functionality

Other SPIs in Configuration include:

  • ExecuteListener: An extremely useful and simple way to hook into the entire jOOQ query management lifecycle, from generating the SQL string to preparing the JDBC statement, to binding variables, to execution, to fetching result sets. A single SPI can accommodate various use cases like SQL logging, patching SQL strings, patching JDBC statements, listening to result set events, etc.
  • ExecutorProvider: Whenever jOOQ runs something asynchronously, it will ask this SPI to provide a standard JDK Executor, which will be used to run the asynchronous code block. By default, this will be the JDK default (the default ForkJoinPool), as always. But you probably want to override this default, and you want to be in full control of this, and not think about it every single time you run a query.
  • MetaProvider: Whenever jOOQ needs to look up database meta information (schemas, tables, columns, types, etc.), it will ask this MetaProvider about the available meta information. By default, this will run queries on the JDBC DatabaseMetaData, which is good enough, but maybe you want to wire these calls to your jOOQ-generated classes, or something else.
  • RecordMapperProvider and RecordUnmapperProvider: jOOQ has a quite versatile default implementation of how to map between a jOOQ Record and an arbitrary Java class, supporting a variety of standard approaches including JavaBeans getter/setter naming conventions, JavaBeans @ConstructorProperties, and much more. These defaults apply e.g. when writing query.fetchInto(MyBean.class). But sometimes, the defaults are not good enough, and you want this particular mapping to work differently. Sure, you could write query.fetchInto(record -> mymapper(record)), but you may not want to remember this for every single query. Just override the mapper (and unmapper) at a single, central spot for your own chosen Configuration scope (e.g. per query, per request, per session, etc.) and you’re done
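
The ExecutorProvider idea from the list above can be sketched with the plain JDK. The SketchExecutorProvider interface and the AsyncSketch class below are hypothetical simplifications, not jOOQ types. They only show how a library can ask a central SPI for an Executor instead of choosing a thread pool itself.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;

// Hypothetical, simplified SPI; it only mimics the idea of an executor provider.
interface SketchExecutorProvider {
    Executor provide();
}

final class AsyncSketch {
    // The "library" side: it never picks a thread pool itself, it asks the SPI.
    // The returned future completes with the name of the executing thread.
    static CompletableFuture<String> runAsync(SketchExecutorProvider provider) {
        return CompletableFuture.supplyAsync(
            () -> Thread.currentThread().getName(), provider.provide());
    }

    public static void main(String[] args) {
        // Default behaviour: the JDK's common ForkJoinPool.
        System.out.println(runAsync(ForkJoinPool::commonPool).join());

        // Override: a dedicated, named single-thread executor.
        ExecutorService custom = Executors.newSingleThreadExecutor(
            r -> new Thread(r, "sketch-db-thread"));
        try {
            System.out.println(runAsync(() -> custom).join());
        } finally {
            custom.shutdown();
        }
    }
}
```

Once the provider is registered in one central place, no call site has to pass an executor explicitly.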

Conclusion

Writing a simple API is difficult. Making it extensible in a simple way, however, is not. If your API has achieved “simplicity”, then it is very easy to support injecting arbitrary SPIs for arbitrary purposes at a single, central location, such as jOOQ’s Configuration.

In my most recent talk “10 Reasons Why we Love Some APIs and Why we Hate Some Others”, I’ve made a point that things like simplicity, discoverability, consistency, and convenience are among the most important aspects of a great API. How do you define a good API? The most underrated answer on this (obviously closed) Stack Overflow question is this one:

Again, this is hard in terms of creating a simple API. But it is extremely easy when making this simple API extensible. Make your SPIs very easily discoverable. A jOOQ power user will always look for extension points in jOOQ’s Configuration. And because the extension points are explicit types which have to be implemented (as opposed to annotations and their magic), no documentation is needed to learn the SPI (of course it is still beneficial as a reference).

I’d love to hear your alternative approaches to this API design challenge in the comments.

Using IGNORE NULLS With SQL Window Functions to Fill Gaps

I found a very interesting SQL question on Twitter recently:

Rephrasing the question: We have a set of sparse data points:

+------------+-------+
| VALUE_DATE | VALUE |
+------------+-------+
| 2019-01-01 |   100 |
| 2019-01-02 |   120 |
| 2019-01-05 |   125 |
| 2019-01-06 |   128 |
| 2019-01-10 |   130 |
+------------+-------+

Since dates can be listed as discrete, continuous data points, why not fill in the gaps between 2019-01-02 and 2019-01-05 or 2019-01-06 and 2019-01-10? The desired output would be:

+------------+-------+
| VALUE_DATE | VALUE |
+------------+-------+
| 2019-01-01 |   100 |
| 2019-01-02 |   120 | <-+
| 2019-01-03 |   120 |   | -- Generated
| 2019-01-04 |   120 |   | -- Generated
| 2019-01-05 |   125 |
| 2019-01-06 |   128 | <-+
| 2019-01-07 |   128 |   | -- Generated
| 2019-01-08 |   128 |   | -- Generated
| 2019-01-09 |   128 |   | -- Generated
| 2019-01-10 |   130 |
+------------+-------+

In the generated rows, we’ll just repeat the most recent value.

How to do this with SQL?

For the sake of this example, I’m using Oracle SQL, as the OP was expecting to do this with Oracle. The idea is to do this in two steps:

  1. Generate all the dates between the first and the last data points
  2. For each date, find either the current data point, or the most recent one

But first, let’s create the data:

create table t (value_date, value) as
  select date '2019-01-01', 100 from dual union all
  select date '2019-01-02', 120 from dual union all
  select date '2019-01-05', 125 from dual union all
  select date '2019-01-06', 128 from dual union all
  select date '2019-01-10', 130 from dual;

1. Generating all the dates

In Oracle, we can use the convenient CONNECT BY syntax for this. We could also use some other tool to generate dates to fill the gaps, including SQL standard recursion using WITH, or some PIPELINED function, but I like CONNECT BY for this purpose.

We’ll write:

select (
  select min(t.value_date) 
  from t
) + level - 1 as value_date
from dual
connect by level <= (
  select max(t.value_date) - min(t.value_date) + 1
  from t
)

This produces:

VALUE_DATE|
----------|
2019-01-01|
2019-01-02|
2019-01-03|
2019-01-04|
2019-01-05|
2019-01-06|
2019-01-07|
2019-01-08|
2019-01-09|
2019-01-10|

Now we wrap the above query in a derived table and left join the actual data set:

select 
  d.value_date,
  t.value
from (
  select (
    select min(t.value_date) 
    from t
  ) + level - 1 as value_date
  from dual
  connect by level <= (
    select max(t.value_date) - min(t.value_date) + 1
    from t
  )
) d
left join t
on d.value_date = t.value_date
order by d.value_date;

The date gaps are now filled, but our values column is still sparse:

VALUE_DATE|VALUE|
----------|-----|
2019-01-01|  100|
2019-01-02|  120|
2019-01-03|     |
2019-01-04|     |
2019-01-05|  125|
2019-01-06|  128|
2019-01-07|     |
2019-01-08|     |
2019-01-09|     |
2019-01-10|  130|

2. Fill the value gaps

On each row, the VALUE column should either contain the actual value, or the “last_value” preceding the current row, ignoring all the nulls. Note that I deliberately phrased this requirement in precise English. We can now translate that sentence directly to SQL:

last_value (t.value) ignore nulls over (order by d.value_date)

Since we have added an ORDER BY clause to the window function, the default frame RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW applies, which colloquially means “all the preceding rows”. (Technically, that’s not accurate. It means all rows with values less than or equal to the value of the current row – see Kim Berg Hansen’s comment)

Convenient! We’re trying to find the last value in the window of all the preceding rows, ignoring the nulls.

This is standard SQL, but unfortunately not all RDBMS support IGNORE NULLS. Among the ones supported by jOOQ, these currently support the syntax:

  • DB2
  • H2
  • Informix
  • Oracle
  • Redshift
  • Sybase SQL Anywhere
  • Teradata

Sometimes, the standard feature is supported, but not with the exact standard syntax. Use https://www.jooq.org/translate to see different syntax variants.

The full query now reads:

select 
  d.value_date,
  last_value (t.value) ignore nulls over (order by d.value_date) as value
from (
  select (
    select min(t.value_date) 
    from t
  ) + level - 1 as value_date
  from dual
  connect by level <= (
    select max(t.value_date) - min(t.value_date) + 1
    from t
  )
) d
left join t
on d.value_date = t.value_date
order by d.value_date;

… and it yields the desired result:

VALUE_DATE         |VALUE|
-------------------|-----|
2019-01-01 00:00:00|  100|
2019-01-02 00:00:00|  120|
2019-01-03 00:00:00|  120|
2019-01-04 00:00:00|  120|
2019-01-05 00:00:00|  125|
2019-01-06 00:00:00|  128|
2019-01-07 00:00:00|  128|
2019-01-08 00:00:00|  128|
2019-01-09 00:00:00|  128|
2019-01-10 00:00:00|  130|
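
For readers who prefer to see the logic imperatively, here is a minimal plain-Java sketch of the same two steps. It is not how the database evaluates the query, and the GapFillSketch class and its fillGaps method are made up for this illustration. It generates every date between the first and the last data point, then carries the last non-null value forward.

```java
import java.time.LocalDate;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

final class GapFillSketch {
    // Mirrors the two SQL steps: (1) generate every date between the first
    // and last data point, (2) carry the last non-null value forward.
    static Map<LocalDate, Integer> fillGaps(TreeMap<LocalDate, Integer> sparse) {
        Map<LocalDate, Integer> dense = new LinkedHashMap<>();
        if (sparse.isEmpty())
            return dense;

        Integer lastValue = null;
        for (LocalDate d = sparse.firstKey();
             !d.isAfter(sparse.lastKey());
             d = d.plusDays(1)) {

            // Equivalent of the LEFT JOIN: null when there is no data point.
            Integer value = sparse.get(d);

            // Equivalent of LAST_VALUE(..) IGNORE NULLS: skip the nulls.
            if (value != null)
                lastValue = value;

            dense.put(d, lastValue);
        }
        return dense;
    }

    public static void main(String[] args) {
        // Same sparse data as in table T above.
        TreeMap<LocalDate, Integer> t = new TreeMap<>();
        t.put(LocalDate.parse("2019-01-01"), 100);
        t.put(LocalDate.parse("2019-01-02"), 120);
        t.put(LocalDate.parse("2019-01-05"), 125);
        t.put(LocalDate.parse("2019-01-06"), 128);
        t.put(LocalDate.parse("2019-01-10"), 130);

        fillGaps(t).forEach((d, v) -> System.out.println(d + " | " + v));
    }
}
```

The SQL version remains preferable when the data lives in the database, as it avoids transferring the sparse rows to the client first.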

Other RDBMS

This solution made use of some Oracle specific features such as CONNECT BY. In other RDBMS, the same idea can be implemented by using a different way of generating data. This article focuses only on using IGNORE NULLS. If you’re interested, feel free to post an alternative solution in the comments for your RDBMS.

Calling an Oracle Function with PL/SQL BOOLEAN Type from SQL

One of the most wanted features in the Oracle database is the BOOLEAN type. The SQL standard specified it a while ago, and RDBMS like PostgreSQL show how powerful it can be, e.g. when using the EVERY() aggregate function.

The PL/SQL language already has support for boolean types. We can write:

CREATE OR REPLACE FUNCTION number_to_boolean (i NUMBER) 
RETURN BOOLEAN 
IS
BEGIN
  RETURN NOT i = 0;
END number_to_boolean;
/

CREATE OR REPLACE FUNCTION boolean_to_number (b BOOLEAN) 
RETURN NUMBER 
IS
BEGIN
  RETURN CASE WHEN b THEN 1 WHEN NOT b THEN 0 END;
END boolean_to_number;
/

From PL/SQL, we can now easily call the above functions:

SET SERVEROUTPUT ON
BEGIN
  IF number_to_boolean(1) THEN
    dbms_output.put_line('1 is true');
  END IF;
  IF NOT number_to_boolean(0) THEN
    dbms_output.put_line('0 is false');
  END IF;
  IF number_to_boolean(NULL) IS NULL THEN
    dbms_output.put_line('null is null');
  END IF;
END;
/

The above prints

1 is true
0 is false
null is null

But we cannot do the same from the SQL engine:

SELECT 
  number_to_boolean(1), 
  number_to_boolean(0), 
  number_to_boolean(null) 
FROM dual;

This yields:

ORA-00902: invalid datatype

Eventually, Oracle will fix this by supporting boolean types in the SQL engine (show your love to Oracle here).

The WITH clause

Until then, we can make use of a nice workaround using new functionality from Oracle 12c. We can declare functions in the WITH clause! Run this:

WITH
  FUNCTION f RETURN NUMBER IS 
  BEGIN 
    RETURN 1; 
  END f;
SELECT f
FROM dual;

You’ll get

 F
---
 1

That’s wonderful, and what’s even better, this part of the WITH clause is written in PL/SQL, where we can use the BOOLEAN type again. So we can define bridge functions for each function call. Instead of this:

SELECT 
  number_to_boolean(1), 
  number_to_boolean(0), 
  number_to_boolean(null) 
FROM dual;

We can write this:

WITH
  FUNCTION number_to_boolean_(i NUMBER)
  RETURN NUMBER
  IS
    b BOOLEAN;
  BEGIN
    -- Actual function call
    b := number_to_boolean(i);
    
    -- Translation to numeric result
    RETURN CASE b WHEN TRUE THEN 1 WHEN FALSE THEN 0 END;
  END number_to_boolean_;
SELECT 
  number_to_boolean_(1) AS a, 
  number_to_boolean_(0) AS b, 
  number_to_boolean_(null) AS c
FROM dual;

This now yields:

 A   B   C
-------------
 1   0   null

Of course, we don’t get an actual boolean type back in the result set, as the SQL engine cannot process that. But if you’re calling this function from JDBC, 1/0/null can be translated transparently to true/false/null.
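
On the client side, that translation could look roughly like the following sketch. The BooleanBridgeSketch class is made up for this example. How the value is read from JDBC (e.g. ResultSet.getInt() followed by wasNull()) is left out; only the three-valued mapping itself is shown.

```java
final class BooleanBridgeSketch {
    // 1/0/null as returned by the bridge function -> true/false/null.
    static Boolean toBoolean(Integer i) {
        return i == null ? null : Boolean.valueOf(i != 0);
    }

    // The reverse direction, e.g. for passing an argument to a bridge function.
    static Integer toInteger(Boolean b) {
        return b == null ? null : Integer.valueOf(b ? 1 : 0);
    }

    public static void main(String[] args) {
        System.out.println(toBoolean(1));
        System.out.println(toBoolean(0));
        System.out.println(toBoolean(null));
        System.out.println(toInteger(Boolean.TRUE));
    }
}
```

Like the PL/SQL expression NOT i = 0, this treats any non-zero number as true and keeps null as null.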

It also works for chaining. Instead of the following, which still yields ORA-00902:

SELECT 
  boolean_to_number(number_to_boolean(1)), 
  boolean_to_number(number_to_boolean(0)), 
  boolean_to_number(number_to_boolean(null))
FROM dual;

We can write this:

WITH
  FUNCTION number_to_boolean_(i NUMBER)
  RETURN NUMBER
  IS
    b BOOLEAN;
  BEGIN
    -- Actual function call
    b := number_to_boolean(i);
    
    -- Translation to numeric result
    RETURN CASE b WHEN TRUE THEN 1 WHEN FALSE THEN 0 END;
  END number_to_boolean_;
  
  FUNCTION boolean_to_number_(b NUMBER)
  RETURN NUMBER
  IS
  BEGIN
    -- Actual function call
    RETURN boolean_to_number(NOT b = 0);
  END boolean_to_number_;
SELECT 
  boolean_to_number_(number_to_boolean_(1)) AS a, 
  boolean_to_number_(number_to_boolean_(0)) AS b, 
  boolean_to_number_(number_to_boolean_(null)) AS c
FROM dual;

… which again yields

 A   B   C
-------------
 1   0   null

And now, the 1/0/null integers are the actual desired result types.

This technique can be automated for any type of PL/SQL function that accepts and/or returns a PL/SQL BOOLEAN type, or even for functions that accept %ROWTYPE parameters, which we’ll work into jOOQ in the near future.

A more real-world example can be seen in this Stack Overflow question.

jOOQ 3.12 support

In jOOQ 3.12, we will add native support for using such functions in SQL through #8522. We have already supported PL/SQL boolean types in standalone procedure calls since jOOQ 3.8. With the next version, we can call a function like this one:

FUNCTION f_bool (i BOOLEAN) RETURN BOOLEAN;

From anywhere within a jOOQ statement, e.g.

Record1<Integer> r =
create()
    .select(one())
    .where(PlsObjects.fBool(false))
    .fetchOne();

assertNull(r);

When the above is called, the following SQL statement is generated by jOOQ 3.12, behind the scenes:

with
  function "F_BOOL_"(I integer)
  return integer
  is
    "r" boolean;
  begin
    "r" := "TEST"."PLS_OBJECTS"."F_BOOL"(not I = 0);
    return case when "r" then 1 when not "r" then 0 end;
  end "F_BOOL_";
  select 1
from dual
where (F_BOOL_(0) = 1)

Notice how the boolean function call can be used like a true boolean / predicate in the jOOQ code?

The Difference Between SQL’s JOIN .. ON Clause and the Where Clause

A question that frequently comes up among my SQL training’s participants is:

What’s the difference between putting a predicate in the JOIN .. ON clause and the WHERE clause?

I can definitely see how that’s confusing some people, as there seems to be no difference at first sight, when running queries like these, e.g. in Oracle. I’m using the Sakila database, as always:

-- First query
SELECT a.actor_id, a.first_name, a.last_name, count(fa.film_id)
FROM actor a
JOIN film_actor fa ON a.actor_id = fa.actor_id
WHERE fa.film_id < 10
GROUP BY a.actor_id, a.first_name, a.last_name
ORDER BY count(fa.film_id) DESC;

This will yield something like:

ACTOR_ID  FIRST_NAME  LAST_NAME  COUNT
--------------------------------------
108       WARREN      NOLTE      3
162       OPRAH       KILMER     3
19        BOB         FAWCETT    2
10        CHRISTIAN   GABLE      2
53        MENA        TEMPLE     2
137       MORGAN      WILLIAMS   1
2         NICK        WAHLBERG   1

Of course, we could have written this instead, and received the same result:

-- Second query
SELECT a.actor_id, a.first_name, a.last_name, count(fa.film_id)
FROM actor a
JOIN film_actor fa ON a.actor_id = fa.actor_id
  AND fa.film_id < 10
GROUP BY a.actor_id, a.first_name, a.last_name
ORDER BY count(fa.film_id) DESC;

Now, I’ve moved the FILM_ID < 10 filter from the WHERE clause to the ON clause. But the execution plan is the same for both queries:

---------------------------------------------------------
| Id  | Operation               | Name          | Rows  |
---------------------------------------------------------
|   0 | SELECT STATEMENT        |               |    49 |
|   1 |  SORT ORDER BY          |               |    49 |
|   2 |   HASH GROUP BY         |               |    49 |
|*  3 |    HASH JOIN            |               |    49 |
|*  4 |     INDEX FAST FULL SCAN| PK_FILM_ACTOR |    49 |
|   5 |     VIEW                | VW_GBF_7      |   200 |
|   6 |      TABLE ACCESS FULL  | ACTOR         |   200 |
---------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("ITEM_1"="FA"."ACTOR_ID")
   4 - filter("FA"."FILM_ID"<10)

It does not seem to matter at all. Both queries yield the same result as well as the same plan. So…

Are ON and WHERE really the same thing?

They are when you run an inner join. But they are not when you run an outer join.

And now, let’s compare these two queries here:

-- First query
SELECT a.actor_id, a.first_name, a.last_name, count(fa.film_id)
FROM actor a
LEFT JOIN film_actor fa ON a.actor_id = fa.actor_id
WHERE fa.film_id < 10
GROUP BY a.actor_id, a.first_name, a.last_name
ORDER BY count(fa.film_id) ASC;

Yielding

ACTOR_ID  FIRST_NAME  LAST_NAME  COUNT
--------------------------------------
194       MERYL       ALLEN      1
198       MARY        KEITEL     1
30        SANDRA      PECK       1
85        MINNIE      ZELLWEGER  1
123       JULIANNE    DENCH      1

Notice that with this syntax, we’re not getting any actors that have no films with FILM_ID < 10. We should get dozens! How about this:

-- Second query
SELECT a.actor_id, a.first_name, a.last_name, count(fa.film_id)
FROM actor a
LEFT JOIN film_actor fa ON a.actor_id = fa.actor_id
  AND fa.film_id < 10
GROUP BY a.actor_id, a.first_name, a.last_name
ORDER BY count(fa.film_id) ASC;

This used to produce the same result for an (INNER) JOIN, but given the LEFT JOIN, we’re now also getting all the actors that have no films with FILM_ID < 10, with a count of 0:

ACTOR_ID  FIRST_NAME  LAST_NAME     COUNT
-----------------------------------------
3         ED          CHASE         0
4         JENNIFER    DAVIS         0
5         JOHNNY      LOLLOBRIGIDA  0
6         BETTE       NICHOLSON     0
...
1         PENELOPE    GUINESS       1
200       THORA       TEMPLE        1
2         NICK        WAHLBERG      1
198       MARY        KEITEL        1

The plans are also different:

---------------------------------------------------------
| Id  | Operation               | Name          | Rows  |
---------------------------------------------------------
|   0 | SELECT STATEMENT        |               |    49 |
|   1 |  SORT ORDER BY          |               |    49 |
|   2 |   HASH GROUP BY         |               |    49 |
|*  3 |    HASH JOIN            |               |    49 |
|*  4 |     INDEX FAST FULL SCAN| PK_FILM_ACTOR |    49 |
|   5 |     VIEW                | VW_GBF_7      |   200 |
|   6 |      TABLE ACCESS FULL  | ACTOR         |   200 |
---------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("ITEM_1"="FA"."ACTOR_ID")
   4 - filter("FA"."FILM_ID"<10)

No outer join here! Versus

---------------------------------------------------------------
| Id  | Operation                     | Name          | Rows  |
---------------------------------------------------------------
|   0 | SELECT STATEMENT              |               |   200 |
|   1 |  SORT ORDER BY                |               |   200 |
|   2 |   MERGE JOIN OUTER            |               |   200 |
|   3 |    TABLE ACCESS BY INDEX ROWID| ACTOR         |   200 |
|   4 |     INDEX FULL SCAN           | PK_ACTOR      |   200 |
|*  5 |    SORT JOIN                  |               |    44 |
|   6 |     VIEW                      | VW_GBC_5      |    44 |
|   7 |      HASH GROUP BY            |               |    44 |
|*  8 |       INDEX FAST FULL SCAN    | PK_FILM_ACTOR |    49 |
---------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   5 - access("A"."ACTOR_ID"="ITEM_1"(+))
       filter("A"."ACTOR_ID"="ITEM_1"(+))
   8 - filter("FILM_ID"(+)<10)

The first query did not produce an outer join operation, the second one did!

What’s the difference?

The difference is:

  • An INNER JOIN produces all the actors who played in at least one film, filtering out the actors who did not play in a film. That’s the very definition of an inner join. If we filter the films with FILM_ID < 10, that simply means we don’t want any actors without such films either.
  • A LEFT JOIN will produce all the rows from the left side of the join, regardless of whether there is a matching row on the right side of the join.

In both cases, the matching rows are determined by the ON clause. If two rows don’t match, then:

  • The INNER JOIN removes them both from the result
  • The LEFT JOIN retains the left row in the result

But regardless of what the JOIN produces, the WHERE clause will again remove rows that do not satisfy the filter. So,

  • In the INNER JOIN case, it does not matter if we remove actors with no films, and then actors without films with FILM_ID < 10, OR if we remove actors with no films with FILM_ID < 10 directly. They’re going to be removed anyway.
  • In the LEFT JOIN case, it does matter if we retain actors with no films, and then remove actors without films with FILM_ID < 10 (in which case actors without films will be removed again), OR if we retain actors without films with FILM_ID < 10, and then not apply any further filters.
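
The order of these two steps can be simulated in a few lines of plain Java. This is only a sketch of the logical semantics, not of how a database executes the query. The JoinSketch class and its tiny data set are made up for this illustration and are not the Sakila data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class JoinSketch {
    // Returns COUNT(film_id) per actor_id for:
    //   actor [LEFT] JOIN film_actor ON actor_id matches AND <on>
    //   WHERE <where>
    // A null film id models the NULL-padded row of an outer join.
    static Map<Integer, Integer> countFilms(
        List<Integer> actors,
        List<int[]> filmActors,          // each entry: {actor_id, film_id}
        boolean leftJoin,
        Predicate<Integer> on,
        Predicate<Integer> where
    ) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();

        for (Integer actorId : actors) {
            // Step 1: the JOIN, where matches are decided by the ON clause.
            List<Integer> joined = new ArrayList<>();
            for (int[] fa : filmActors)
                if (fa[0] == actorId && on.test(fa[1]))
                    joined.add(fa[1]);

            // A LEFT JOIN retains the left row even without a match.
            if (joined.isEmpty() && leftJoin)
                joined.add(null);

            // Step 2: the WHERE clause filters whatever the JOIN produced.
            for (Integer filmId : joined)
                if (where.test(filmId))
                    // COUNT(film_id) ignores NULLs, but the group still exists.
                    counts.merge(actorId, filmId == null ? 0 : 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up data: actor 3 only has film 50, actor 4 has no films.
        List<Integer> actors = List.of(1, 2, 3, 4);
        List<int[]> filmActors = List.of(
            new int[] {1, 1}, new int[] {1, 23},
            new int[] {2, 3}, new int[] {3, 50});

        // film_id < 10 is not true for NULL, as in SQL.
        Predicate<Integer> lt10 = f -> f != null && f < 10;
        Predicate<Integer> all = f -> true;

        System.out.println("INNER, WHERE: " + countFilms(actors, filmActors, false, all, lt10));
        System.out.println("INNER, ON:    " + countFilms(actors, filmActors, false, lt10, all));
        System.out.println("LEFT, WHERE:  " + countFilms(actors, filmActors, true, all, lt10));
        System.out.println("LEFT, ON:     " + countFilms(actors, filmActors, true, lt10, all));
    }
}
```

Only the last variant keeps actors 3 and 4, each with a count of 0. The other three variants return actors 1 and 2 only.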

Conclusion

For INNER JOIN, WHERE predicates and ON predicates have the same effect.

For OUTER JOIN, WHERE predicates and ON predicates have a different effect.

In general, it is always best to put a predicate where it belongs, logically. If the predicate is related to a JOIN operation, it belongs in the ON clause. If a predicate is related to a filter applied to the entire FROM clause, it belongs in the WHERE clause.