A Curious Incidence of a jOOQ API Design Flaw


jOOQ is an internal domain-specific language (DSL), modelling the SQL language (external DSL) in Java (the host language). The main mechanism of the jOOQ API is described in this popular article:

The Java Fluent API Designer Crash Course.

Anyone can implement an internal DSL in Java (or in most other host languages) according to the rules from that article.

An example SQL language feature: BOOLEANs

One of the nice things about the SQL language is the BOOLEAN type, which was introduced late into the language, as of SQL:1999. Sure, without booleans, you can just model TRUE and FALSE values via 1 and 0, and transform predicates into those values using CASE:

CASE WHEN A = B THEN 1 ELSE 0 END

But with true BOOLEAN support, you can do awesome queries like the following PostgreSQL query that is run against the Sakila database:

SELECT
  f.title, 
  string_agg(a.first_name, ', ') AS actors
FROM film AS f
JOIN film_actor AS fa USING (film_id)
JOIN actor AS a USING (actor_id)
GROUP BY film_id
HAVING every(a.first_name LIKE '%A%')

The above yields:

TITLE                    ACTORS
-----------------------------------------------------
AMISTAD MIDSUMMER        CARY, DARYL, SCARLETT, SALMA
ANNIE IDENTITY           CATE, ADAM, GRETA
ANTHEM LUKE              MILLA, OPRAH
ARSENIC INDEPENDENCE     RITA, CUBA, OPRAH
BIRD INDEPENDENCE        FAY, JAYNE
...

In other words, we’re looking for all the films where all the actors who played in the film contain the letter “A” in their first names. This is done via an aggregation on the boolean expression / predicate first_name LIKE '%A%':

HAVING every(a.first_name LIKE '%A%')
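For readers who think in Java rather than in SQL, the effect of this HAVING clause can be sketched with plain collections. This only illustrates the semantics and says nothing about how PostgreSQL evaluates the query. The Role class and the film "HYPOTHETICAL FILM" are made up for the example, and rows are grouped by title instead of film_id to keep things short.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class EveryAggregate {

    // Hypothetical row type: one row per (film, actor) pair after the joins
    static final class Role {
        final String title;
        final String firstName;

        Role(String title, String firstName) {
            this.title = title;
            this.firstName = firstName;
        }
    }

    // Returns title -> aggregated actor names, for films where every actor matches
    static Map<String, String> filmsWhereEveryActorMatches(List<Role> roles) {
        // GROUP BY: collect the first names per film, in encounter order
        Map<String, List<String>> namesByFilm = new LinkedHashMap<>();
        for (Role role : roles) {
            namesByFilm.computeIfAbsent(role.title, t -> new ArrayList<>())
                       .add(role.firstName);
        }

        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> film : namesByFilm.entrySet()) {
            // HAVING every(first_name LIKE '%A%')
            boolean every = film.getValue().stream()
                                .allMatch(name -> name.contains("A"));
            if (every) {
                // string_agg(first_name, ', ')
                result.put(film.getKey(), String.join(", ", film.getValue()));
            }
        }
        return result;
    }
}
```

A film with the actors ADAM and BOB would be dropped, because the predicate is false for BOB and every() is only true if the predicate holds for each row of the group.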

Now, in the terms of the jOOQ API, this means we’ll have to provide overloads of the having() method that take different argument types, such as:

// These accept "classic" predicates
having(Condition... conditions);
having(Collection<? extends Condition> conditions);

// These accept a BOOLEAN type
having(Field<Boolean> condition);

Of course, these overloads are available for any API method that accepts predicates / boolean values, not just for the HAVING clause.

As mentioned before, since SQL:1999, predicates and BOOLEAN values are really the same thing, and so are jOOQ’s Condition and Field<Boolean>. jOOQ allows for converting between the two via explicit API:

Condition condition1 = FIRST_NAME.like("%A%");
Field<Boolean> field = field(condition1);
Condition condition2 = condition(field);

… and the overloads make conversion more conveniently implicit.

So, what’s the problem?

The problem is that we thought it might be a good idea to add yet another convenient overload, the having(Boolean) method, which allows for introducing constant, nullable BOOLEAN values into the query. This can be useful when building dynamic SQL, or when commenting out some predicates:

DSL.using(configuration)
   .select()
   .from(TABLE)
   .where(true)
// .and(predicate1)
   .and(predicate2)
// .and(predicate3)
   .fetch();

The idea is that the WHERE keyword will never be commented out, regardless of which predicate you want to temporarily remove.

Unfortunately, adding this overload introduced a nuisance to developers using IDE auto-completion. Consider the following two method calls:

// Using jOOQ API
Condition condition1 = FIRST_NAME.eq   ("ADAM");
Condition condition2 = FIRST_NAME.equal("ADAM");

// Using Object.equals (accident)
boolean condition3 = FIRST_NAME.equals("ADAM");

By (accidentally) adding a letter “s” to the equal() method – mostly because of IDE autocompletion – the whole predicate expression changes semantics drastically, from a jOOQ expression tree element that can be used to generate SQL to an “ordinary” boolean value (which always yields false, obviously).

Prior to having added the last overload, this wasn’t a problem. The equals() method usage wouldn’t compile, as there was no applicable overload taking a Java boolean type.

// These accept "classic" predicates
having(Condition condition);
having(Condition... conditions);
having(Collection<? extends Condition> conditions);

// These accept a BOOLEAN type
having(Field<Boolean> condition);

// This method didn't exist prior to jOOQ 3.7
// having(Boolean condition);

After jOOQ 3.7, this accident started to go unnoticed in user code as the compiler no longer complained, leading to wrong SQL.
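The accident can be reproduced without jOOQ. The following is a minimal sketch in which Field, Condition and having() are hypothetical stand-ins for the real jOOQ types, reduced to string rendering. It only shows how Java's overload resolution picks the Boolean overload once it exists.

```java
final class OverloadAccident {

    // Hypothetical stand-in for a jOOQ predicate
    static final class Condition {
        final String sql;

        Condition(String sql) {
            this.sql = sql;
        }
    }

    // Hypothetical stand-in for a jOOQ column expression
    static final class Field {
        final String name;

        Field(String name) {
            this.name = name;
        }

        // The DSL method: builds an expression tree element
        Condition equal(String value) {
            return new Condition(name + " = '" + value + "'");
        }

        // equals(Object) is not declared here. It is inherited from
        // java.lang.Object and compares references.
    }

    // The "classic" overload
    static String having(Condition condition) {
        return "HAVING " + condition.sql;
    }

    // The convenience overload. Without it, having(field.equals("ADAM"))
    // does not compile, because a boolean is not a Condition.
    static String having(Boolean condition) {
        return "HAVING " + condition;
    }
}
```

With both overloads in place, having(firstName.equal("ADAM")) renders a comparison, whereas having(firstName.equals("ADAM")) compiles just as well and renders "HAVING false": the primitive boolean returned by Object.equals() is boxed and matches having(Boolean).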

Conclusion: Be careful when designing an internal DSL. You inherit the host language’s “flaws”

Java is “flawed” in that every reference type is guaranteed to inherit from java.lang.Object and with it, its methods: getClass(), clone(), finalize(), equals(), hashCode(), toString(), notify(), notifyAll(), and wait().

In most APIs, this isn’t really that much of a problem. You don’t really need to re-use any of the above method names (please, don’t).

But when designing an internal DSL, these Object method names (just like the language keywords) limit you in your design space. This is particularly obvious in the case of equal(s).

We’ve learned our lesson: we’ve deprecated the having(Boolean) overload and all similar overloads, and we will remove them again.

Advanced Java Trickery for Typesafe Query DSLs


When browsing Hacker News, I recently stumbled upon Benji Weber’s most interesting attempt at creating typesafe database interaction with Java 8. Benji created a typesafe query DSL somewhat similar to jOOQ with the important difference that it uses Java 8 method references to introspect POJOs and deduce query elements from it. This is best explained by example:

Optional<Person> person = 
    from(Person.class)
        .where(Person::getFirstName)
        .like("%ji")
        .and(Person::getLastName)
        .equalTo("weber")
        .select(
            personMapper, 
            connectionFactory::openConnection);

The above query can then be transformed into the following SQL statement:

SELECT * FROM person 
WHERE first_name LIKE ? AND last_name = ?
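How can a method reference such as Person::getFirstName be turned into a column name? One possible mechanism is to make the functional interface Serializable and to read the referenced method's name from java.lang.invoke.SerializedLambda. This is an assumption on our side, not necessarily what Benji's library does. The Getter interface, the Person class and the camel case to snake case naming rule below are all made up for this sketch.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.function.Function;

final class MethodRefColumns {

    // Hypothetical functional interface: Serializable, so that the
    // method reference carries a SerializedLambda description of itself
    interface Getter<T, R> extends Function<T, R>, Serializable {}

    // Hypothetical POJO
    public static final class Person {
        public String getFirstName() { return null; }
        public String getLastName()  { return null; }
    }

    // Derives a column name such as "first_name" from Person::getFirstName
    static <T, R> String columnName(Getter<T, R> getter) {
        try {
            // Serializable lambdas get a synthetic writeReplace() method
            // that returns their SerializedLambda form
            Method writeReplace = getter.getClass().getDeclaredMethod("writeReplace");
            writeReplace.setAccessible(true);
            SerializedLambda lambda = (SerializedLambda) writeReplace.invoke(getter);

            // For a plain method reference, this is the referenced method's name
            String method = lambda.getImplMethodName();
            String property = method.startsWith("get") ? method.substring(3) : method;

            // Assumed naming rule: FirstName -> first_name
            return property.replaceAll("([a-z0-9])([A-Z])", "$1_$2").toLowerCase();
        }
        catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Not a serializable method reference", e);
        }
    }
}
```

A where(Person::getFirstName) method could call columnName() on its argument and append the result to the SQL string. Note that the sketch relies on reflective access to a synthetic method, which is exactly the kind of cleverness that may behave differently across compilers and runtimes.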

This is indeed a very interesting approach, and we’ve seen similar ideas around, before. Most prominently, such ideas were implemented in:

  • JaQu, another very interesting competitor product of jOOQ, created by Thomas Müller, the maintainer of the popular H2 database
  • LambdaJ, an attempt to bring lambda expressions to Java long before Java 8
  • OhmDB, a new NoSQL data store with a fluent query DSL

What’s new in Benji’s approach is really the fact that Java 8 method references can be used instead of resorting to CGLIB and other sorts of bytecode trickery through instrumentation. An example of such trickery is JaQu’s experimental bytecode introspection to transform complex Java boolean expressions into SQL – called “natural syntax”:

Timestamp ts = 
  Timestamp.valueOf("2005-05-05 05:05:05");
Time t = Time.valueOf("23:23:23");

long count = db.from(co).
    where(new Filter() { public boolean where() {
        return co.id == x
            && co.name.equals(name)
            && co.value == new BigDecimal("1")
            && co.amount == 1L
            && co.birthday.before(new Date())
            && co.created.before(ts)
            && co.time.before(t);
        } }).selectCount();

While these ideas are certainly very interesting to play around with, we doubt such language and bytecode transformations will lead to robust results. People have criticised Hibernate’s use of proxying in various blog posts.

We prefer a WYSIWYG approach where API consumers remain in full control of what is going on. What are your thoughts about such clever ideas?

Simplernate. A Query DSL for Hibernate, Inspired by jOOQ


“Simplernate”! The name is very compelling. And so is the idea, we think. Specifically, because we get most of the credit for this new project in the following short readme:
https://gist.github.com/thermz/eb1b12b2146168a08e68

It reads:

Simplernate is (will be) an Hibernate wrapper that help developer to query the database using Hibernate ORM. To do this, Simplernate offers a syntax inspired by jOOQ philosofy [sic].

This obviously reminds us of Torpedoquery, another typesafe query DSL that wraps Hibernate and Hibernate’s HQL. Unfortunately, there isn’t that much to see yet, but we will certainly follow up on future progress of a project with such a promising name!

The Java Fluent API Designer Crash Course


Ever since Martin Fowler’s talks about fluent interfaces, people have started chaining methods all over the place, creating fluent APIs (or DSLs) for every possible use case. In principle, almost every type of DSL can be mapped to Java. Let’s have a look at how this can be done:

DSL rules

DSLs (Domain Specific Languages) are usually built up from rules that roughly look like these:


1. SINGLE-WORD
2. PARAMETERISED-WORD parameter
3. WORD1 [ OPTIONAL-WORD ]
4. WORD2 { WORD-CHOICE-A | WORD-CHOICE-B }
5. WORD3 [ , WORD3 ... ]

Alternatively, you could also declare your grammar like this (as supported by this nice Railroad Diagrams site):


Grammar ::= ( 
  'SINGLE-WORD' | 
  'PARAMETERISED-WORD' '('[A-Z]+')' |
  'WORD1' 'OPTIONAL-WORD'? | 
  'WORD2' ( 'WORD-CHOICE-A' | 'WORD-CHOICE-B' ) | 
  'WORD3'+ 
)

Put in words, you have a start condition or state, from which you can choose some of your language’s words before reaching an end condition or state. It’s like a state machine, and can thus be drawn in a picture like this:

Figure: A simple grammar created with http://www.bottlecaps.de/rr/ui

Java implementation of those rules

With Java interfaces, it is quite simple to model the above DSL. In essence, you have to follow these transformation rules:

  • Every DSL “keyword” becomes a Java method
  • Every DSL “connection” becomes an interface
  • When you have a “mandatory” choice (you can’t skip the next keyword), every keyword of that choice is a method in the current interface. If only one keyword is possible, then there is only one method
  • When you have an “optional” keyword, the current interface extends the next one (with all its keywords / methods)
  • When you have a “repetition” of keywords, the method representing the repeatable keyword returns the interface itself, instead of the next interface
  • Every DSL subdefinition becomes a parameter. This will allow for recursiveness

Note, it is possible to model the above DSL with classes instead of interfaces, as well. But as soon as you want to reuse similar keywords, multiple inheritance of methods may come in very handy and you might just be better off with interfaces.

With these rules set up, you can repeat them at will to create DSLs of arbitrary complexity, like jOOQ. Of course, you’ll have to somehow implement all the interfaces, but that’s another story.

Here’s how the above rules are translated to Java:

// Initial interface, entry point of the DSL
// Depending on your DSL's nature, this can also be a class with static
// methods which can be static imported making your DSL even more fluent
interface Start {
  End singleWord();
  End parameterisedWord(String parameter);
  Intermediate1 word1();
  Intermediate2 word2();
  Intermediate3 word3();
}

// Terminating interface, might also contain methods like execute();
interface End {
  void end();
}

// Intermediate DSL "step" extending the interface that is returned
// by optionalWord(), to make that method "optional"
interface Intermediate1 extends End {
  End optionalWord();
}

// Intermediate DSL "step" providing several choices (similar to Start)
interface Intermediate2 {
  End wordChoiceA();
  End wordChoiceB();
}

// Intermediate interface returning itself on word3(), in order to allow
// for repetitions. Repetitions can be ended any time because this 
// interface extends End
interface Intermediate3 extends End {
  Intermediate3 word3();
}

With the above grammar defined, we can now use this DSL directly in Java. Here are all the possible constructs:

Start start = // ...

start.singleWord().end();
start.parameterisedWord("abc").end();

start.word1().end();
start.word1().optionalWord().end();

start.word2().wordChoiceA().end();
start.word2().wordChoiceB().end();

start.word3().end();
start.word3().word3().end();
start.word3().word3().word3().end();
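As for implementing the interfaces, here is one minimal way to do it: a single class implements every step and collects the words that were "spoken". The Sentence class, its start() factory and the idea of writing the finished sentence to a list are assumptions made for this sketch. The interfaces are repeated so that the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

interface Start {
    End singleWord();
    End parameterisedWord(String parameter);
    Intermediate1 word1();
    Intermediate2 word2();
    Intermediate3 word3();
}

interface End { void end(); }
interface Intermediate1 extends End { End optionalWord(); }
interface Intermediate2 { End wordChoiceA(); End wordChoiceB(); }
interface Intermediate3 extends End { Intermediate3 word3(); }

// Hypothetical implementation: one class plays every "step" of the DSL.
// Callers only ever see the narrow interface returned by each method.
final class Sentence implements Start, Intermediate1, Intermediate2, Intermediate3 {

    private final List<String> words = new ArrayList<>();
    private final List<String> sink;

    private Sentence(List<String> sink) {
        this.sink = sink;
    }

    // Entry point: every sentence starts with a fresh instance
    static Start start(List<String> sink) {
        return new Sentence(sink);
    }

    private Sentence add(String word) {
        words.add(word);
        return this;
    }

    @Override public End singleWord()          { return add("SINGLE-WORD"); }
    @Override public End parameterisedWord(String parameter) {
        return add("PARAMETERISED-WORD " + parameter);
    }
    @Override public Intermediate1 word1()     { return add("WORD1"); }
    @Override public Intermediate2 word2()     { return add("WORD2"); }
    @Override public Intermediate3 word3()     { return add("WORD3"); }
    @Override public End optionalWord()        { return add("OPTIONAL-WORD"); }
    @Override public End wordChoiceA()         { return add("WORD-CHOICE-A"); }
    @Override public End wordChoiceB()         { return add("WORD-CHOICE-B"); }

    // Terminal operation: hand the finished sentence to the sink
    @Override public void end() {
        sink.add(String.join(" ", words));
    }
}
```

Because each method returns only the interface of the next step, the compiler rejects sentences that the grammar does not allow, such as word2() directly followed by end(), even though the same object is behind every step.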

And the best thing is, your DSL compiles directly in Java! You get a free parser. You can also re-use this DSL in Scala (or Groovy) using the same notation, or a slightly different one in Scala, omitting dots “.” and parentheses “()”:

 val start = // ...

 (start singleWord) end;
 (start parameterisedWord "abc") end;

 (start word1) end;
 ((start word1) optionalWord) end;

 ((start word2) wordChoiceA) end;
 ((start word2) wordChoiceB) end;

 (start word3) end;
 ((start word3) word3) end;
 (((start word3) word3) word3) end;

Real world examples

Some real world examples can be seen all across the jOOQ documentation and code base. Here’s an extract from a previous post of a rather complex SQL query created with jOOQ:

create().select(
    r1.ROUTINE_NAME,
    r1.SPECIFIC_NAME,
    decode()
        .when(exists(create()
            .selectOne()
            .from(PARAMETERS)
            .where(PARAMETERS.SPECIFIC_SCHEMA.equal(r1.SPECIFIC_SCHEMA))
            .and(PARAMETERS.SPECIFIC_NAME.equal(r1.SPECIFIC_NAME))
            .and(upper(PARAMETERS.PARAMETER_MODE).notEqual("IN"))),
                val("void"))
        .otherwise(r1.DATA_TYPE).as("data_type"),
    r1.NUMERIC_PRECISION,
    r1.NUMERIC_SCALE,
    r1.TYPE_UDT_NAME,
    decode().when(
    exists(
        create().selectOne()
            .from(r2)
            .where(r2.ROUTINE_SCHEMA.equal(getSchemaName()))
            .and(r2.ROUTINE_NAME.equal(r1.ROUTINE_NAME))
            .and(r2.SPECIFIC_NAME.notEqual(r1.SPECIFIC_NAME))),
        create().select(count())
            .from(r2)
            .where(r2.ROUTINE_SCHEMA.equal(getSchemaName()))
            .and(r2.ROUTINE_NAME.equal(r1.ROUTINE_NAME))
            .and(r2.SPECIFIC_NAME.lessOrEqual(r1.SPECIFIC_NAME)).asField())
    .as("overload"))
.from(r1)
.where(r1.ROUTINE_SCHEMA.equal(getSchemaName()))
.orderBy(r1.ROUTINE_NAME.asc())
.fetch()

Here’s another example from a library that looks quite appealing to me. It’s called jRTF and it’s used to create RTF documents in Java in a fluent style:

rtf()
  .header(
    color( 0xff, 0, 0 ).at( 0 ),
    color( 0, 0xff, 0 ).at( 1 ),
    color( 0, 0, 0xff ).at( 2 ),
    font( "Calibri" ).at( 0 ) )
  .section(
        p( font( 1, "Second paragraph" ) ),
        p( color( 1, "green" ) )
  )
  .out( out );

Summary

Fluent APIs have been hyped for the last 7 years. Martin Fowler has become a heavily cited man and gets most of the credit, even if fluent APIs were there before. One of Java’s oldest “fluent APIs” can be seen in java.lang.StringBuffer, whose append() methods accept arbitrary objects and return the buffer itself, so that calls can be chained. But the biggest benefit of a fluent API is its ability to easily map “external DSLs” into Java and implement them as “internal DSLs” of arbitrary complexity.
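To illustrate the StringBuffer remark, here is a tiny sketch. The class and method names are made up. The only thing it relies on is that every append() overload returns the StringBuffer it was called on.

```java
final class StringBufferFluency {

    // Chains several append() overloads (String and int) in one expression
    static String describe(String author, int books) {
        return new StringBuffer()
            .append(author)
            .append(" wrote ")
            .append(books)
            .append(" books")
            .toString();
    }
}
```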

The ultimate SQL-DSL: jOOQ in Scala


I’ve recently come across some advertising for the new upcoming version of Scala IDE for Eclipse, which made me remember my college programming lessons at the EPFL Laboratoire des Méthodes de Programmation (LAMP), the origin of the Scala language. Back then, Scala appeared quite freaky. Very elegant, a bit inefficient, somewhat dogmatic. It was much more functional than object oriented, from what I recall, and Martin Odersky had a hard time agreeing that the key to success is to combine the two paradigms. But Scala has come a long way in the last 8 years. So I was wondering if jOOQ was portable to Scala. The answer amazes me:

jOOQ is 100% Scala-ready !!

Obviously, this is not due to jOOQ’s fluent API alone. It’s mostly because of how Scala was built on top of Java. Check out this piece of sample code:

package org.jooq.scala

import java.sql.Connection
import java.sql.DriverManager

// This makes integration of Java into Scala easier
import scala.collection.JavaConversions._

// Import all relevant things from jOOQ
import org.jooq.impl.Factory._
import org.jooq.util.maven.example.mysql.Test2Factory
import org.jooq.util.maven.example.mysql.Tables._

object Test {
  def main(args: Array[String]) {

    // This is business as usual. I guess there's
    // also a "Scala way" to do this...?
    Class.forName("com.mysql.jdbc.Driver");
    val connection = DriverManager.getConnection(
      "jdbc:mysql://localhost/test", "root", "");
    val create = new Test2Factory(connection);

    // Fetch book titles and their respective authors into
    // a result, and print the result to the console. Wow!
    // If this doesn't feel like SQL to you...?
    val result = (create
      select (
          T_BOOK.TITLE as "book title",
          T_AUTHOR.FIRST_NAME as "author's first name",
          T_AUTHOR.LAST_NAME as "author's last name")
      from T_AUTHOR
      join T_BOOK on (T_AUTHOR.ID equal T_BOOK.AUTHOR_ID)
      where (T_AUTHOR.ID in (1, 2, 3))
      orderBy (T_AUTHOR.LAST_NAME asc) fetch)

    // Print the result to the console
    println(result)

    // Iterate over authors and the number of books they've written
    // Print each value to the console
    for (r <- (create
               select (T_AUTHOR.FIRST_NAME, T_AUTHOR.LAST_NAME, count)
               from T_AUTHOR
               join T_BOOK on (T_AUTHOR.ID equal T_BOOK.AUTHOR_ID)
               where (T_AUTHOR.ID in (1, 2, 3))
               groupBy (T_AUTHOR.FIRST_NAME, T_AUTHOR.LAST_NAME)
               orderBy (T_AUTHOR.LAST_NAME asc)
               fetch)) {

      // Accessing record data is just like in Java
      print(r.getValue(T_AUTHOR.FIRST_NAME))
      print(" ")
      print(r.getValue(T_AUTHOR.LAST_NAME))
      print(" wrote ")
      print(r.getValue(count))
      println(" books ")
    }
  }
}

As expected, the console contains this data

+------------+-------------------+------------------+
|book title  |author's first name|author's last name|
+------------+-------------------+------------------+
|O Alquimista|Paulo              |Coelho            |
|Brida       |Paulo              |Coelho            |
|1984        |George             |Orwell            |
|Animal Farm |George             |Orwell            |
+------------+-------------------+------------------+

Paulo Coelho wrote 2 books 
George Orwell wrote 2 books 

You get 2 in 1

With Scala, jOOQ’s fluent API looks even more like SQL than in Java. And you get 2 in 1:

  1. Typesafe querying, meaning that your SQL syntax is compiled
  2. Typesafe querying, meaning that your database schema is part of the code

The biggest drawback I can see so far is that Scala ships with new reserved words, such as val, a very important method in jOOQ. I guess that could be sorted out somehow. So, Scala users and SQL enthusiasts: please send feedback :-)

jOOQ-meta. A “hard-core SQL” proof of concept


jOOQ-meta is more than just meta data navigation for your database schema. It is also a proof of concept for the more complex jOOQ queries. It is easy for users to believe that simple vanilla queries of this form will work:

create.selectFrom(AUTHOR).where(LAST_NAME.equal("Cohen"));

But jOOQ claims to be a “hard-core SQL library”.

A “hard-core SQL” example

So let’s have a little look at some of jOOQ-meta’s hard-core SQL. Here’s a nice Postgres query that maps Postgres stored functions to jOOQ’s common concept of routines. There are two very curious features in Postgres, which are modelled by the example query:

  1. Postgres only knows functions. If functions have one OUT parameter, then that parameter can be treated as the function return value. If functions have more than one OUT parameter, then those parameters can be treated as a function return cursor. In other words, all functions are tables. Quite interesting indeed. But for now, jOOQ doesn’t support that, so several OUT parameters need to be treated as a void result.
  2. Postgres allows for overloading standalone functions (which isn’t allowed in Oracle, for instance). So in order to generate an overload index for every function directly in a SQL statement, I’m running a SELECT COUNT(*) subselect within a CASE expression, defaulting to null.
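The overload index from the second point may be easier to follow when spelled out procedurally. The following Java sketch mirrors the logic of the CASE expression. The Routine class is hypothetical, the schema filter is left out, and Java's String.compareTo() stands in for the database's string comparison, which may order names differently.

```java
import java.util.List;

final class OverloadIndex {

    // Hypothetical stand-in for a row of information_schema.routines
    static final class Routine {
        final String routineName;
        final String specificName;

        Routine(String routineName, String specificName) {
            this.routineName = routineName;
            this.specificName = specificName;
        }
    }

    // Returns null for routines that are not overloaded, otherwise the
    // 1-based position of r1 among the routines sharing its name
    static Integer overloadIndex(Routine r1, List<Routine> all) {
        // CASE WHEN EXISTS (same name, different specific name)
        boolean overloaded = all.stream().anyMatch(r2 ->
               r2.routineName.equals(r1.routineName)
            && !r2.specificName.equals(r1.specificName));

        if (!overloaded) {
            return null; // no ELSE branch in SQL: defaults to NULL
        }

        // THEN (SELECT COUNT(*) ... WHERE specific_name <= r1.specific_name)
        long count = all.stream().filter(r2 ->
               r2.routineName.equals(r1.routineName)
            && r2.specificName.compareTo(r1.specificName) <= 0).count();

        return (int) count;
    }
}
```

For two functions named f with the specific names f_1 and f_2, plus a single function g, this yields 1 and 2 for the two overloads of f and null for g.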

Beware. SQL ahead! No dummy query!

Let’s have a look at the SQL:

Routines r1 = ROUTINES.as("r1");
Routines r2 = ROUTINES.as("r2");

for (Record record : create().select(
        r1.ROUTINE_NAME,
        r1.SPECIFIC_NAME,

        // 1. Ignore the data type when there is at least one out parameter
        decode()
            .when(exists(create()
                .selectOne()
                .from(PARAMETERS)
                .where(PARAMETERS.SPECIFIC_SCHEMA.equal(r1.SPECIFIC_SCHEMA))
                .and(PARAMETERS.SPECIFIC_NAME.equal(r1.SPECIFIC_NAME))
                .and(upper(PARAMETERS.PARAMETER_MODE).notEqual("IN"))),
                    val("void"))
            .otherwise(r1.DATA_TYPE).as("data_type"),
        r1.NUMERIC_PRECISION,
        r1.NUMERIC_SCALE,
        r1.TYPE_UDT_NAME,

        // 2. Calculate overload index if applicable
        decode().when(
        exists(
            create().selectOne()
                .from(r2)
                .where(r2.ROUTINE_SCHEMA.equal(getSchemaName()))
                .and(r2.ROUTINE_NAME.equal(r1.ROUTINE_NAME))
                .and(r2.SPECIFIC_NAME.notEqual(r1.SPECIFIC_NAME))),
            create().select(count())
                .from(r2)
                .where(r2.ROUTINE_SCHEMA.equal(getSchemaName()))
                .and(r2.ROUTINE_NAME.equal(r1.ROUTINE_NAME))
                .and(r2.SPECIFIC_NAME.lessOrEqual(r1.SPECIFIC_NAME)).asField())
        .as("overload"))
    .from(r1)
    .where(r1.ROUTINE_SCHEMA.equal(getSchemaName()))
    .orderBy(r1.ROUTINE_NAME.asc())
    .fetch()) {

    // [...] do the loop

The above SQL statement is executed when you generate source code for Postgres and works like a charm. With jOOQ 2.0, the DSL will become even less verbose and more powerful. You couldn’t write much less SQL when using JDBC directly. And you can forget about writing such a query with other products, such as JPA, JPQL, HQL, etc. :-)

For comparison, this is what jOOQ renders (for better readability, I removed the escaping of table/field names):

select
  r1.routine_name,
  r1.specific_name,
  case when exists (
            select 1 from information_schema.parameters
            where (information_schema.parameters.specific_schema
              = r1.specific_schema
            and information_schema.parameters.specific_name
              = r1.specific_name
            and upper(information_schema.parameters.parameter_mode)
              <> 'IN'))
       then 'void'
       else r1.data_type
       end as data_type,
  r1.numeric_precision,
  r1.numeric_scale,
  r1.type_udt_name,
  case when exists (
            select 1 from information_schema.routines as r2
            where (r2.routine_schema = 'public'
            and r2.routine_name = r1.routine_name
            and r2.specific_name <> r1.specific_name))
       then (select count(*)
             from information_schema.routines as r2
             where (r2.routine_schema = 'public'
             and r2.routine_name = r1.routine_name
             and r2.specific_name <= r1.specific_name))
       end as overload
from information_schema.routines as r1
where r1.routine_schema = 'public'
order by r1.routine_name asc

Op4j and Lambda-J. For more fluency in Java


I recently blogged about simple constructs, such as Java’s Arrays.asList() and the fact that it is not used often enough:

https://lukaseder.wordpress.com/2011/10/28/javas-arrays-aslist-is-underused/

I like to work with fluent APIs, which are still quite a rare thing in the Java world, compared to other languages that support features such as language extensions, operator overloading, true generics, extension methods, closures, lambda expressions, functional constructs, etc. But I also like Java’s JVM and the general syntax. And the many libraries that exist. I now came across Op4j, a really nice-looking library:

http://www.op4j.org/

It features exactly the kind of constructs I’d like to use every day. Some examples (taken from the documentation):

// Always static import Op.* as the main entry point
import static org.op4j.Op.*;
import static org.op4j.functions.FnString.*;

// Transform an array to uppercase
String[] values = ...;
List upperStrs =
  on(values).toList().map(toUpperCase()).get();

// Convert strings to integers
String[] values = ...;
List intValueList =
  on(values).toList().forEach().exec(toInteger()).get();

There are many more examples on their documentation page, and the API is huge and looks quite extensible:

http://www.op4j.org/apidocs/op4j/index.html

This library reminds me of Lambda-J, another attempt to bring more fluency to Java by introducing closure/lambda-like expressions in a static way:

http://code.google.com/p/lambdaj/

From a first look, Op4j looks more object oriented and straight-forward, though, whereas Lambda-J seems to depend on instrumentation and some advanced usage of reflection. A sample of some non-trivial Lambda-J usage:

Closure println = closure(); {
  of(System.out).println(var(String.class));
}

The above syntax is not easy to grasp. “closure()” seems to modify some static (ThreadLocal) state of the library, which can be used thereafter by the static method “of()”. “of()” in turn can take any type of parameter assuming its identity and type (!). Somehow, you can then “apply” objects of type String to the defined closure:

println.apply("one");
println.each("one", "two", "three");