ElSql, a new external SQL DSL for Java

Stephen Colebourne who is frequently commenting on the lambda-dev and other Java 8 mailing lists, has recently published an idea he has been having for a while: ElSql, a new external SQL DSL for Java.

An example SQL statement is given on the blog posts or on GitHub:

 @NAME(SelectBlogs)
   @PAGING(:paging_offset,:paging_fetch)
     SELECT @INCLUDE(CommonFields)
     FROM blogs
     WHERE id = :id
       @AND(:date)
         date > :date
       @AND(:active)
         active = :active
     ORDER BY title, author
 @NAME(CommonFields)
   title, author, content

As can be seen, this is almost mere string-based SQL, enhanced with some “hooks” for later use in Java. While Stephen’s idea here is to keep it simple (much simpler than jOOQ) it shows some useful applications of allowing to write external DSLs which resemble SQL even more than jOOQ.

DSL authoring support is becoming a more and more interesting topic in various platforms. Eclipse is developing Xtext, Scala is experimenting with even more powerful Macros. While Xtext doesn’t allow to mix the DSL grammar with Java, Scala’s Macros do precisely that. They allow for simple compile-time checked Scala syntax extension, if you have a powerful enough CPU to compile such source code. Where this could go for jOOQ was discussed on this blog before

For more details, read about ElSql on Stephen’s or the OpenGamma blog:

jOOQ’s fluent API in BNF notation

I have recently posted an article about how to generally design a fluent API in Java. By fluent API, I don’t mean simple constructs such as

new Builder().withSomething(x)
             .withSomethingElse(y)
             .withSomething(z)
             .withAnotherThing(xx);

The above is just simple method-chaining, without any sophisticated formal language definition. Most often in method-chaining contexts, the order of method calls is irrelevant to the API and may be chosen freely (i.e. you can call “withAnotherThing()” first, then “withSomethingElse()”. Instead, I mean a full-fledged domain specific language, formally defined using BNF notation (or anything equivalent).

jOOQ’s fluent API

Today, I’d like to give some insight about how jOOQ can be formally expressed as a true domain specific language using BNF notation. jOOQ in itself has actually evolved to become a SQL dialect of its own. It knows most of the standard DML SQL statements and clauses, as well as many vendor-specific ones. Let’s have a look at jOOQ 2.0’s understanding of a SELECT statement:

SELECT ::=
   ( ( 'select' | 'selectDistinct' ) FIELD* |
       'selectOne' |
       'selectZero' |
       'selectCount' )
     ( 'select' FIELD* )*
     ( 'hint' SQL )*
     ( 'from' ( TABLE+ | SQL )
       ( ( 'join' | 'leftOuterJoin' | 'rightOuterJoin' | 'fullOuterJoin' )
         ( TABLE | SQL )
         ( 'on' ( CONDITION+ | SQL ) MORE_CONDITIONS? |
           'using' FIELD+ ) |
         ( 'crossJoin' | 'naturalJoin' |
           'naturalLeftOuterJoin' | 'naturalRightOuterJoin' )
       ( TABLE | SQL ) )* )?
     ( ( 'where' ( CONDITION+ | SQL ) |
         ( 'whereExists' | 'whereNotExists' ) SELECT ) MORE_CONDITIONS? )?
     ( ( 'connectBy' | 'connectByNoCycle' )
       ( CONDITION | SQL )
       ( 'and' ( CONDITION | SQL ) )*
       ( 'startWith' ( CONDITION | SQL ) )? )?
     ( 'groupBy' GROUP_FIELD+ )?
     ( 'having' ( CONDITION+ | SQL ) MORE_CONDITIONS? )?
     ( 'orderBy' ( FIELD+ | SORT_FIELD+ | INT+ ) )?
     ( ( 'limit' INT ( 'offset' INT | INT )? ) |
       ( ( 'forUpdate'
           ( 'of' ( FIELD+ | TABLE+ ) )?
           ( 'wait' INT |
             'noWait' |
             'skipLocked' )? ) |
         'forShare' ) )?
   ( ( 'union' | 'unionAll' | 'except' | 'intersect' ) SELECT )*

Or in the much more readable graphical representation:

Unlike simpler querying API’s such as the JPA Criteria Query API or QueryDSL, for example, jOOQ really emphasises on SQL syntax correctness. In that way, it is impossible to form complex SQL queries in a nonsensical way such as providing “JOIN” clauses without previously declaring at least one table source, or providing “ON” clauses without previously defining a “JOIN” clause, or providing an “ON” clause on a “CROSS JOIN”, etc, etc.

The full jOOQ BNF representation can be seen here:

If you compare this BNF and its output (jOOQ) to what I’ve written in my previous blog post about fluent API design, you can see that there is potential for complete formalisation of this process. In principle, any arbitrary BNF can be formally transformed into a set of Java interfaces modelling your domain specific language directly in Java. I will soon create a prototype source code generator for this, so stay tuned!

The Java Fluent API Designer Crash Course

Ever since Martin Fowler’s talks about fluent interfaces, people have started chaining methods all over the place, creating fluent API’s (or DSLs) for every possible use case. In principle, almost every type of DSL can be mapped to Java. Let’s have a look at how this can be done

DSL rules

DSLs (Domain Specific Languages) are usually built up from rules that roughly look like these


1. SINGLE-WORD
2. PARAMETERISED-WORD parameter
3. WORD1 [ OPTIONAL-WORD ]
4. WORD2 { WORD-CHOICE-A | WORD-CHOICE-B }
5. WORD3 [ , WORD3 ... ]

Alternatively, you could also declare your grammar like this (as supported by this nice Railroad Diagrams site):


Grammar ::= ( 
  'SINGLE-WORD' | 
  'PARAMETERISED-WORD' '('[A-Z]+')' |
  'WORD1' 'OPTIONAL-WORD'? | 
  'WORD2' ( 'WORD-CHOICE-A' | 'WORD-CHOICE-B' ) | 
  'WORD3'+ 
)

Put in words, you have a start condition or state, from which you can choose some of your languages’ words before reaching an end condition or state. It’s like a state-machine, and can thus be drawn in a picture like this:

Simple Grammar
A simple grammar created with http://www.bottlecaps.de/rr/ui

Java implementation of those rules

With Java interfaces, it is quite simple to model the above DSL. In essence, you have to follow these transformation rules:

  • Every DSL “keyword” becomes a Java method
  • Every DSL “connection” becomes an interface
  • When you have a “mandatory” choice (you can’t skip the next keyword), every keyword of that choice is a method in the current interface. If only one keyword is possible, then there is only one method
  • When you have an “optional” keyword, the current interface extends the next one (with all its keywords / methods)
  • When you have a “repetition” of keywords, the method representing the repeatable keyword returns the interface itself, instead of the next interface
  • Every DSL subdefinition becomes a parameter. This will allow for recursiveness

Note, it is possible to model the above DSL with classes instead of interfaces, as well. But as soon as you want to reuse similar keywords, multiple inheritance of methods may come in very handy and you might just be better off with interfaces.

With these rules set up, you can repeat them at will to create DSLs of arbitrary complexity, like jOOQ. Of course, you’ll have to somehow implement all the interfaces, but that’s another story.

Here’s how the above rules are translated to Java:

// Initial interface, entry point of the DSL
// Depending on your DSL's nature, this can also be a class with static
// methods which can be static imported making your DSL even more fluent
interface Start {
  End singleWord();
  End parameterisedWord(String parameter);
  Intermediate1 word1();
  Intermediate2 word2();
  Intermediate3 word3();
}

// Terminating interface, might also contain methods like execute();
interface End {
  void end();
}

// Intermediate DSL "step" extending the interface that is returned
// by optionalWord(), to make that method "optional"
interface Intermediate1 extends End {
  End optionalWord();
}

// Intermediate DSL "step" providing several choices (similar to Start)
interface Intermediate2 {
  End wordChoiceA();
  End wordChoiceB();
}

// Intermediate interface returning itself on word3(), in order to allow
// for repetitions. Repetitions can be ended any time because this 
// interface extends End
interface Intermediate3 extends End {
  Intermediate3 word3();
}

With the above grammar defined, we can now use this DSL directly in Java. Here are all the possible constructs:

Start start = // ...

start.singleWord().end();
start.parameterisedWord("abc").end();

start.word1().end();
start.word1().optionalWord().end();

start.word2().wordChoiceA().end();
start.word2().wordChoiceB().end();

start.word3().end();
start.word3().word3().end();
start.word3().word3().word3().end();

And the best thing is, your DSL compiles directly in Java! You get a free parser. You can also re-use this DSL in Scala (or Groovy) using the same notation, or a slightly different one in Scala, omitting dots “.” and parentheses “()”:

 val start = // ...

 (start singleWord) end;
 (start parameterisedWord "abc") end;

 (start word1) end;
 ((start word1) optionalWord) end;

 ((start word2) wordChoiceA) end;
 ((start word2) wordChoiceB) end;

 (start word3) end;
 ((start word3) word3) end;
 (((start word3) word3) word3) end;

Real world examples

Some real world examples can be seen all across the jOOQ documentation and code base. Here’s an extract from a previous post of a rather complex SQL query created with jOOQ:

create().select(
    r1.ROUTINE_NAME,
    r1.SPECIFIC_NAME,
    decode()
        .when(exists(create()
            .selectOne()
            .from(PARAMETERS)
            .where(PARAMETERS.SPECIFIC_SCHEMA.equal(r1.SPECIFIC_SCHEMA))
            .and(PARAMETERS.SPECIFIC_NAME.equal(r1.SPECIFIC_NAME))
            .and(upper(PARAMETERS.PARAMETER_MODE).notEqual("IN"))),
                val("void"))
        .otherwise(r1.DATA_TYPE).as("data_type"),
    r1.NUMERIC_PRECISION,
    r1.NUMERIC_SCALE,
    r1.TYPE_UDT_NAME,
    decode().when(
    exists(
        create().selectOne()
            .from(r2)
            .where(r2.ROUTINE_SCHEMA.equal(getSchemaName()))
            .and(r2.ROUTINE_NAME.equal(r1.ROUTINE_NAME))
            .and(r2.SPECIFIC_NAME.notEqual(r1.SPECIFIC_NAME))),
        create().select(count())
            .from(r2)
            .where(r2.ROUTINE_SCHEMA.equal(getSchemaName()))
            .and(r2.ROUTINE_NAME.equal(r1.ROUTINE_NAME))
            .and(r2.SPECIFIC_NAME.lessOrEqual(r1.SPECIFIC_NAME)).asField())
    .as("overload"))
.from(r1)
.where(r1.ROUTINE_SCHEMA.equal(getSchemaName()))
.orderBy(r1.ROUTINE_NAME.asc())
.fetch()

Here’s another example from a library that looks quite appealing to me. It’s called jRTF and it’s used to create RTF documents in Java in a fluent style:

rtf()
  .header(
    color( 0xff, 0, 0 ).at( 0 ),
    color( 0, 0xff, 0 ).at( 1 ),
    color( 0, 0, 0xff ).at( 2 ),
    font( "Calibri" ).at( 0 ) )
  .section(
        p( font( 1, "Second paragraph" ) ),
        p( color( 1, "green" ) )
  )
).out( out );

Summary

Fluent APIs have been a hype for the last 7 years. Martin Fowler has become a heavily-cited man and gets most of the credits, even if fluent APIs were there before. One of Java’s oldest “fluent APIs” can be seen in java.lang.StringBuffer, which allows for appending arbitrary objects to a String. But the biggest benefit of a fluent API is its ability to easily map “external DSLs” into Java and implement them as “internal DSLs” of arbitrary complexity.

SQL in Scala, where jOOQ could go

I have recently blogged about how simple it is to integrate jOOQ into Scala. See the full blog post here:

https://lukaseder.wordpress.com/2011/12/11/the-ultimate-sql-dsl-jooq-in-scala/

I’m more and more thrilled by that option, as Scala is one of the fastest emerging JVM languages nowadays. The plain integration of a Java library in Scala leaves some open questions. jOOQ knows the “val()” operator or method, to create bind values. See the manual here:

http://www.jooq.org/manual/JOOQ/BindValues/

This operator cannot be used in Scala, as Scala declares “val” as a reserved word. I’ve had similar issues in Java before, when trying to use “case” or “else” in the API, which is not possible either. The path of least resistance is to overload or re-name those methods. “val” was overloaded as “value” in jOOQ 2.0.1. “case” and “else” were re-named a long time ago as “decode” (from Oracle’s DECODE function), and “otherwise” (as in XSL).

So for a full-fledged integration in Scala, jOOQ should be wrapped in a new API called scOOQ ;-). This new API should take Scala’s language features into account in order to make working with jOOQ a lot easier. This could be the chance to re-engineer some of the API and make all API methods uppercase, as is usual with SQL. With Scala’s ability of omitting syntax elements, such as “.” and “()”, the API could then declare one-word methods, such as in this Java example:

MERGE().INTO(MY_TABLE)
       .USING(SOURCE_TABLE)
       .ON(MY_TABLE.ID.equal(SOURCE_TABLE.ID))
       .WHEN().MATCHED().THEN().UPDATE()
         .SET(ID, 1)
         .SET(DATA, "Data")
       .WHEN().NOT().MATCHED().THEN().INSERT(MY_TABLE.ID, MY_TABLE.DATA)
       .VALUES(1, "Data")

While in Java, this looks quite nasty and verbose, in Scala it could be very nice! The below statement should compile in Scala if the API was declared as such:

MERGE INTO MY_TABLE
      USING (SOURCE_TABLE)
      ON (MY_TABLE.ID equal SOURCE_TABLE.ID)
      WHEN MATCHED THEN UPDATE
        SET (ID, 1)
        SET (DATA, "Data")
      WHEN NOT MATCHED THEN INSERT (MY_TABLE.ID, MY_TABLE.DATA)
      VALUES (1, "Data")

Convinced? Contributions very welcome! :-)

Op4j and Lambda-J. For more fluency in Java

I recently blogged about simple constructs, such as Java’s Arrays.asList() and the fact that it is not used often enough:

https://lukaseder.wordpress.com/2011/10/28/javas-arrays-aslist-is-underused/

I like to work with fluent API’s, which are still quite a rare thing in the Java world, compared to other languages that support features such as language extensions, operator overloading, true generics, extension methods, closures, lambda expressions, functional constructs etc etc. But I also like Java’s JVM and the general syntax. And the many libraries that exist. I now came across Op4j, a really nice-looking library:

http://www.op4j.org/

It features exactly the kind of constructs I’d like to use every day. Some examples (taken from the documentation):

// Always static import Op.* as the main entry point
import static org.op4j.Op.*;
import static org.op4j.functions.FnString.*;

// Transform an array to uppercase
String[] values = ...;
List upperStrs =
  on(values).toList().map(toUpperCase()).get();

// Convert strings to integers
String[] values = ...;
List intValueList =
  on(values).toList().forEach().exec(toInteger()).get();

There are many more examples on their documentation page, and the API is huge and looks quite extensible:

http://www.op4j.org/apidocs/op4j/index.html

This library reminds me of Lambda-J, another attempt to bring more fluency to Java by introducing closure/lambda-like expressions in a static way:

http://code.google.com/p/lambdaj/

From a first look, Op4j looks more object oriented and straight-forward, though, whereas Lambda-J seems to depend on instrumentation and some advanced usage of reflection. A sample of some non-trivial Lambda-J usage:

Closure println = closure(); {
  of(System.out).println(var(String.class));
}

The above syntax is not easy to grasp. “closure()” seems to modify some static (ThreadLocal) state of the library, which can be used thereafter by the static method “of()”. “of()” in turn can take any type of parameter assuming its identity and type (!). Somehow, you can then “apply” objects of type String to the defined closure:

println.apply("one");
println.each("one", "two", "three");

SQL DSL’s in other languages

Like jOOQ, there are many other tools out there, that aim to implement SQL as an internal DSL in other languages. This one is particularly nice-looking. It’s called sqlkorma, a SQL DSL for Clojure. A sample SQL statement taken from their documentation:

(select users
  (with address) ;; include other entities based on
                 ;; their relationship
  (fields :first :last :address.state)
      ;; you can alias a field using a vector of [field alias]
  (aggregate (count :*) :cnt :status) 
      ;; You specify alias and optionally a field to group by
      ;; available aggregates:
      ;; sum, first, last, min, max, avg, count
  (where {:first "john"
          :last [like "doe"]}) 
  (order :id :ASC)
  (group :status)
  (limit 3)
  (offset 3))

These nice languages that support closures and/or lambda expressions make for much more intuitive aliasing than jOOQ can do today. I’m really looking forward to Java 8!

See more examples in the documentation:

http://sqlkorma.com/docs#db