Let’s revise the SQL FROM clause

Intuitive SQL

SQL is extremely simple and yet at times, tricky. Most SQL developers have an intuitive (as opposed to formal) understanding of how the language works, for two reasons:

  1. It is designed “intuitively”, like a natural language. Maybe that keeps us from studying it more formally
  2. The formal language specification is not really freely available (except SQL-92), nor easy to read, nor necessary to get started

Of course, there are also those to whom SQL is a mystery. Let’s ignore those and have a look at how a simple SELECT statement works:

SELECT *
FROM my_table
JOIN my_other_table ON id1 = id2
JOIN my_third_table ON id2 = id3
WHERE ...

This looks like we’re using 4 of the SELECT statement’s many clauses:

  • SELECT: specifying the projection
  • FROM: specifying the “main” table
  • JOIN: specifying several “joined” tables
  • WHERE: specifying the selection / filtering

Misinterpretation of the “JOIN clause”

But in fact, our intuition is slightly wrong. “Our intuition…” Of course, you might already know this but I feel it’s something we don’t think about every day… How are we wrong? By the fact that we tend to think there is a formal “JOIN” clause in a SELECT statement. There isn’t. Here’s a simplified T-SQL definition:

<query_specification> ::= 
SELECT [ ALL | DISTINCT ] < select_list > 
    [ FROM { <table_source> } [ ,...n ] ] 
    [ WHERE <search_condition> ] 
    [ <GROUP BY> ] 
    [ HAVING < search_condition > ] 

As you can see, there is no explicit JOIN clause in a SELECT statement, even if we tend to phrase our SQL in a way that would indicate so. Let’s have a closer look at the (simplified) definition of <table_source>

<table_source> ::= 
{
    table_or_view_name [ [ AS ] table_alias ]
    | user_defined_function [ [ AS ] table_alias ]
    | derived_table [ AS ] table_alias [ ( column_alias [ ,...n ] ) ] 
    | <joined_table> 
}
<joined_table> ::= 
{
    <table_source> <join_type> <table_source> ON <search_condition> 
    | <table_source> CROSS JOIN <table_source> 
    | [ ( ] <joined_table> [ ) ] 
}
<join_type> ::= 
    [ { INNER | { { LEFT | RIGHT | FULL } [ OUTER ] } } [ <join_hint> ] ]
    JOIN

So in fact, the true reading of our above SQL statement goes like this:

SELECT *
FROM ((my_table JOIN my_other_table ON id1 = id2)
                JOIN my_third_table ON id2 = id3)
WHERE ...

We’re actually creating a tree of table sources, first by joining “my_other_table” to “my_table” and then by joining “my_third_table” to the previously created table source. When this is understood, the following statement will no longer be surprising:

SELECT *
FROM t1 JOIN t2 ON id1 = id2,
     another_table,
     t3 JOIN t4 ON id3 = id4,
     yet_another_table,
     t5 JOIN t6 ON id5 = id6
WHERE ...

Here, we select from an overall cross product of 5 table sources, some of which are actual tables, some of which are created from joined tables…

Implications for jOOQ

In jOOQ, this is about to be supported for release 2.0.3. The org.jooq.Table API will be enhanced by several join methods. You will be able to write things like this:

// Create a specific type of table source
Table<Record> tableSource =
  MY_TABLE.join(MY_OTHER_TABLE).on(ID1.equal(ID2))
          .join(MY_THIRD_TABLE).on(ID2.equal(ID3));

// Use the above table source in a select statement
create.select()
      .from(tableSource)
      .where(...);

// Use several "complex" table sources
create.select()
      .from(T1.join(T2).on(ID1.equal(ID2)),
            ANOTHER_TABLE,
            T3.join(T4).on(ID3.equal(ID4)),
            YET_ANOTHER_TABLE,
            T5.join(T6).on(ID5.equal(ID6)))
      .where(...);

The existing syntax will be maintained as a fully compatible convenience API. These additions are an important step forward for creating more complex table sources, also involving Oracle (and SQL Server) PIVOT and UNPIVOT tables, about which I will blog in one of the next posts. Stay tuned!

The Java Fluent API Designer Crash Course

Ever since Martin Fowler’s talks about fluent interfaces, people have started chaining methods all over the place, creating fluent API’s (or DSLs) for every possible use case. In principle, almost every type of DSL can be mapped to Java. Let’s have a look at how this can be done

DSL rules

DSLs (Domain Specific Languages) are usually built up from rules that roughly look like these


1. SINGLE-WORD
2. PARAMETERISED-WORD parameter
3. WORD1 [ OPTIONAL-WORD ]
4. WORD2 { WORD-CHOICE-A | WORD-CHOICE-B }
5. WORD3 [ , WORD3 ... ]

Alternatively, you could also declare your grammar like this (as supported by this nice Railroad Diagrams site):


Grammar ::= ( 
  'SINGLE-WORD' | 
  'PARAMETERISED-WORD' '('[A-Z]+')' |
  'WORD1' 'OPTIONAL-WORD'? | 
  'WORD2' ( 'WORD-CHOICE-A' | 'WORD-CHOICE-B' ) | 
  'WORD3'+ 
)

Put in words, you have a start condition or state, from which you can choose some of your languages’ words before reaching an end condition or state. It’s like a state-machine, and can thus be drawn in a picture like this:

Simple Grammar
A simple grammar created with http://www.bottlecaps.de/rr/ui

Java implementation of those rules

With Java interfaces, it is quite simple to model the above DSL. In essence, you have to follow these transformation rules:

  • Every DSL “keyword” becomes a Java method
  • Every DSL “connection” becomes an interface
  • When you have a “mandatory” choice (you can’t skip the next keyword), every keyword of that choice is a method in the current interface. If only one keyword is possible, then there is only one method
  • When you have an “optional” keyword, the current interface extends the next one (with all its keywords / methods)
  • When you have a “repetition” of keywords, the method representing the repeatable keyword returns the interface itself, instead of the next interface
  • Every DSL subdefinition becomes a parameter. This will allow for recursiveness

Note, it is possible to model the above DSL with classes instead of interfaces, as well. But as soon as you want to reuse similar keywords, multiple inheritance of methods may come in very handy and you might just be better off with interfaces.

With these rules set up, you can repeat them at will to create DSLs of arbitrary complexity, like jOOQ. Of course, you’ll have to somehow implement all the interfaces, but that’s another story.

Here’s how the above rules are translated to Java:

// Initial interface, entry point of the DSL
// Depending on your DSL's nature, this can also be a class with static
// methods which can be static imported making your DSL even more fluent
interface Start {
  End singleWord();
  End parameterisedWord(String parameter);
  Intermediate1 word1();
  Intermediate2 word2();
  Intermediate3 word3();
}

// Terminating interface, might also contain methods like execute();
interface End {
  void end();
}

// Intermediate DSL "step" extending the interface that is returned
// by optionalWord(), to make that method "optional"
interface Intermediate1 extends End {
  End optionalWord();
}

// Intermediate DSL "step" providing several choices (similar to Start)
interface Intermediate2 {
  End wordChoiceA();
  End wordChoiceB();
}

// Intermediate interface returning itself on word3(), in order to allow
// for repetitions. Repetitions can be ended any time because this 
// interface extends End
interface Intermediate3 extends End {
  Intermediate3 word3();
}

With the above grammar defined, we can now use this DSL directly in Java. Here are all the possible constructs:

Start start = // ...

start.singleWord().end();
start.parameterisedWord("abc").end();

start.word1().end();
start.word1().optionalWord().end();

start.word2().wordChoiceA().end();
start.word2().wordChoiceB().end();

start.word3().end();
start.word3().word3().end();
start.word3().word3().word3().end();

And the best thing is, your DSL compiles directly in Java! You get a free parser. You can also re-use this DSL in Scala (or Groovy) using the same notation, or a slightly different one in Scala, omitting dots “.” and parentheses “()”:

 val start = // ...

 (start singleWord) end;
 (start parameterisedWord "abc") end;

 (start word1) end;
 ((start word1) optionalWord) end;

 ((start word2) wordChoiceA) end;
 ((start word2) wordChoiceB) end;

 (start word3) end;
 ((start word3) word3) end;
 (((start word3) word3) word3) end;

Real world examples

Some real world examples can be seen all across the jOOQ documentation and code base. Here’s an extract from a previous post of a rather complex SQL query created with jOOQ:

create().select(
    r1.ROUTINE_NAME,
    r1.SPECIFIC_NAME,
    decode()
        .when(exists(create()
            .selectOne()
            .from(PARAMETERS)
            .where(PARAMETERS.SPECIFIC_SCHEMA.equal(r1.SPECIFIC_SCHEMA))
            .and(PARAMETERS.SPECIFIC_NAME.equal(r1.SPECIFIC_NAME))
            .and(upper(PARAMETERS.PARAMETER_MODE).notEqual("IN"))),
                val("void"))
        .otherwise(r1.DATA_TYPE).as("data_type"),
    r1.NUMERIC_PRECISION,
    r1.NUMERIC_SCALE,
    r1.TYPE_UDT_NAME,
    decode().when(
    exists(
        create().selectOne()
            .from(r2)
            .where(r2.ROUTINE_SCHEMA.equal(getSchemaName()))
            .and(r2.ROUTINE_NAME.equal(r1.ROUTINE_NAME))
            .and(r2.SPECIFIC_NAME.notEqual(r1.SPECIFIC_NAME))),
        create().select(count())
            .from(r2)
            .where(r2.ROUTINE_SCHEMA.equal(getSchemaName()))
            .and(r2.ROUTINE_NAME.equal(r1.ROUTINE_NAME))
            .and(r2.SPECIFIC_NAME.lessOrEqual(r1.SPECIFIC_NAME)).asField())
    .as("overload"))
.from(r1)
.where(r1.ROUTINE_SCHEMA.equal(getSchemaName()))
.orderBy(r1.ROUTINE_NAME.asc())
.fetch()

Here’s another example from a library that looks quite appealing to me. It’s called jRTF and it’s used to create RTF documents in Java in a fluent style:

rtf()
  .header(
    color( 0xff, 0, 0 ).at( 0 ),
    color( 0, 0xff, 0 ).at( 1 ),
    color( 0, 0, 0xff ).at( 2 ),
    font( "Calibri" ).at( 0 ) )
  .section(
        p( font( 1, "Second paragraph" ) ),
        p( color( 1, "green" ) )
  )
).out( out );

Summary

Fluent APIs have been a hype for the last 7 years. Martin Fowler has become a heavily-cited man and gets most of the credits, even if fluent APIs were there before. One of Java’s oldest “fluent APIs” can be seen in java.lang.StringBuffer, which allows for appending arbitrary objects to a String. But the biggest benefit of a fluent API is its ability to easily map “external DSLs” into Java and implement them as “internal DSLs” of arbitrary complexity.