Still Using Windows 3.1? So why stick to SQL-92?

We’ve been blogging a lot about the merits of modern SQL on the jOOQ blog. Specifically, window functions are one of the most fascinating features. But there are many many others.

Markus Winand, author of the popular book SQL Performance Explained has recently given a very well-researched talk about modern SQL. We particularly like his headline:

Should this wonderful presentation have convinced you to buy a copy of SQL Performance Explained (our review here), do not forget to enter the “jOOQ” promo code for an exclusive 10% discount!

Let’s revise the SQL FROM clause

Intuitive SQL

SQL is extremely simple and yet at times, tricky. Most SQL developers have an intuitive (as opposed to formal) understanding of how the language works, for two reasons:

  1. It is designed “intuitively”, like a natural language. Maybe that keeps us from studying it more formally
  2. The formal language specification is not really freely available (except SQL-92), nor easy to read, nor necessary to get started

Of course, there are also those to whom SQL is a mystery. Let’s ignore those and have a look at how a simple SELECT statement works:

SELECT *
FROM my_table
JOIN my_other_table ON id1 = id2
JOIN my_third_table ON id2 = id3
WHERE ...

This looks like we’re using 4 of the SELECT statement’s many clauses:

  • SELECT: specifying the projection
  • FROM: specifying the “main” table
  • JOIN: specifying several “joined” tables
  • WHERE: specifying the selection / filtering

Misinterpretation of the “JOIN clause”

But in fact, our intuition is slightly wrong. “Our intuition…” Of course, you might already know this but I feel it’s something we don’t think about every day… How are we wrong? By the fact that we tend to think there is a formal “JOIN” clause in a SELECT statement. There isn’t. Here’s a simplified T-SQL definition:

<query_specification> ::= 
SELECT [ ALL | DISTINCT ] < select_list > 
    [ FROM { <table_source> } [ ,...n ] ] 
    [ WHERE <search_condition> ] 
    [ <GROUP BY> ] 
    [ HAVING < search_condition > ] 

As you can see, there is no explicit JOIN clause in a SELECT statement, even if we tend to phrase our SQL in a way that would indicate so. Let’s have a closer look at the (simplified) definition of <table_source>

<table_source> ::= 
{
    table_or_view_name [ [ AS ] table_alias ]
    | user_defined_function [ [ AS ] table_alias ]
    | derived_table [ AS ] table_alias [ ( column_alias [ ,...n ] ) ] 
    | <joined_table> 
}
<joined_table> ::= 
{
    <table_source> <join_type> <table_source> ON <search_condition> 
    | <table_source> CROSS JOIN <table_source> 
    | [ ( ] <joined_table> [ ) ] 
}
<join_type> ::= 
    [ { INNER | { { LEFT | RIGHT | FULL } [ OUTER ] } } [ <join_hint> ] ]
    JOIN

So in fact, the true reading of our above SQL statement goes like this:

SELECT *
FROM ((my_table JOIN my_other_table ON id1 = id2)
                JOIN my_third_table ON id2 = id3)
WHERE ...

We’re actually creating a tree of table sources, first by joining “my_other_table” to “my_table” and then by joining “my_third_table” to the previously created table source. When this is understood, the following statement will no longer be surprising:

SELECT *
FROM t1 JOIN t2 ON id1 = id2,
     another_table,
     t3 JOIN t4 ON id3 = id4,
     yet_another_table,
     t5 JOIN t6 ON id5 = id6
WHERE ...

Here, we select from an overall cross product of 5 table sources, some of which are actual tables, some of which are created from joined tables…

Implications for jOOQ

In jOOQ, this is about to be supported for release 2.0.3. The org.jooq.Table API will be enhanced by several join methods. You will be able to write things like this:

// Create a specific type of table source
Table<Record> tableSource =
  MY_TABLE.join(MY_OTHER_TABLE).on(ID1.equal(ID2))
          .join(MY_THIRD_TABLE).on(ID2.equal(ID3));

// Use the above table source in a select statement
create.select()
      .from(tableSource)
      .where(...);

// Use several "complex" table sources
create.select()
      .from(T1.join(T2).on(ID1.equal(ID2)),
            ANOTHER_TABLE,
            T3.join(T4).on(ID3.equal(ID4)),
            YET_ANOTHER_TABLE,
            T5.join(T6).on(ID5.equal(ID6)))
      .where(...);

The existing syntax will be maintained as a fully compatible convenience API. These additions are an important step forward for creating more complex table sources, also involving Oracle (and SQL Server) PIVOT and UNPIVOT tables, about which I will blog in one of the next posts. Stay tuned!