A Functional Programming Approach to Dynamic SQL with jOOQ


Typesafe embedded DSLs like jOOQ are extremely powerful for dynamic SQL, because the query you’re constructing with the jOOQ DSL is a dynamic query by nature. You’re constructing a query expression tree using a convenient API (the “DSL”), even if you think your SQL statement is static. For instance:

for (Record rec : ctx.select(ACTOR.FIRST_NAME, ACTOR.LAST_NAME)
                     .from(ACTOR)
                     .where(ACTOR.FIRST_NAME.like("A%")))

    System.out.println(rec.get(ACTOR.FIRST_NAME) 
               + " " + rec.get(ACTOR.LAST_NAME));

The above query looks like a static SQL statement, the way you would write it in PL/SQL, for instance:

FOR rec IN (
  SELECT first_name, last_name
  FROM actor
  WHERE first_name LIKE 'A%'
) LOOP
  dbms_output.put_line(rec.first_name
             || ' ' || rec.last_name);
END LOOP;

The PL/SQL implicit cursor loop runs over the records produced by a pre-compiled SQL statement. That’s not the case with the jOOQ statement: every time this code runs, the Java runtime re-creates the statement’s expression tree afresh by dynamically creating an org.jooq.Select object, step by step (more about how the DSL works here).

Using jOOQ for actual dynamic SQL

As we’ve seen before, all jOOQ statements are dynamic statements, even if they “feel” static. Sometimes, you actually want a dynamic SQL query, e.g. when the user is allowed to specify custom predicates. In this case, you could do something like this:

// By default, make the dynamic predicate "TRUE"
Condition condition = DSL.trueCondition();

// If the user entered something in the text search field...
if (hasFirstNameSearch())
    condition = condition.and(ACTOR.FIRST_NAME.like(firstNameSearch()));

// If the user entered something in another text search field...
if (hasLastNameSearch())
    condition = condition.and(ACTOR.LAST_NAME.like(lastNameSearch()));

// The query now uses a dynamically created predicate
for (Record rec : ctx.select(ACTOR.FIRST_NAME, ACTOR.LAST_NAME)
                     .from(ACTOR)
                     .where(condition))

    System.out.println(rec.get(ACTOR.FIRST_NAME) 
               + " " + rec.get(ACTOR.LAST_NAME));

The above is not easily possible with PL/SQL. You’d have to resort to the dynamic SQL API called DBMS_SQL, which is about as verbose (and error-prone) as JDBC, because you’re concatenating SQL strings.
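The accumulation pattern from the jOOQ example can be tried out without a database or jOOQ on the classpath. The following is only a minimal sketch: Cond is a hypothetical stand-in that renders SQL text and is not jOOQ’s org.jooq.Condition, and the optional search fields are modelled as java.util.Optional arguments, which is our assumption.

```java
import java.util.Optional;

// Hypothetical stand-in for a predicate type; NOT jOOQ's org.jooq.Condition
final class Cond {
    final String sql;

    private Cond(String sql) { this.sql = sql; }

    // Neutral element for AND, playing the role of DSL.trueCondition()
    static Cond alwaysTrue() { return new Cond("1 = 1"); }

    // Illustration only: a real API would use bind variables instead of inlining
    static Cond like(String column, String pattern) {
        return new Cond(column + " LIKE '" + pattern.replace("'", "''") + "'");
    }

    Cond and(Cond other) { return new Cond("(" + sql + " AND " + other.sql + ")"); }
}

final class DynamicPredicateSketch {

    // Each search field is optional; an empty Optional adds no predicate
    static String actorQuery(Optional<String> firstName, Optional<String> lastName) {
        Cond condition = Cond.alwaysTrue();

        if (firstName.isPresent())
            condition = condition.and(Cond.like("first_name", firstName.get()));

        if (lastName.isPresent())
            condition = condition.and(Cond.like("last_name", lastName.get()));

        return "SELECT first_name, last_name FROM actor WHERE " + condition.sql;
    }

    public static void main(String[] args) {
        System.out.println(actorQuery(Optional.of("A%"), Optional.empty()));
    }
}
```

Starting from a predicate that is always true means every later step can simply call and(), without checking whether it is the first predicate.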

Adding functional programming to the mix

If you’re able to construct the entire query in a local scope, e.g. inside of a method, the above imperative style is quite sufficient. But sometimes, you may have something like a “base” query that you want to re-use all the time, and only sometimes, you want to add a custom predicate, or JOIN operation, etc.

In this case, using a more functional approach is optimal. For instance, you could offer a convenience API that produces a query fetching actor first and last names, with custom predicates:

// Higher order, SQL query producing function:
public static ResultQuery<Record2<String, String>> actors(
    Function<Actor, Condition> where
) {
    return ctx.select(ACTOR.FIRST_NAME, ACTOR.LAST_NAME)
              .from(ACTOR)
              .where(where.apply(ACTOR));
}

The above utility method doesn’t actually execute the query. It just constructs it, taking a function as an argument. In the old days, this used to be called the “strategy pattern”, which is much easier to implement with a function than with an object-oriented approach (see also Mario Fusco’s interesting blog series about the Gang of Four design patterns).

How to call the above utility? Easy!

// Get only actors whose first name starts with "A"
for (Record rec : actors(a -> a.FIRST_NAME.like("A%")))
    System.out.println(rec);

Now, this is not versatile enough yet, as we can pass only one function. How about this, instead:

@SafeVarargs
public static ResultQuery<Record2<String, String>> actors(
    Function<Actor, Condition>... where
) {
    return ctx.select(ACTOR.FIRST_NAME, ACTOR.LAST_NAME)
              .from(ACTOR)
              .where(Arrays.stream(where)
                           .map(f -> f.apply(ACTOR))
                           .collect(Collectors.toList()));
}

(notice how we can immediately execute and iterate over the ResultQuery, as it implements Iterable)

We can now call this with any number of input functions to form dynamic predicates. E.g.:

// Get all actors
for (Record rec : actors())
    System.out.println(rec);

// Get only actors whose first name starts with "A"
for (Record rec : actors(a -> a.FIRST_NAME.like("A%")))
    System.out.println(rec);

// Get actors whose first/last name matches "A% B%"
for (Record rec : actors(
        a -> a.FIRST_NAME.like("A%"),
        a -> a.LAST_NAME.like("B%")))
    System.out.println(rec);

You get the idea.
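For readers who want to experiment with the varargs approach without a jOOQ setup, here is a minimal, self-contained sketch. ActorTable is a hypothetical stand-in for a generated table class, predicates are plain strings rather than org.jooq.Condition objects, and values are inlined for illustration only.

```java
import java.util.Arrays;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical stand-in for a generated table class; not jOOQ generated code
final class ActorTable {
    final String FIRST_NAME = "first_name";
    final String LAST_NAME = "last_name";

    // Renders "<column> LIKE '<pattern>'"; inlined for illustration only
    String like(String column, String pattern) {
        return column + " LIKE '" + pattern + "'";
    }
}

final class HigherOrderQuerySketch {
    static final ActorTable ACTOR = new ActorTable();

    // Accepts any number of predicate-producing functions and ANDs their results
    @SafeVarargs
    static String actors(Function<ActorTable, String>... where) {
        String base = "SELECT first_name, last_name FROM actor";

        // No functions passed: no WHERE clause at all
        if (where.length == 0)
            return base;

        return base + Arrays.stream(where)
            .map(f -> f.apply(ACTOR))
            .collect(Collectors.joining(" AND ", " WHERE ", ""));
    }

    public static void main(String[] args) {
        System.out.println(actors());
        System.out.println(actors(a -> a.like(a.FIRST_NAME, "A%")));
        System.out.println(actors(
            a -> a.like(a.FIRST_NAME, "A%"),
            a -> a.like(a.LAST_NAME, "B%")));
    }
}
```

The caller never sees how the base query is assembled. It only supplies functions from the table to a predicate, which is what makes the base query reusable.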

Conclusion

… the idea is that jOOQ is an extremely powerful SQL expression tree API, which allows you to dynamically construct SQL queries of arbitrary complexity. If you’re running a static query, this just means that all of your SQL expressions are constant every time you execute the query.

There are no limits to how far you can push this. We’ve seen jOOQ users write queries that dynamically assemble dozens of common table expressions with several levels of dynamically nested derived tables, too. If you have a crazy example to share, we’re looking forward to it!

A SQL query DSL for Scala by ScalikeJDBC


There is a tremendous number of SQL APIs natively written in Scala. Manuel Bernhardt has summarised a nice collection in a post. Another collection of Scala SQL APIs can be seen in this Stack Overflow question.

One API that we want to focus on in particular is ScalikeJDBC (licensed ASL 2.0), which has recently published a SQL query DSL API similar to that of jOOQ. See the full documentation here:

http://scalikejdbc.org/documentation/query-dsl.html

A couple of examples:

val orders: List[Order] = withSQL {
  select
    .from(Order as o)
    .innerJoin(Product as p).on(o.productId, p.id)
    .leftJoin(Account as a).on(o.accountId, a.id)
    .where.eq(o.productId, 123)
    .orderBy(o.id).desc
    .limit(4)
    .offset(0)
  }.map(Order(o, p, a)).list.apply()

The above example looks very similar to jOOQ code, except that the SELECT DSL seems to be a bit more rigid than jOOQ’s. For instance, it is not immediately obvious how to connect several complex predicates in that WHERE clause, or if complex predicates are available at all.

What’s really nice, however, is their way of leveraging Scala language features to provide a very fluent way of constructing dynamic SQL, as can be seen in this example:

def findOrder(id: Long, accountRequired: Boolean) = 
withSQL {
  select
    .from[Order](Order as o)
    .innerJoin(Product as p).on(o.productId, p.id)
    .map { sql =>
      if (accountRequired) 
        sql.leftJoin(Account as a)
           .on(o.accountId, a.id)
      else 
        sql
    }.where.eq(o.id, id)
  }.map { rs =>
    if (accountRequired) 
      Order(o, p, a)(rs) 
    else 
      Order(o, p)(rs)
  }.single.apply()

As we understand it, the map method that is invoked in the middle of the SQL statement (between innerJoin and where) can transform the intermediate DSL state using a lambda expression that allows for appending a leftJoin if needed. Obviously, this can be done in a more procedural fashion as well, by assigning that intermediate DSL state to a local variable.
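The two styles can be compared side by side in Java. This is only a sketch under our own assumptions: QueryStep is a hypothetical immutable builder that concatenates SQL text, and neither its map method nor the table names are taken from ScalikeJDBC or jOOQ.

```java
import java.util.function.UnaryOperator;

// Hypothetical immutable query builder; a stand-in, not ScalikeJDBC's or jOOQ's API
final class QueryStep {
    final String sql;

    QueryStep(String sql) { this.sql = sql; }

    QueryStep append(String fragment) { return new QueryStep(sql + " " + fragment); }

    // Lets callers transform the intermediate state inside a fluent chain
    QueryStep map(UnaryOperator<QueryStep> f) { return f.apply(this); }
}

final class ConditionalJoinSketch {

    // Functional style: the optional join is a lambda in the middle of the chain
    static String fluent(boolean accountRequired) {
        return new QueryStep("SELECT * FROM orders o")
            .append("JOIN product p ON o.product_id = p.id")
            .map(q -> accountRequired
                ? q.append("LEFT JOIN account a ON o.account_id = a.id")
                : q)
            .append("WHERE o.id = ?")
            .sql;
    }

    // Procedural style: the intermediate state is held in a local variable
    static String procedural(boolean accountRequired) {
        QueryStep q = new QueryStep("SELECT * FROM orders o")
            .append("JOIN product p ON o.product_id = p.id");

        if (accountRequired)
            q = q.append("LEFT JOIN account a ON o.account_id = a.id");

        return q.append("WHERE o.id = ?").sql;
    }

    public static void main(String[] args) {
        System.out.println(fluent(true));
        System.out.println(procedural(false));
    }
}
```

Both methods produce the same SQL text for the same flag. The difference is purely whether the chain is interrupted.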

The need for SQL query DSLs

We’ve blogged about many of these similar SQL query DSLs in the past. The fact that they constantly pop up in various APIs is no coincidence. SQL is a very typesafe and composable language that is hard to use dynamically through string-based APIs such as JDBC, ODBC, etc.

Having a typesafe internal domain-specific language model SQL in a host language like Java or Scala brings great advantages. But the disadvantages may shine through quickly, when the DSL is not carefully crafted in a completely foreseeable way. Take the following ScalikeJDBC QueryDSL example, for instance:

val ids = withSQL {
  select(o.result.id).from(Order as o)
    .where(sqls.toAndConditionOpt(
      productId.map(id => sqls.eq(o.productId, id)),
      accountId.map(id => sqls.eq(o.accountId, id))
    ))
    .orderBy(o.id)
}.map(_.int(1)).list.apply()

This toAndConditionOpt method is really unexpected and doesn’t follow the principle of least astonishment.

This is why jOOQ’s API design is based on a formal BNF that closely mimics SQL itself. Read more about that here.

Advanced Java Trickery for Typesafe Query DSLs


When browsing Hacker News, I recently stumbled upon Benji Weber’s most interesting attempt at creating typesafe database interaction with Java 8. Benji created a typesafe query DSL somewhat similar to jOOQ with the important difference that it uses Java 8 method references to introspect POJOs and deduce query elements from them. This is best explained by example:

Optional<Person> person = 
    from(Person.class)
        .where(Person::getFirstName)
        .like("%ji")
        .and(Person::getLastName)
        .equalTo("weber")
        .select(
            personMapper, 
            connectionFactory::openConnection);

The above query can then be transformed into the following SQL statement:

SELECT * FROM person 
WHERE first_name LIKE ? AND last_name = ?
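How can a method reference such as Person::getFirstName end up as a column name? We haven’t verified how Benji’s library does it, so the following is only one possible sketch and not his implementation. It assumes that Person is an interface (so that a JDK dynamic proxy can implement it) and that getters map to snake_case column names. Both the columnOf and toColumnName helpers are hypothetical.

```java
import java.lang.reflect.Proxy;
import java.util.Locale;
import java.util.function.Function;

// Hypothetical entity type; an interface so that a JDK dynamic proxy can implement it
interface Person {
    String getFirstName();
    String getLastName();
}

final class MethodRefColumns {

    // Invokes the getter on a recording proxy and derives a column name from it
    static <T> String columnOf(Class<T> type, Function<T, ?> getter) {
        String[] invoked = new String[1];

        Object proxy = Proxy.newProxyInstance(
            type.getClassLoader(),
            new Class<?>[] { type },
            (p, method, methodArgs) -> {
                // Record which method the method reference points to
                invoked[0] = method.getName();

                // Fine for reference return types; primitives would need defaults
                return null;
            });

        getter.apply(type.cast(proxy));
        return toColumnName(invoked[0]);
    }

    // getFirstName -> first_name (an assumed naming convention)
    static String toColumnName(String getterName) {
        String property = getterName.startsWith("get")
            ? getterName.substring(3)
            : getterName;

        return property.replaceAll("([a-z0-9])([A-Z])", "$1_$2")
                       .toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String sql = "SELECT * FROM person WHERE "
            + columnOf(Person.class, Person::getFirstName) + " LIKE ? AND "
            + columnOf(Person.class, Person::getLastName) + " = ?";

        System.out.println(sql);
    }
}
```

The method reference itself carries no name at runtime. The name is only discovered by calling it on an object that records the call, which is why some form of proxying tends to remain necessary for classes as well.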

This is indeed a very interesting approach, and we’ve seen similar ideas around, before. Most prominently, such ideas were implemented in:

  • JaQu, another very interesting competitor product of jOOQ, created by Thomas Müller, the maintainer of the popular H2 database
  • LambdaJ, an attempt to bring lambda expressions to Java long before Java 8
  • OhmDB, a new NoSQL data store with a fluent query DSL

What’s new in Benji’s approach is really the fact that Java 8 method references can be used instead of resorting to CGLIB and other sorts of bytecode trickery through instrumentation. An example of such trickery is JaQu’s experimental bytecode introspection to transform complex Java boolean expressions into SQL – called “natural syntax”:

Timestamp ts = 
  Timestamp.valueOf("2005-05-05 05:05:05");
Time t = Time.valueOf("23:23:23");

long count = db.from(co).
    where(new Filter() { public boolean where() {
        return co.id == x
            && co.name.equals(name)
            && co.value == new BigDecimal("1")
            && co.amount == 1L
            && co.birthday.before(new Date())
            && co.created.before(ts)
            && co.time.before(t);
        } }).selectCount();

While these ideas are certainly very interesting to play around with, we doubt such language and bytecode transformations will lead to robust results. People have criticised Hibernate’s use of proxying in various blog posts.

We prefer a WYSIWYG approach where API consumers remain in full control of what is going on. What are your thoughts about such clever ideas?

Why Did SQLJ Die?


Every now and then, SQLJ pops up somewhere, mostly in a very dusty/enterprisey or in an academic context.

If you give SQLJ some thought, though, it isn’t such a bad idea. It is:

  • An ANSI and ISO standard
  • Part of the SQL standard
  • Quite easy to understand
  • Quite a powerful extension to JDBC

So why did it die (or rather, why did it never really take off)? This question was asked on Stack Overflow, and we gave an answer.

Let’s assume that you have already decided to embed your SQL (as opposed to externalising it through a templating mechanism, hiding it with an ORM, or with stored procedures). Here are a couple of reasons why SQLJ is not an optimal solution for embedding SQL:

IDE support

While Pro*C worked well for C and C++ in the 90s, Java really took off in the early 2000s. With Java came an increasing number of powerful IDEs such as Eclipse, NetBeans, JBuilder, and others. Java preprocessors and IDEs have never become friends, though, as parsing one language is hard enough. Parsing (and providing tooling for) two languages is much harder.

In fact, SQLJ made the surrounding Java code type-unsafe as IDEs and compilers couldn’t process .sqlj files before they had been pre-processed.

SQL popularity

There was a time when people started thinking that SQL itself was going to be dead. First, they did so with the advent of ORMs, then they did so with the advent of NoSQL. People thought that the DBA is dead. Again.

Well, this has been proven to be wrong a couple of times, but certainly not because of SQLJ.

Typesafety

By the late 2000s, typesafe alternatives to SQLJ had emerged, such as jOOQ in Java, or LINQ-to-SQL in .NET, which leverage IDE features such as syntax autocompletion. By being internal domain-specific languages / query DSLs, these APIs not only bring typesafety to embedded SQL, but they also allow for dynamic SQL building, which SQLJ doesn’t support.

Predictions

While embedding SQL into other languages is a useful thing, SQLJ never solved this problem adequately. Hence, R.I.P., SQLJ.

Ecto, a Slick Query DSL for the Elixir Language


Elixir? What on earth is Elixir? It is a programming language that somewhat reminds me of Ruby. Here is some example Elixir code:

defmodule Hello do
  IO.puts "Defining the function world"

  def world do
    IO.puts "Hello World"
  end

  IO.puts "Function world defined"
end

And it has a Stack Overflow community with around 50 tagged questions as of December 2013. And it has a query DSL library called Ecto, which describes itself as such:

Ecto is a domain specific language for writing queries and interacting with databases in Elixir.

Example queries can be seen on the project’s GitHub readme page:

A simple SELECT

use Ecto.Query

query = from w in Weather,
      where: w.prcp > 0 or w.prcp == nil,
     select: w

Repo.all(query)

Using JOINs

query = from p in Post,
      where: p.id == 42,
  left_join: c in p.comments,
     select: assoc(p, c)

[post] = Repo.all(query)

post.comments.to_list

Looks like a slick (but not typesafe?) querying DSL mix between LINQ and Clojure’s sqlkorma.

For those brave ones among you using the Elixir language, this might be your one (and only) choice to access a SQL database! Good luck!

Simplernate. A Query DSL for Hibernate, Inspired by jOOQ


“Simplernate”! The name is very compelling. And so is the idea – we think. Specifically, because we get most credit for this new project in the following short readme:
https://gist.github.com/thermz/eb1b12b2146168a08e68

It reads:

Simplernate is (will be) an Hibernate wrapper that help developer to query the database using Hibernate ORM. To do this, Simplernate offers a syntax inspired by jOOQ philosofy [sic].

This obviously reminds us of Torpedoquery, another typesafe query DSL that wraps Hibernate and Hibernate’s HQL. Unfortunately, there isn’t that much to see yet, but we will certainly follow up on future progress of a project with such a promising name!

A Typesafety Comparison of SQL Access APIs


SQL is a very expressive and distinct language. It is one of the few declarative languages which are used by a broad audience in everyday work. As a declarative language, SQL allows us to specify what we’re expecting as output, not how this output should be produced. As a side-effect of this, ad-hoc record data types are created by every statement. An example:

-- A (id: integer, title: varchar) type is created
SELECT id, title
FROM book;

The above statement generates a cursor whose records have a well-defined record type with these properties:

  • The degree of the record is 2
  • The column names are id and title
  • The column types are integer and varchar
  • The column id can be accessed at index 1. The column title can be accessed at index 2

In other words, SQL records combine features from records (access by name) and tuples (access by index). They can be seen like typesafe associative “map-arrays”, where map keys are formally bound to array indexes and their associated key/index type.
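The “map-array” idea can be sketched in a few lines of Java. SqlRecordSketch is a hypothetical class of our own, not part of JDBC or jOOQ, and it deliberately ignores static column types to focus on the binding between names and indexes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a SQL-style record; not part of JDBC or jOOQ
final class SqlRecordSketch {
    private final List<String> names;
    private final Object[] values;

    SqlRecordSketch(List<String> names, Object... values) {
        if (names.size() != values.length)
            throw new IllegalArgumentException("degree mismatch");

        this.names = names;
        this.values = values;
    }

    // The degree is the number of columns
    int degree() { return values.length; }

    // Tuple-style access by index, 1-based as in SQL and JDBC
    Object get(int index) { return values[index - 1]; }

    // Record-style access by name; every name is bound to exactly one index
    Object get(String name) {
        int i = names.indexOf(name);

        if (i < 0)
            throw new IllegalArgumentException("no such column: " + name);

        return values[i];
    }

    public static void main(String[] args) {
        SqlRecordSketch book =
            new SqlRecordSketch(Arrays.asList("id", "title"), 1, "1984");

        System.out.println(book.degree());
        System.out.println(book.get("title") + " / " + book.get(2));
    }
}
```

Because a name always resolves to an index, get("title") and get(2) return the same value for this record.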

Another, more complex example shows how these ad-hoc record types can be reused within a SQL statement!

-- A (c: bigint) type is created
SELECT count(*) c
FROM book

-- A (..: integer, ..: integer) type is created and compared with...
WHERE (author_id, language_id) IN (

  -- ... another, compatible (..: integer, ..: integer) type
  SELECT a.id, a.language_id
  FROM author a
)

This query counts books written by authors in their native language.

In the above example, the projected record type is a bit simpler. It contains only one column. The interesting part is the row value expression IN comparison predicate, which compares two compatible (integer, integer) types. In SQL, you can typesafely create ad-hoc record types and immediately compare them with other ad-hoc record types. In these comparisons, column names are not important, but column indexes (and associated column types) are.

Comparing various SQL access APIs

The previous examples show how SQL allows for the formal declaration of record types including record degree, column names, column indexes, column types. While SQL is very expressive in that matter, many client languages accessing SQL are less expressive. When comparing expressiveness and typesafety, two features should be taken into consideration:

  1. Are the records produced into the client language typesafe?
  2. Are the SQL statements produced from the client language typesafe and syntax-safe?

Let’s have a look at various accessing techniques, and how expressive they are in terms of the above typesafety requirements:

JDBC: Least typesafety

JDBC offers the least expressiveness and typesafety. This isn’t surprising, as JDBC is a very low-level API. It offers:

  1. No typesafety whatsoever when accessing result records.
  2. No typesafety or syntax-safety whatsoever when producing SQL statements.

Here is an example:

PreparedStatement stmt = null;
ResultSet rs = null;

try {

  // SQL statements are just strings. Constructing them is not
  // typesafe or syntax-safe
  stmt = connection.prepareStatement(
    "SELECT id, title FROM book WHERE id = ?");

  // Bind values are set by index. There is no typesafety or
  // "index safety"
  stmt.setInt(1, 15);

  rs = stmt.executeQuery();
  while (rs.next()) {

    // There is no typesafety or "index safety" when accessing
    // result record values
    System.out.println(
      "ID: " + rs.getInt(1) + ", TITLE: " + rs.getString(2));
  }
}
finally {
  closeSafely(stmt, rs);
}

Now, this wasn’t surprising. JDBC makes up for the lack of typesafety by being absolutely general. It is possible to implement a JDBC driver for any type of relational database, no matter what kinds of SQL and JDBC features they really support.

JPA: Some typesafety

JPA has implemented quite a bit of typesafety mostly on top of JPQL, but also slightly on top of SQL. With JPA, you can have:

  1. Some typesafety when accessing records.
  2. Some typesafety and syntax-safety when producing JPQL statements through the CriteriaQuery API (not SQL statements).

Record access typesafety can be guaranteed when you project the outcome of your statements onto your JPA-annotated entities. While the mapping itself isn’t really typesafe, the outcome is, as a Java class is the closest match to a SQL record. A Java class, much like a SQL record, has:

  • A degree, expressed in the number of properties
  • Column names, expressed as property names
  • Column types, expressed as property types
  • But: No column indexes. Properties have no explicit order

JPA record mapping has additional features that exceed the expressiveness of SQL, as “flat”, tabular result sets can be mapped onto object hierarchies. In any case, you will have to create one record / entity type per query to profit from this typesafety. If you’re not projecting all columns from every table, but ad-hoc records (including values derived from functions), you will lose this typesafety again.

When it comes to statement typesafety, JPA offers the CriteriaQuery API to produce typesafe JPQL statements. The CriteriaQuery API is often criticised for its verboseness and for the fact that resulting client code is hard to read. Here is an example taken from the CriteriaQuery API docs:

CriteriaQuery<String> q = cb.createQuery(String.class);
Root<Order> order = q.from(Order.class);
q.select(order.get("shippingAddress").<String>get("state"));
 
CriteriaQuery<Product> q2 = cb.createQuery(Product.class);
q2.select(q2.from(Order.class)
            .join("items")
            .<Item,Product>join("product"));

It can be seen that there is only a limited amount of typesafety in the above query construction:

  • Columns are accessed by string literals, such as "shippingAddress".
  • Generic entity types are not really checked. The <Item,Product> generic parameters might as well be wrong.

Of course, there are more typesafe API parts in JPA’s CriteriaQuery API. Using those API parts quickly leads to the aforementioned verbosity, though, as can be seen in this Stack Overflow question, or in the Java EE 6 Tutorials.

LINQ: Much typesafety (in .NET)

LINQ goes very far in offering typesafety in both dimensions:

  1. Much typesafety when accessing records or tuples.
  2. Much typesafety when producing LINQ-to-SQL statements (not SQL statements).

As LINQ is formally integrated into various .NET languages, it has the advantage of being able to produce formally defined record types, directly into the target language (e.g. C#). Not only can typesafe records be produced, the LINQ-to-SQL statement is formally verified by the compiler as well. An example:

// Typesafe renaming (aliasing with "AS" in SQL)
From p In db.Products
// Typesafe (named!) variable binding
Where p.UnitsInStock <= ReorderLevel AndAlso Not p.Discontinued
// The typesafe projection will produce a Products record
Select p

Another example from Stack Overflow can be seen here:

// Producing a C# tuple
var r = from u in db.Users
        join s in db.Staffs on u.Id equals s.UserId
        select new Tuple<User, Staff>(u, s);

// Producing an anonymous record type
var r = from u in db.Users
    select new { u.Name, 
                 u.Address,
                 ...,
                 (from s in db.Staffs 
                  where u.Id == s.UserId
                  select s.Password) 
               };

LINQ has many obvious advantages when it comes to typesafety. In the case of LINQ, this comes at the price of losing actual SQL expressivity and syntax, as LINQ-to-SQL is not really SQL (just as JPQL is not really SQL either). The SQL querying API is partially shared with other, heterogeneous querying targets, such as LINQ-to-Entities, LINQ-to-Collections, LINQ-to-XML. This will reduce LINQ’s feature scope (see also a previous blog post, and I will soon blog about this again).

But C# offers all typesafety aspects that a SQL record offers as well: degree, column name (anonymous types), column index (tuples), column types (both anonymous types and tuples).

SLICK: Much typesafety (in Scala)

SLICK has been inspired by LINQ, and can thus offer a lot of typesafety as well. It offers:

  1. Much typesafety when accessing tuples (not records).
  2. Much typesafety when producing SLICK statements (not SQL statements).

SLICK takes advantage of Scala’s integrated tuple expressions. This is best shown by example:

// "for" is the "entry-point" to the DSL
val q = for {

    // FROM clause   WHERE clause
    c <- Coffees     if c.supID === 101

// SELECT clause and projection to a tuple
} yield (c.name, c.price)

The above example shows that the projection onto a (String, Int) tuple is done typesafely by the yield clause. At the same time, the whole query expression is formally validated by the compiler, as SLICK makes heavy use of Scala’s language features in order to introduce an internal DSL for querying. Much more than LINQ, SLICK has a unique syntax that no longer resembles SQL. It is not obvious how subqueries, complex joins, grouping and aggregation can be expressed.

jOOQ: Much typesafety

jOOQ is mainly inspired by SQL itself and embraces all the features that SQL offers. It has thus:

  1. Much typesafety when accessing records or tuples.
  2. Much typesafety when producing SQL statements.

jOOQ offers capabilities similar to JPA’s when it comes to mapping SQL result sets onto records, although JPA’s mapping type hierarchies are not supported by jOOQ. But jOOQ also allows for typesafe tuple access, the way SLICK has implemented it. Ad-hoc records produced by arbitrary query projections will maintain their various column types through generic Record1<T1>, Record2<T1, T2>, Record3<T1, T2, T3>, … record types. Unlike in Java, this can be leveraged extensively in Scala, where these typesafe Record[N] types can be used just like Scala’s tuples.

On the other hand, just like LINQ-to-SQL, which has formally integrated querying as a first-class citizen into .NET languages, jOOQ allows for heavy type-checking and syntax-checking, when writing SQL statements in Java.

In SQL, you can typesafely write things like:

SELECT * FROM t WHERE (t.a, t.b) = (1, 2)
SELECT * FROM t WHERE (t.a, t.b) OVERLAPS (date1, date2)
SELECT * FROM t WHERE (t.a, t.b) IN (SELECT x, y FROM t2)
UPDATE t SET (a, b) = (SELECT x, y FROM t2 WHERE ...)
INSERT INTO t (a, b) VALUES (1, 2)

In jOOQ 3.0, you can (also typesafely!) write

select().from(t).where(row(t.a, t.b).eq(1, 2));
// Type-check here: ----------------->  ^^^^
 
select().from(t).where(row(t.a, t.b).overlaps(date1, date2));
// Type-check here: ------------------------> ^^^^^^^^^^^^
 
select().from(t).where(row(t.a, t.b).in(select(t2.x, t2.y).from(t2)));
// Type-check here: -------------------------> ^^^^^^^^^^
 
update(t).set(row(t.a, t.b), select(t2.x, t2.y).from(t2).where(...));
// Type-check here: --------------> ^^^^^^^^^^

insertInto(t, t.a, t.b).values(1, 2);
// Type-check here: ---------> ^^^^

This also applies for existing API, which doesn’t involve row value expressions:

select().from(t).where(t.a.eq(select(t2.x).from(t2)));
// Type-check here: ---------------> ^^^^

select().from(t).where(t.a.eq(any(select(t2.x).from(t2))));
// Type-check here: -------------------> ^^^^

select().from(t).where(t.a.in(select(t2.x).from(t2)));
// Type-check here: ---------------> ^^^^

select(t1.a, t1.b).from(t1).union(select(t2.a, t2.b).from(t2));
// Type-check here: -------------------> ^^^^^^^^^^
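How can a plain Java API make the compiler reject a row value comparison of the wrong degree or with the wrong types? The general technique is to carry the column types in generic type parameters. The following is a minimal sketch of that technique only: Col, Row2 and the row helper are simplified, hypothetical stand-ins and not jOOQ’s actual classes, and values are inlined as text for illustration.

```java
// Hypothetical typed column: the type parameter carries the SQL column type
final class Col<T> {
    final String name;

    Col(String name) { this.name = name; }
}

// Hypothetical row value expression of degree 2
final class Row2<T1, T2> {
    final Col<T1> c1;
    final Col<T2> c2;

    Row2(Col<T1> c1, Col<T2> c2) {
        this.c1 = c1;
        this.c2 = c2;
    }

    // Only accepts values matching the degree and column types of this row
    String eq(T1 v1, T2 v2) {
        return "(" + c1.name + ", " + c2.name + ") = ("
             + literal(v1) + ", " + literal(v2) + ")";
    }

    // Illustration only: a real API would use bind variables
    private static String literal(Object v) {
        return v instanceof String
            ? "'" + ((String) v).replace("'", "''") + "'"
            : String.valueOf(v);
    }
}

final class RowTypeCheckSketch {

    // The row's type parameters are inferred from the columns passed in
    static <T1, T2> Row2<T1, T2> row(Col<T1> c1, Col<T2> c2) {
        return new Row2<>(c1, c2);
    }

    public static void main(String[] args) {
        Col<Integer> a = new Col<>("t.a");
        Col<String> b = new Col<>("t.b");

        System.out.println(row(a, b).eq(1, "x"));

        // Neither of the following would compile:
        // row(a, b).eq("x", 1);  // wrong column types
        // row(a, b).eq(1);       // wrong degree
    }
}
```

The price of this technique is one class per degree (Row1, Row2, Row3, and so on), as Java has no variadic generics.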

jOOQ is not SQL, but unlike other attempts at introducing SQL as an internal domain-specific language into host languages like Java, Scala, C#, jOOQ looks very much like SQL thanks to its unique fluent API technique, which informally follows an underlying BNF notation.

Even if Java offers less expressiveness than other languages like C# or Scala, jOOQ probably comes closest to both result record typesafety and SQL syntax safety in the Java world.