A Typesafety Comparison of SQL Access APIs

SQL is a very expressive and distinct language. It is one of the few declarative languages used by a broad audience in everyday work. As a declarative language, SQL allows us to specify what we expect as output, not how this output should be produced. As a side effect, every statement creates an ad-hoc record data type. An example:

-- A (id: integer, title: varchar) type is created
SELECT id, title
FROM book;

The above statement generates a cursor whose records have a well-defined record type with these properties:

  • The degree of the record is 2
  • The column names are id and title
  • The column types are integer and varchar
  • The column id can be accessed at index 1. The column title can be accessed at index 2

In other words, SQL records combine features from records (access by name) and tuples (access by index). They can be seen as typesafe associative “map-arrays”, where map keys are formally bound to array indexes and to the type associated with each key/index.
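In Java terms, the “map-array” idea could be sketched roughly as follows. This is only an illustration under stated assumptions: Column, Row and MapArrayDemo are hypothetical names and are not part of JDBC, JPA or jOOQ. Each column key carries its name, its 1-based index and its type, so access by key is typed while access by plain index is not.

```java
class MapArrayDemo {

    // Hypothetical column key: binds a name to an index and a type
    static final class Column<T> {
        final String name;
        final int index;      // 1-based, as in SQL
        final Class<T> type;

        Column(String name, int index, Class<T> type) {
            this.name = name;
            this.index = index;
            this.type = type;
        }
    }

    // Hypothetical record: an array of values with two ways of access
    static final class Row {
        private final Object[] values;

        Row(Object... values) {
            this.values = values.clone();
        }

        // Tuple-style access by index: the static type is lost
        Object get(int index) {
            return values[index - 1];
        }

        // Record-style access by key: the key supplies the type
        <T> T get(Column<T> column) {
            return column.type.cast(values[column.index - 1]);
        }

        int degree() {
            return values.length;
        }
    }

    static final Column<Integer> ID = new Column<>("id", 1, Integer.class);
    static final Column<String> TITLE = new Column<>("title", 2, String.class);

    public static void main(String[] args) {
        Row book = new Row(1, "1984");
        Integer id = book.get(ID);        // no cast needed
        String title = book.get(TITLE);   // no cast needed
        System.out.println(book.degree() + " columns: " + id + ", " + title);
    }
}
```

Writing `String s = book.get(ID);` would be rejected by the compiler, whereas `(String) book.get(1)` compiles and only fails at runtime.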

Another, more complex example shows how these ad-hoc record types can be reused within a SQL statement:

-- A (c: bigint) type is created
SELECT count(*) c
FROM book

-- A (..: integer, ..: integer) type is created and compared with...
WHERE (author_id, language_id) IN (

  -- ... another, compatible (..: integer, ..: integer) type
  SELECT a.id, a.language_id
  FROM author a
)

This query counts books written by authors in their native language.

In the above example, the projected record type is a bit simpler. It contains only one column. The interesting part is the row value expression IN comparison predicate, which compares two compatible (integer, integer) types. In SQL, you can typesafely create ad-hoc record types and immediately compare them with other ad-hoc record types. In these comparisons, column names are not important, but column indexes (and associated column types) are.
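The same positional comparison can be imitated in Java with generics. The following is a minimal sketch, assuming Java 16 or later (for records and Stream.toList()); Row2, row(), Book and Author are hypothetical names made up for this illustration, and the lists stand in for the book and author tables.

```java
import java.util.List;

class RowValueDemo {

    // Hypothetical row value expression of degree 2. Component names
    // play no role in the comparison, only positions and types do.
    record Row2<A, B>(A first, B second) {
        boolean in(List<Row2<A, B>> candidates) {
            return candidates.contains(this);
        }
    }

    static <A, B> Row2<A, B> row(A first, B second) {
        return new Row2<>(first, second);
    }

    record Book(String title, int authorId, int languageId) {}
    record Author(int id, int languageId) {}

    // Mirrors: SELECT count(*) FROM book
    //          WHERE (author_id, language_id) IN (SELECT a.id, a.language_id FROM author a)
    static long countNativeLanguageBooks(List<Book> books, List<Author> authors) {
        List<Row2<Integer, Integer>> subquery = authors.stream()
            .map(a -> row(a.id(), a.languageId()))
            .toList();

        return books.stream()
            .filter(b -> row(b.authorId(), b.languageId()).in(subquery))
            .count();
    }

    public static void main(String[] args) {
        List<Book> books = List.of(
            new Book("A", 1, 10), new Book("B", 1, 20), new Book("C", 2, 20));
        List<Author> authors = List.of(new Author(1, 10), new Author(2, 20));
        System.out.println(countNativeLanguageBooks(books, authors));
    }
}
```

A comparison such as `row("x", 1).in(subquery)` does not compile here, because a (String, Integer) row is not compatible with a list of (Integer, Integer) rows.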

Comparing various SQL access APIs

The previous examples show how SQL allows for the formal declaration of record types, including record degree, column names, column indexes, and column types. While SQL is very expressive in this respect, many client languages accessing SQL are less so. When comparing expressiveness and typesafety, two questions should be taken into consideration:

  1. Are the records produced into the client language typesafe?
  2. Are the SQL statements produced from the client language typesafe and syntax-safe?

Let’s have a look at various access techniques, and how expressive they are in terms of the above typesafety requirements:

JDBC: Least typesafety

JDBC offers the least expressiveness and typesafety. This isn’t surprising, as JDBC is a very low-level API. It offers:

  1. No typesafety whatsoever when accessing result records.
  2. No typesafety or syntax-safety whatsoever when producing SQL statements.

Here is an example:

PreparedStatement stmt = null;
ResultSet rs = null;

try {

  // SQL statements are just strings. Constructing them is not
  // typesafe or syntax-safe
  stmt = connection.prepareStatement(
    "SELECT id, title FROM book WHERE id = ?");

  // Bind values are set by index. There is no typesafety or
  // "index safety"
  stmt.setInt(1, 15);

  rs = stmt.executeQuery();
  while (rs.next()) {

    // There is no typesafety or "index safety" when accessing
    // result record values
    System.out.println(
      "ID: " + rs.getInt(1) + ", TITLE: " + rs.getString(2));
  }
}
finally {
  closeSafely(stmt, rs);
}

Now, this wasn’t surprising. JDBC makes up for the lack of typesafety by being absolutely general. It is possible to implement a JDBC driver for any type of relational database, no matter what kinds of SQL and JDBC features it really supports.

JPA: Some typesafety

JPA has implemented quite a bit of typesafety mostly on top of JPQL, but also slightly on top of SQL. With JPA, you can have:

  1. Some typesafety when accessing records.
  2. Some typesafety and syntax-safety when producing JPQL statements through the CriteriaQuery API (not SQL statements).

Record access typesafety can be guaranteed when you project the outcome of your statements onto your JPA-annotated entities. While the mapping itself isn’t really typesafe, the outcome is, as a Java class is the closest match to a SQL record. A Java class, much like a SQL record, has:

  • A degree, expressed in the number of properties
  • Column names, expressed as property names
  • Column types, expressed as property types
  • But: No column indexes. Properties have no explicit order

JPA record mapping has additional features that exceed the expressiveness of SQL, as “flat”, tabular result sets can be mapped onto object hierarchies. In any case, you will have to create one record / entity type per query to profit from this typesafety. If you’re not projecting all columns from every table, but ad-hoc records (including values derived from functions), you will lose this typesafety again.
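The loss of typesafety for ad-hoc projections can be illustrated without any JPA provider. In JPA, a query that selects several expressions without a constructor expression yields Object[] rows. The sketch below only simulates such a row with a plain array and compares it with a dedicated result type. It assumes Java 16 or later; ProjectionDemo and BookTitle are hypothetical names.

```java
class ProjectionDemo {

    // Hypothetical dedicated result type for one specific query
    record BookTitle(int id, String title) {}

    // Untyped access, as with an Object[] row. The index is wrong on
    // purpose: position 0 holds the Integer id, not the title. This
    // compiles, but fails at runtime.
    static String titleFromArray(Object[] row) {
        return (String) row[0];
    }

    // Typed access: a wrong property name or type would not compile
    static String titleFromRecord(BookTitle row) {
        return row.title();
    }

    public static void main(String[] args) {
        Object[] untyped = {1, "1984"};
        BookTitle typed = new BookTitle(1, "1984");

        System.out.println(titleFromRecord(typed));
        try {
            titleFromArray(untyped);
        } catch (ClassCastException e) {
            System.out.println("Untyped access failed only at runtime");
        }
    }
}
```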

When it comes to statement typesafety, JPA offers the CriteriaQuery API to produce typesafe JPQL statements. The CriteriaQuery API is often criticised for its verbosity and for the fact that the resulting client code is hard to read. Here is an example taken from the CriteriaQuery API docs:

CriteriaQuery<String> q = cb.createQuery(String.class);
Root<Order> order = q.from(Order.class);
q.select(order.get("shippingAddress").<String>get("state"));
 
CriteriaQuery<Product> q2 = cb.createQuery(Product.class);
q2.select(q2.from(Order.class)
            .join("items")
            .<Item,Product>join("product"));

It can be seen that there is only a limited amount of typesafety in the above query construction:

  • Columns are accessed by string literals, such as "shippingAddress".
  • Generic entity types are not really checked. The <Item,Product> generic parameters might as well be wrong.

Of course, there are more typesafe parts in JPA’s CriteriaQuery API. Using those parts quickly leads to the aforementioned verbosity, though, as can be seen in this Stack Overflow question, or in the Java EE 6 Tutorials.

LINQ: Much typesafety (in .NET)

LINQ goes very far in offering typesafety in both dimensions:

  1. Much typesafety when accessing records or tuples.
  2. Much typesafety when producing LINQ-to-SQL statements (not SQL statements).

As LINQ is formally integrated into various .NET languages, it has the advantage of being able to produce formally defined record types directly in the target language (e.g. C#). Not only can typesafe records be produced, the LINQ-to-SQL statement is formally verified by the compiler as well. An example, written in VB.NET syntax:

// Typesafe renaming (aliasing with "AS" in SQL)
From p In db.Products
// Typesafe (named!) variable binding
Where p.UnitsInStock <= ReorderLevel AndAlso Not p.Discontinued
// The typesafe projection will produce a Products record
Select p

Another example from Stack Overflow can be seen here:

// Producing a C# tuple
var r = from u in db.Users
        join s in db.Staffs on u.Id equals s.UserId
        select new Tuple<User, Staff>(u, s);

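// Producing an anonymous record type
var r = from u in db.Users
    select new { u.Name, 
                 u.Address,
                 ...,
                 // "where" must precede "select", and a subquery member
                 // of an anonymous type needs an explicit name
                 Passwords = (from s in db.Staffs 
                              where u.Id == s.UserId
                              select s.Password) 
               };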

LINQ has many obvious advantages when it comes to typesafety. In the case of LINQ, this comes at the price of losing actual SQL expressivity and syntax, as LINQ-to-SQL is not really SQL (just as JPQL is not really SQL either). The SQL querying API is partially shared with other, heterogeneous querying targets, such as LINQ-to-Entities, LINQ-to-Collections, LINQ-to-XML. This will reduce LINQ’s feature scope (see also a previous blog post, and I will soon blog about this again).

But C# offers all the typesafety aspects that a SQL record offers as well: degree, column names (anonymous types), column indexes (tuples), and column types (both anonymous types and tuples).

SLICK: Much typesafety (in Scala)

SLICK has been inspired by LINQ, and can thus offer a lot of typesafety as well. It offers:

  1. Much typesafety when accessing tuples (not records).
  2. Much typesafety when producing SLICK statements (not SQL statements).

SLICK takes advantage of Scala’s integrated tuple expressions. This is best shown by example:

// "for" is the "entry-point" to the DSL
val q = for {

    // FROM clause   WHERE clause
    c <- Coffees     if c.supID === 101

// SELECT clause and projection to a tuple
} yield (c.name, c.price)

The above example shows that the projection onto a (String, Int) tuple is done typesafely by the yield clause. At the same time, the whole query expression is formally validated by the compiler, as SLICK makes heavy use of Scala’s language features in order to introduce an internal DSL for querying. Much more than LINQ, SLICK has a syntax of its own that no longer resembles SQL. It is not obvious how subqueries, complex joins, grouping and aggregation can be expressed.

jOOQ: Much typesafety

jOOQ is mainly inspired by SQL itself and embraces all the features that SQL offers. It has thus:

  1. Much typesafety when accessing records or tuples.
  2. Much typesafety when producing SQL statements.

jOOQ offers capabilities similar to JPA’s when it comes to mapping SQL result sets onto records, although JPA’s mapping of type hierarchies is not supported by jOOQ. But jOOQ also allows for typesafe tuple access, the way SLICK has implemented it. Ad-hoc records produced by arbitrary query projections will maintain their various column types through generic Record1<T1>, Record2<T1, T2>, Record3<T1, T2, T3>, … record types. Unlike in Java, this can be leveraged extensively in Scala, where these typesafe Record[N] types can be used just like Scala’s tuples.

On the other hand, just like LINQ-to-SQL, which has formally integrated querying as a first-class citizen into .NET languages, jOOQ allows for heavy type-checking and syntax-checking when writing SQL statements in Java.

In SQL, you can typesafely write things like:

SELECT * FROM t WHERE (t.a, t.b) = (1, 2)
SELECT * FROM t WHERE (t.a, t.b) OVERLAPS (date1, date2)
SELECT * FROM t WHERE (t.a, t.b) IN (SELECT x, y FROM t2)
UPDATE t SET (a, b) = (SELECT x, y FROM t2 WHERE ...)
INSERT INTO t (a, b) VALUES (1, 2)

In jOOQ 3.0, you can (also typesafely!) write

select().from(t).where(row(t.a, t.b).eq(1, 2));
// Type-check here: ----------------->  ^^^^
 
select().from(t).where(row(t.a, t.b).overlaps(date1, date2));
// Type-check here: ------------------------> ^^^^^^^^^^^^
 
select().from(t).where(row(t.a, t.b).in(select(t2.x, t2.y).from(t2)));
// Type-check here: -------------------------> ^^^^^^^^^^
 
update(t).set(row(t.a, t.b), select(t2.x, t2.y).from(t2).where(...));
// Type-check here: --------------> ^^^^^^^^^^

insertInto(t, t.a, t.b).values(1, 2);
// Type-check here: ---------> ^^^^

This also applies to existing API, which doesn’t involve row value expressions:

select().from(t).where(t.a.eq(select(t2.x).from(t2)));
// Type-check here: ---------------> ^^^^
 
select().from(t).where(t.a.eq(any(select(t2.x).from(t2))));
// Type-check here: -------------------> ^^^^
 
select().from(t).where(t.a.in(select(t2.x).from(t2)));
// Type-check here: ---------------> ^^^^

select(t1.a, t1.b).from(t1).union(select(t2.a, t2.b).from(t2));
// Type-check here: -------------------> ^^^^^^^^^^

jOOQ is not SQL, but unlike other attempts at introducing SQL as an internal domain-specific language into host languages like Java, Scala, or C#, jOOQ looks very much like SQL thanks to its unique fluent API technique, which informally follows an underlying BNF notation.
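The general idea of a fluent API that follows a grammar can be sketched in a few lines of plain Java. This is not jOOQ’s actual API, which is far richer. FluentDemo, SelectStep, FromStep and FinalStep are hypothetical names, and the toy grammar is assumed to be: query ::= SELECT columns FROM table [ WHERE condition ]. Each interface represents one position in the grammar and exposes only the keywords that may legally follow.

```java
class FluentDemo {

    // After SELECT, only FROM is allowed
    interface SelectStep {
        FromStep from(String table);
    }

    // After FROM, WHERE is optional, so the query may also end here
    interface FromStep extends FinalStep {
        FinalStep where(String condition);
    }

    // A complete query can be rendered
    interface FinalStep {
        String getSQL();
    }

    static SelectStep select(String... columns) {
        return new Builder("SELECT " + String.join(", ", columns));
    }

    // One implementation class plays all roles. The interfaces
    // restrict what the caller can see at each step.
    private static final class Builder implements SelectStep, FromStep {
        private final StringBuilder sql;

        Builder(String start) {
            this.sql = new StringBuilder(start);
        }

        @Override
        public FromStep from(String table) {
            sql.append(" FROM ").append(table);
            return this;
        }

        @Override
        public FinalStep where(String condition) {
            sql.append(" WHERE ").append(condition);
            return this;
        }

        @Override
        public String getSQL() {
            return sql.toString();
        }
    }

    public static void main(String[] args) {
        System.out.println(
            select("id", "title").from("book").where("id = 15").getSQL());
    }
}
```

With these interfaces, `select("id").where("id = 15")` does not compile, because SelectStep offers no where() method. Since columns and conditions are plain strings here, the sketch only demonstrates syntax safety, not the type safety discussed above.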

Even if Java offers less expressiveness than other languages like C# or Scala, jOOQ probably comes closest to both result record typesafety and SQL syntax safety in the Java world.