jOOQ Tuesdays: Gerald Sangudi and Keshav Murthy Reveal the Secrets of N1QL (SQL on JSON)

Welcome to the jOOQ Tuesdays series. In this series, we’ll publish an article on the third Tuesday every other month where we interview someone we find exciting in our industry from a jOOQ perspective. This includes people who work with SQL, Java, Open Source, and a variety of other related topics.

I’m very excited to feature today Gerald Sangudi and Keshav Murthy, creators of the N1QL query language, Couchbase’s SQL-based JSON querying language.

Hi Gerald and Keshav – It’s great to have two people on the jOOQ Tuesdays interview at once! How did you guys meet?

Keshav: Gerald interviewed me for the job at Couchbase.

Gerald: Best decision we ever made.

You’re both working at Couchbase, one of the leading “JSON databases” in the market. What drove you towards JSON?

Keshav: I worked for IBM, specifically on the Informix database. In 2013, WebSphere teams needed JSON support. I was skeptical of JSON for data management, except for managing metadata. Years before, I had successfully opposed implementing XML and XQuery inside Informix. Then I went to a NoSQL conference where I saw enterprises using JSON in real-world applications. Companies like eBay and Cisco were developing their enterprise applications on NoSQL and JSON.

RDBMS customers had been asking for ALTER TABLE ONLINE when they needed to add new columns. They typically would have additional unused columns, which they’d rename when needed. Obviously, this wasn’t a good solution, because you could run out of these spare columns, and their types and structure weren’t flexible.

For years, customers had been asking to run ALTER TABLE while applications were online. I realized JSON was a good way to provide that. That’s when we added JSON as a type, added sharding, and started to extend SQL to support JSON. A good relational database can deliver a lot of value with JSON. So, we developed a mechanism that exploits JSON to provide schema flexibility on relational tables, and we received a patent for it: http://bit.ly/2og1uXe.

At Couchbase, data is stored as JSON and N1QL was designed as SQL for JSON.  So, we keep calm and JSON.

Gerald: I am not at Couchbase any more, but I was there for four years. Couchbase has its roots in memcached and CouchDB, so Couchbase was already storing and managing JSON data before N1QL and before we arrived. N1QL was developed to enable our users to query and manipulate their JSON data.

You’re both working on, or have worked on N1QL, Couchbase’s innovative SQL-esque query language for JSON. Has the SQL language come full circle? Why is the SQL syntax such a great fit for you?

Keshav: Don Chamberlin, co-inventor of SQL, said this. During the XQuery discussions, he argued against using SQL for manipulating XML. Now, with N1QL and SQL++ on JSON, he sees enormous possibilities for SQL to manipulate JSON effectively. So, you could say it has come full circle.

From the application point of view, the requirement for SQL to support complex data models has been there for a long time. SQL-99 added the structured type into the language, but didn’t recognize the need for schema flexibility.

What Gerald has done very nicely is to inherit SQL and extend not just the language but the underlying boolean logic.

SQL has three-valued logic: TRUE, FALSE and NULL. N1QL has four-valued logic: TRUE, FALSE, NULL and MISSING. SQL has the select-join-project operations. N1QL has those and adds NEST and UNNEST operations for handling arrays. Once we have the logic and the operations, the type system and the rest of the expressions for handling nested objects and arrays can be added.
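For illustration, here is a minimal N1QL-style sketch (the bucket name users and its attributes are invented for this example): documents lacking an attribute altogether can be found with IS MISSING, and arrays can be flattened with UNNEST:

SELECT name FROM users WHERE middle_name IS MISSING;

SELECT u.name, c.age
FROM users AS u
UNNEST u.children AS c;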

There is another important change in N1QL compared to SQL.  SQL  is the query language to manipulate data.  N1QL can discover the document metadata (names, structure and types) dynamically and operate on it.

Gerald: SQL is the greatest and most successful query language of all time. Our job was to enable our users to query data that is sometimes different from what standard SQL expects. We hope N1QL does that.

As an aside, the “N1” in N1QL stands for non-first normal form. SQL geeks like you Lukas may know that “non-first” is one primary difference between JSON data and relational data. The other primary differences are schema and uniformity.

Will your relational competition steal features from N1QL? Or will you steal more features from them?

Keshav: I do hope relational databases steal features from N1QL. Having common approaches to solving problems makes it easier for customers to choose the right database for the right problem. I do hope they choose Couchbase more often than an RDBMS!

Gerald always says we don’t differ from SQL unless there is a good reason. In that sense, we’ve taken a lot of features from relational databases already. In addition, we learn from successful models in relational databases for things like index design, query optimization, security and monitoring. We stand on the shoulders of giants.

Gerald: We hope that both relational and non-relational vendors steal features from each other. There is a collaborative effort on something called SQL++, which is a superset of both SQL and N1QL. We hope SQL++ is the convergence point. One lesson from the success of SQL is that standards are great for both users and vendors.

Couchbase doesn’t have what you call a “static schema”. The biggest advantage of a static schema for the database optimiser is the many ways such a schema can be used to predict performance and choose optimal execution plans. How does optimisation work in a “schema-less” database?

Keshav: Actually, a static schema gives you the structure, but not the data distribution, which is a major factor for calculating the execution cost. N1QL uses the information within the query and the available indexes to glean the structure and decide on the plan. For example, if you have a query with a predicate WHERE state = “CA” and zipcode = “94040”, and there is an index on either state, zipcode or both, we’d assume these key-values exist in the document and push the predicates down to the index scan. We give further details in an article on DZone: https://dzone.com/articles/a-deep-dive-into-couchbase-n1ql-query-optimization

Separately, we do have a mechanism to INFER the schema by sampling. Right now, we show the inferred schema so users can understand the structure. We also use the inferred schema to make query editing easier, with hints and validations within the workbench. We plan to use this further and to collect additional statistics to improve decisions in the optimizer.

Gerald: The N1QL optimizer is one of the joys of working on N1QL. As Keshav said, most of the SQL optimization techniques carry over to N1QL. The data may not have a static schema, but the query has an implicit schema, and the indexes have implicit schemas. That is, the query has predicates and other characteristics, and each index has keys and other characteristics. That is enough to keep an optimizer busy.

Working for a newer database vendor, you don’t have as much legacy, so you can innovate faster and more freely. What will be the next big thing in the database market?

Keshav: Our metric for innovation is the progress customers are trying to make in their business. So, we innovate within the constraints of a customer’s job. That helps us innovate and measure success from the customer’s point of view.

We see customers deploying NoSQL databases for newer patterns like systems of engagement.

When you plan your vacation, you search a lot before you buy: places, hotels, airlines, things to do. Then you compare costs and ratings before you make the final purchase. For a travel company, this requires significant database infrastructure to support a high number of queries with low latency at a very low cost.

We see customers deploying NoSQL databases to support a lot of these customer engagement and information use cases. An RDBMS is still used as the system of record for buy-sell-cancel-check-in operations, but it will integrate with system-of-engagement databases to enhance the customer experience.

Doing this effectively requires innovations in every area: data platform design, scale-out, query processing, index design and manageability.

Gerald: I read Jeff Bezos’ latest annual letter to shareholders.  He says that the things that don’t change are more important than the things that do change. Databases should store your data reliably and give you answers and updates quickly. If database vendors continue to do that, we’ll be ok.

Using Kotlin’s Apply Function for Dynamic SQL with jOOQ

It was hard to limit ourselves to 10 Nice Examples of Writing SQL in Kotlin With jOOQ recently, because the Kotlin language has many nice little features that really help a lot when working with Java libraries. We’ve talked about the nice with() stdlib function, which allows you to “import” a namespace for a local scope or closure:

with (AUTHOR) {
    ctx.select(FIRST_NAME, LAST_NAME)
       .from(AUTHOR)
       .where(ID.lt(5))
       .orderBy(ID)
       .fetch {
           println("${it[FIRST_NAME]} ${it[LAST_NAME]}")
       }
}

In the above example, the AUTHOR table is made available as the this reference in the closure following the with function, which works exactly like JavaScript’s with(). Everything in AUTHOR is available, without dereferencing it from AUTHOR.

Apply is very similar

A very similar feature is made available through apply(), although with different syntactic implications. Check out this Stack Overflow question for some details about with() vs. apply() in Kotlin.

When using jOOQ, apply() is most useful for dynamic SQL. Imagine you have local variables indicating whether some parts of a query should be added to the query:

val filtering = true;
val joining = true;

These boolean variables would be evaluated dynamically, of course. filtering specifies whether a dynamic filter / where clause is needed, whereas joining specifies whether an additional JOIN is required.

So, the following query will select authors, and:

  • if “filtering”, we’re selecting only author ID = 1
  • if “joining”, we’ll join the books table and count the number of books per author

Both of these predicates are independent. Enter the game: apply():

ctx.select(
      a.FIRST_NAME, 
      a.LAST_NAME, 
      if (joining) count() else value(""))
   .from(a)
   .apply { if (filtering) where(a.ID.eq(1)) }
   .apply { if (joining) join(b).on(a.ID.eq(b.AUTHOR_ID)) }
   .apply { if (joining) groupBy(a.FIRST_NAME, a.LAST_NAME) }
   .orderBy(a.ID)
   .fetch {
       println(it[a.FIRST_NAME] + " " + 
               it[a.LAST_NAME] +
               (if (joining) " " + it[count()] else ""))
   }

That’s neat! See, the jOOQ API doesn’t specify any apply() method / function, yet you can chain the apply() function to the jOOQ API as if it were natively supported.

Like with(), apply() makes a reference available to a closure as this, so it doesn’t have to be referenced explicitly anymore. Which means, we can write neat things like

   .apply { if (filtering) where(a.ID.eq(1)) }

Where a where() clause is added only if we’re filtering!

Of course, jOOQ (or any other query builder) lends itself to this kind of dynamic SQL, and it can be done in Java too:
https://www.jooq.org/doc/latest/manual/sql-building/dynamic-sql

But the Kotlin-specific fluent integration using apply() is exceptionally neat. Well done, Kotlin!

Side-note

This only works because the jOOQ DSL API of jOOQ 3.x is mutable and every operation returns the same this reference, as was kindly pointed out by Ilya Ryzhenkov.

In the future (e.g. version 4.0), we’re planning on making the jOOQ API more immutable – mutability is a historic legacy (although, often, it’s the desired behaviour for a query builder).

More nice Kotlin/jOOQ tricks can be found in this article.

How to Use SQL INTERSECT to Work Around SQL’s NULL Logic

ANOTHER SQL post this week? I got nerd-sniped.

Oooooh, challenge accepted!

So, let’s assume we have a table T with columns (A, B, C) like this:

WITH t(a, b, c) AS (
  SELECT 'a', 'b', null FROM dual UNION ALL
  SELECT 'a', null, 'c' FROM dual UNION ALL
  SELECT 'a', 'b', 'c'  FROM dual
)
SELECT * FROM t

As expected, this yields:

A       B       C
-----------------
a       b
a               c
a       b       c

Truly exciting.

Now we want to find all those rows that “match” either ('a', 'b', NULL) or ('a', NULL, 'c'). Clearly, this should produce the first two rows, right?

A       B       C
-----------------
a       b
a               c

Yes. Now the canonical solution would be to tediously write out the entire predicate as such:

WITH t(a, b, c) AS (...)
SELECT * FROM t
WHERE (a = 'a' AND b = 'b' AND c IS NULL)
OR (a = 'a' AND b IS NULL AND c = 'c')

That’s really boring. Sure, we could have factored out the first, common predicate:

WITH t(a, b, c) AS (...)
SELECT * FROM t
WHERE a = 'a' AND (
     (b = 'b' AND c IS NULL)
  OR (b IS NULL AND c = 'c')
)

That’s certainly better from a performance perspective, but Rafael had a nifty idea. Let’s use row value expressions (tuples) in our predicates:

WITH t(a, b, c) AS (...)
SELECT * FROM t
WHERE (a, b, c) IN (('a', 'b', NULL), ('a', NULL, 'c'))

Unfortunately this doesn’t yield any results, because nothing is equal to NULL in SQL (not even NULL itself). The above query is the same as this one:

WITH t(a, b, c) AS (...)
SELECT * FROM t
WHERE (a = 'a' AND b = 'b' AND c = NULL /* oops */)
OR (a = 'a' AND b = NULL /* oops */ AND c = 'c')

D’oh.

Solutions

The lame one

The canonical solution then would be a really lame (but perfectly valid) one. Encode NULL to be some “impossible” string value. Rafael suggested yolo. Fair enough.

This works:

WITH t(a, b, c) AS (...)
SELECT * FROM t
WHERE (a, NVL(b, 'yolo'), NVL(c, 'yolo')) 
  IN (('a', 'b', 'yolo'), ('a', 'yolo', 'c'))

All we have to do now is to always remember the term yolo, which means “NULL, but not NULL, thank you SQL”.

The hipster one

But wait! SQL is painstakingly inconsistent when it comes to NULL. See, NULL really means UNKNOWN in three-valued logic, and this means we never know whether SQL abides by its own rules.

Enter INTERSECT. Like UNION and EXCEPT (MINUS in Oracle), as well as SELECT DISTINCT, these set operations treat two NULL values as NOT DISTINCT. Yes, they’re not equal, but they’re also not distinct. Whatever. Just remember: that’s how it is 🙂
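To convince yourself, here’s a quick check (Oracle syntax, as in the rest of this post; other databases behave the same way):

SELECT NULL FROM dual
INTERSECT
SELECT NULL FROM dual;

This yields one row, even though NULL = NULL does not evaluate to TRUE.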

So, we can write this hipster solution to Rafael’s problem:

WITH t(a, b, c) AS (...)
SELECT *
FROM t
WHERE EXISTS (
  SELECT a, b, c FROM dual
  INTERSECT (
    SELECT 'a', 'b', null FROM dual
    UNION ALL
    SELECT 'a', null, 'c' FROM dual
  )
)

We create an intersection of the tuple (a, b, c), the left side of Rafael’s IN predicate, and the desired values on the right side of the IN predicate, and we’re done.

Clearly less tedious than writing the original predicates, right? (We won’t look into performance this time)

Cheers, and a happy weekend.

How to Find Redundant Indexes in SQL

The following two indexes are redundant in most SQL databases:

CREATE INDEX i_actor_1 ON actor (last_name);
CREATE INDEX i_actor_2 ON actor (last_name, first_name);

It is usually safe to drop the first index, because all queries that query the LAST_NAME column only can still profit from the second index I_ACTOR_2. The reason is that LAST_NAME is the first column of the composite index I_ACTOR_2 (it would be a different story if it weren’t the first column).
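For contrast, consider a hypothetical third index with the columns reversed. It would not make I_ACTOR_1 redundant, because queries filtering only on LAST_NAME can’t use it efficiently:

CREATE INDEX i_actor_3 ON actor (first_name, last_name);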

Note: It is usually safe to drop the first index, because the benefits probably outweigh the cost:

Benefits of dropping

  • Less disk space is used
  • Insertions are faster, because one less index needs to be maintained

Costs of dropping

  • Queries that previously used the dropped index now have to use the composite index, which can be slightly slower, as can be seen in the benchmark below

Let’s see the costs of dropping the index below for Oracle, PostgreSQL, and SQL Server in this particular case (beware, as always when interpreting benchmarks: they heavily depend on a lot of context, especially data size!)

Oracle

Preparation:

CREATE TABLE t (
  a NUMBER(10) NOT NULL,
  b NUMBER(10) NOT NULL
);

INSERT INTO t (a, b)
SELECT level, level
FROM dual
CONNECT BY level <= 100000;

CREATE INDEX i1 ON t(a);
CREATE INDEX i2 ON t(a, b);

EXEC dbms_stats.gather_table_stats('TEST', 'T');

Benchmark:

SET SERVEROUTPUT ON
CREATE TABLE results (
  run     NUMBER(2),
  stmt    NUMBER(2),
  elapsed NUMBER
);

DECLARE
  v_ts TIMESTAMP WITH TIME ZONE;
  v_repeat CONSTANT NUMBER := 2000;
BEGIN

  -- Repeat benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := SYSTIMESTAMP;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT /*+INDEX(t i1)*/ * FROM t WHERE a = 1
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
  
    INSERT INTO results VALUES (r, 1, 
      SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    v_ts := SYSTIMESTAMP;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT /*+INDEX(t i2)*/ * FROM t WHERE a = 1
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
      
    INSERT INTO results VALUES (r, 2, 
      SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
  END LOOP;
  
  FOR rec IN (
    SELECT 
      run, stmt, 
      CAST(elapsed / MIN(elapsed) OVER() AS NUMBER(10, 5)) ratio 
    FROM results
  )
  LOOP
    dbms_output.put_line('Run ' || rec.run || 
      ', Statement ' || rec.stmt || 
      ' : ' || rec.ratio);
  END LOOP;
END;
/

DROP TABLE results;

The result being:

Run 1, Statement 1 : 1.4797
Run 1, Statement 2 : 1.45545

Run 2, Statement 1 : 1.1997
Run 2, Statement 2 : 1.01121

Run 3, Statement 1 : 1.13606
Run 3, Statement 2 : 1

Run 4, Statement 1 : 1.13455
Run 4, Statement 2 : 1.00242

Run 5, Statement 1 : 1.13303
Run 5, Statement 2 : 1.00606

Some notes on benchmarks here.

The fastest query execution in the above result yields 1; the other executions are multiples of it. As you can see, there’s roughly a 10% difference in this case. The benefits of dropping the redundant index (faster insertions) often outweigh this cost (potentially slower queries), but don’t apply this advice blindly in a read-heavy / write-rarely database.

PostgreSQL

A similar difference can be seen in a PostgreSQL benchmark. No hints can be used to choose indexes, so we’re simply creating two tables:

CREATE TABLE t1 (
  a INT NOT NULL,
  b INT NOT NULL
);
CREATE TABLE t2 (
  a INT NOT NULL,
  b INT NOT NULL
);

INSERT INTO t1 (a, b)
SELECT s, s
FROM generate_series(1, 100000) AS s(s);

INSERT INTO t2 (a, b)
SELECT s, s
FROM generate_series(1, 100000) AS s(s);

CREATE INDEX i1 ON t1(a);
CREATE INDEX i2 ON t2(a, b);

ANALYZE t1;
ANALYZE t2;

Benchmark:

DO $$
DECLARE
  v_ts TIMESTAMP;
  v_repeat CONSTANT INT := 10000;
  rec RECORD;
BEGIN

  -- Repeat benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT * FROM t1 WHERE a = 1
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 1: %', r, 
      (clock_timestamp() - v_ts); 
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT * FROM t2 WHERE a = 1
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 2: %', r, 
      (clock_timestamp() - v_ts); 
    RAISE INFO '';
  END LOOP;
END$$;

Result:

INFO:  Run 1, Statement 1: 00:00:00.071891
INFO:  Run 1, Statement 2: 00:00:00.080833

INFO:  Run 2, Statement 1: 00:00:00.076329
INFO:  Run 2, Statement 2: 00:00:00.079772

INFO:  Run 3, Statement 1: 00:00:00.073137
INFO:  Run 3, Statement 2: 00:00:00.079483

INFO:  Run 4, Statement 1: 00:00:00.073456
INFO:  Run 4, Statement 2: 00:00:00.081508

INFO:  Run 5, Statement 1: 00:00:00.077148
INFO:  Run 5, Statement 2: 00:00:00.083535

SQL Server

Preparation:

CREATE TABLE t (
  a INT NOT NULL,
  b INT NOT NULL
);

WITH s(s) AS (
  SELECT 1
  UNION ALL
  SELECT s + 1 FROM s WHERE s < 100
)
INSERT INTO t
SELECT TOP 100000 
  row_number() over(ORDER BY (SELECT 1)), 
  row_number() over(ORDER BY (select 1)) 
FROM s AS s1, s AS s2, s AS s3;

CREATE INDEX i1 ON t(a);
CREATE INDEX i2 ON t(a, b);

UPDATE STATISTICS t;

Benchmark:

DECLARE @ts DATETIME;
DECLARE @repeat INT = 2000;
DECLARE @r INT;
DECLARE @i INT;
DECLARE @dummy VARCHAR;

DECLARE @s1 CURSOR;
DECLARE @s2 CURSOR;

DECLARE @results TABLE (
  run     INT,
  stmt    INT,
  elapsed DECIMAL
);

SET @r = 0;
WHILE @r < 5
BEGIN
  SET @r = @r + 1

  SET @s1 = CURSOR FOR 
    SELECT b FROM t WITH (INDEX (i1)) WHERE a = 1;

  SET @s2 = CURSOR FOR 
    SELECT b FROM t WITH (INDEX (i2)) WHERE a = 1;

  SET @ts = current_timestamp;
  SET @i = 0;
  WHILE @i < @repeat
  BEGIN
    SET @i = @i + 1

    OPEN @s1;
    FETCH NEXT FROM @s1 INTO @dummy;
    WHILE @@FETCH_STATUS = 0
    BEGIN
      FETCH NEXT FROM @s1 INTO @dummy;
    END;

    CLOSE @s1;
  END;

  DEALLOCATE @s1;
  INSERT INTO @results 
    VALUES (@r, 1, DATEDIFF(ms, @ts, current_timestamp));

  SET @ts = current_timestamp;
  SET @i = 0;
  WHILE @i < @repeat
  BEGIN
    SET @i = @i + 1

    OPEN @s2;
    FETCH NEXT FROM @s2 INTO @dummy;
    WHILE @@FETCH_STATUS = 0
    BEGIN
      FETCH NEXT FROM @s2 INTO @dummy;
    END;

    CLOSE @s2;
  END;

  DEALLOCATE @s2;
  INSERT INTO @results 
    VALUES (@r, 2, DATEDIFF(ms, @ts, current_timestamp));
END;

SELECT 'Run ' + CAST(run AS VARCHAR) + 
  ', Statement ' + CAST(stmt AS VARCHAR) + ': ' + 
  CAST(CAST(elapsed / MIN(elapsed) OVER() AS DECIMAL(10, 5)) AS VARCHAR)
FROM @results;

Result:

Run 1, Statement 1: 1.22368
Run 1, Statement 2: 1.09211

Run 2, Statement 1: 1.05263
Run 2, Statement 2: 1.09211

Run 3, Statement 1: 1.00000
Run 3, Statement 2: 1.05263

Run 4, Statement 1: 1.05263
Run 4, Statement 2: 1.00000

Run 5, Statement 1: 1.09211
Run 5, Statement 2: 1.05263

As can be seen, the performance difference between querying via the single-column index and via the composite index is small in all three databases. In this particular benchmark, the composite index holds up particularly well, specifically because it can act as a covering index for this query.

Yet both indexes can be used for the query in a reasonable way, so if disk space / insertion speed is an issue, the redundant single-column index can be dropped.
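For the Oracle / PostgreSQL setup above, that boils down to a one-liner (in SQL Server, the syntax is DROP INDEX i1 ON t):

DROP INDEX i1;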

How to find such indexes

The following query will help you detect such indexes in Oracle, PostgreSQL, and SQL Server:

Oracle

WITH indexes AS (
  SELECT
    i.owner,
    i.index_name,
    i.table_name,
    listagg(c.column_name, ', ')
      WITHIN GROUP (ORDER BY c.column_position)
      AS columns
  FROM all_indexes i
  JOIN all_ind_columns c
    ON i.owner = c.index_owner
    AND i.index_name = c.index_name
  GROUP BY i.owner, i.table_name, i.index_name, i.leaf_blocks
)
SELECT
  i.owner,
  i.table_name,
  i.index_name AS "Deletion candidate index",
  i.columns AS "Deletion candidate columns",
  j.index_name AS "Existing index",
  j.columns AS "Existing columns"
FROM indexes i
JOIN indexes j
  ON i.owner = j.owner
  AND i.table_name = j.table_name
  AND j.columns LIKE i.columns || ',%'

Result:

TABLE_NAME   delete index   columns   existing index   columns
-------------------------------------------------------------------------
T            I1             A         I2               A, B

In short, it lists all the indexes whose columns are a prefix of another index’s columns.

PostgreSQL

Get ready for a really nifty query. Here’s how to discover redundant indexes in PostgreSQL, which unfortunately doesn’t seem to have an easy, out-of-the-box dictionary view to discover index columns:

WITH indexes AS (
  SELECT 
    tnsp.nspname AS schema_name,
    trel.relname AS table_name,
    irel.relname AS index_name,
    string_agg(a.attname, ', ' ORDER BY c.ordinality) AS columns
  FROM pg_index AS i
  JOIN pg_class AS trel ON trel.oid = i.indrelid
  JOIN pg_namespace AS tnsp ON trel.relnamespace = tnsp.oid
  JOIN pg_class AS irel ON irel.oid = i.indexrelid
  JOIN pg_attribute AS a ON trel.oid = a.attrelid
  JOIN LATERAL unnest(i.indkey) 
    WITH ORDINALITY AS c(colnum, ordinality)
      ON a.attnum = c.colnum
  GROUP BY i, tnsp.nspname, trel.relname, irel.relname
)
SELECT
  i.table_name,
  i.index_name AS "Deletion candidate index",
  i.columns AS "Deletion candidate columns",
  j.index_name AS "Existing index",
  j.columns AS "Existing columns"
FROM indexes i
JOIN indexes j
  ON i.schema_name = j.schema_name
  AND i.table_name = j.table_name
  AND j.columns LIKE i.columns || ',%';

This is a really nice case of lateral unnesting with ordinality, which you should definitely add to your PostgreSQL tool chain.
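If you haven’t seen WITH ORDINALITY before, here’s a tiny standalone example, independent of the query above:

SELECT *
FROM unnest(ARRAY['a', 'b', 'c']) WITH ORDINALITY AS t(elem, ord);

elem | ord
-----+----
a    |   1
b    |   2
c    |   3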

SQL Server

Now, SQL Server doesn’t have a nice STRING_AGG function (yet), but we can work around this using STUFF and XML to get the same query.

Of course, there are other solutions using recursive SQL, but I’m too lazy to translate the simple string pattern-matching approach to something recursive.

WITH 
  i AS (
    SELECT 
	  s.name AS schema_name,
      t.name AS table_name,
      i.name AS index_name,
      c.name AS column_name,
      ic.index_column_id
    FROM sys.indexes i 
    JOIN sys.index_columns ic 
      ON i.object_id = ic.object_id 
      AND i.index_id = ic.index_id 
    JOIN sys.columns c 
      ON ic.object_id = c.object_id 
      AND ic.column_id = c.column_id 
    JOIN sys.tables t 
      ON i.object_id = t.object_id 
	JOIN sys.schemas s
	  ON t.schema_id = s.schema_id
  ),
  indexes AS (
    SELECT
	  schema_name,
      table_name,
      index_name,
      STUFF((
        SELECT ',' + j.column_name 
        FROM i j
        WHERE i.table_name = j.table_name 
        AND i.index_name = j.index_name 
        FOR XML PATH('') -- Yay, XML in SQL!
      ), 1, 1, '') columns
    FROM i
    GROUP BY schema_name, table_name, index_name
  )
SELECT
  i.schema_name,
  i.table_name,
  i.index_name AS "Deletion candidate index",
  i.columns AS "Deletion candidate columns",
  j.index_name AS "Existing index",
  j.columns AS "Existing columns"
FROM indexes i
JOIN indexes j
  ON i.schema_name = j.schema_name
  AND i.table_name = j.table_name
  AND j.columns LIKE i.columns + ',%';

Conclusion

Now go run the above query on your production database and… Very carefully and reasonably think about whether you really want to drop those indexes 😉

How to Execute a SQL Query Only if Another SQL Query has no Results

I stumbled upon an interesting question on Stack Overflow recently. A user wanted to query a table for a given predicate. If that predicate returns no rows, they wanted to run another query using a different predicate. Preferably in a single query.

Challenge accepted!

Canonical Idea: Use a Common Table Expression

We’re querying the Sakila database and we’re trying to find films of length 120 minutes. If there are no such films, then let’s find films of length 130 minutes. The following query is formally correct and runs without any adaptations on all of Oracle, PostgreSQL and SQL Server (and probably on other DBs too, as it’s pretty standard):

WITH r AS (
  SELECT * FROM film WHERE length = 120
)
SELECT * FROM r
UNION ALL
SELECT * FROM film
WHERE length = 130
AND NOT EXISTS (
  SELECT * FROM r
)

How does it work?

The common table expression (WITH clause) wraps the first query that we want to execute no matter what. We then select from the first query, and use UNION ALL to combine the result with the result of the second query, which we’re executing only if the first query didn’t yield any results (through NOT EXISTS). We’re hoping here that the database will be smart enough to run the existence check on a pre-calculated set from the first subquery, in order to be able to avoid running the second subquery.

Let’s see, which database actually does this.

PostgreSQL

Running EXPLAIN ANALYZE

EXPLAIN ANALYZE
WITH r AS (
  SELECT * FROM film WHERE length = 120
)
SELECT * FROM r
UNION ALL
SELECT * FROM film
WHERE length = 130
AND NOT EXISTS (
  SELECT * FROM r
)

… we can see the following plan:

Append  (cost=68.50..137.26 rows=15 width=561) (actual time=0.052..0.300 rows=9 loops=1)
  CTE r
    ->  Seq Scan on film film_1  (cost=0.00..68.50 rows=9 width=394) (actual time=0.047..0.289 rows=9 loops=1)
          Filter: (length = 120)
          Rows Removed by Filter: 991
  ->  CTE Scan on r  (cost=0.00..0.18 rows=9 width=672) (actual time=0.051..0.297 rows=9 loops=1)
  ->  Result  (cost=0.02..68.52 rows=6 width=394) (actual time=0.002..0.002 rows=0 loops=1)
        One-Time Filter: (NOT $1)
        InitPlan 2 (returns $1)
          ->  CTE Scan on r r_1  (cost=0.00..0.18 rows=9 width=0) (actual time=0.000..0.000 rows=1 loops=1)
        ->  Seq Scan on film  (cost=0.00..68.50 rows=6 width=394) (never executed)
              Filter: (length = 130)
Planning time: 0.952 ms
Execution time: 0.391 ms

So, indeed, the database seems to be smart enough to avoid the second query, because the first one does yield 9 rows.

Can we see this in a benchmark as well? In principle, the complete query should take about as much time in a benchmark as the Common Table Expression alone. Here’s the benchmark logic:

DO $$
DECLARE
  v_ts TIMESTAMP;
  v_repeat CONSTANT INT := 2000;
  rec RECORD;
BEGIN

  -- Repeat benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT * FROM film WHERE length = 120
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 1: %', r, 
      (clock_timestamp() - v_ts); 
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        WITH r AS (
          SELECT * FROM film WHERE length = 120
        )
        SELECT * FROM r
        UNION ALL
        SELECT * FROM film
        WHERE length = 130
        AND NOT EXISTS (
          SELECT * FROM r
        )
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 2: %', r, 
      (clock_timestamp() - v_ts); 
    RAISE INFO '';
  END LOOP;
END$$;

The result is:

INFO:  Run 1, Statement 1: 00:00:00.310325
INFO:  Run 1, Statement 2: 00:00:00.427744

INFO:  Run 2, Statement 1: 00:00:00.303202
INFO:  Run 2, Statement 2: 00:00:00.33568

INFO:  Run 3, Statement 1: 00:00:00.323699
INFO:  Run 3, Statement 2: 00:00:00.339835

INFO:  Run 4, Statement 1: 00:00:00.301084
INFO:  Run 4, Statement 2: 00:00:00.343838

INFO:  Run 5, Statement 1: 00:00:00.356343
INFO:  Run 5, Statement 2: 00:00:00.359891

As you can see, the second statement is consistently slower by around 5% – 10%. So we can safely say, the second subquery looking for length = 130 is not executed, but there’s still some overhead compared to making a decision in a client application to avoid that second subquery entirely. My guess here is that this is due to PostgreSQL’s Common Table Expression (CTE) being “optimisation fences”, i.e. the CTE is materialised every time. See also:
https://blog.2ndquadrant.com/postgresql-ctes-are-optimization-fences/

What about the inverse case?

In the above benchmark, we’ve measured how much time it takes when the first query succeeds (and the second query should be avoided). What about the inverse case, where the first query doesn’t match any rows and we have to run another query?

Benchmark time!

DO $$
DECLARE
  v_ts TIMESTAMP;
  v_repeat CONSTANT INT := 2000;
  rec RECORD;
BEGIN

  -- Repeat benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT * FROM film WHERE length = 1200
      ) LOOP
        NULL;
      END LOOP;
      FOR rec IN (
        SELECT * FROM film WHERE length = 130
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 1: %', r, 
      (clock_timestamp() - v_ts); 
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        WITH r AS (
          SELECT * FROM film WHERE length = 1200
        )
        SELECT * FROM r
        UNION ALL
        SELECT * FROM film
        WHERE length = 130
        AND NOT EXISTS (
          SELECT * FROM r
        )
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 2: %', r, 
      (clock_timestamp() - v_ts); 
    RAISE INFO '';
  END LOOP;
END$$;

The result is roughly the same:

INFO:  Run 1, Statement 1: 00:00:00.680222
INFO:  Run 1, Statement 2: 00:00:00.696036

INFO:  Run 2, Statement 1: 00:00:00.673141
INFO:  Run 2, Statement 2: 00:00:00.709034

INFO:  Run 3, Statement 1: 00:00:00.626873
INFO:  Run 3, Statement 2: 00:00:00.679469

INFO:  Run 4, Statement 1: 00:00:00.619584
INFO:  Run 4, Statement 2: 00:00:00.639092

INFO:  Run 5, Statement 1: 00:00:00.616275
INFO:  Run 5, Statement 2: 00:00:00.675317

A slight overhead in the single query case.

But what’s this? We didn’t even have an index on the LENGTH column. Let’s add one!
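For example (the index name is arbitrary; any single-column index on LENGTH will do):

CREATE INDEX idx_film_length ON film (length);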

Now, the result is very different. Query 1 succeeds:

INFO:  Run 1, Statement 1: 00:00:00.055835
INFO:  Run 1, Statement 2: 00:00:00.093982

INFO:  Run 2, Statement 1: 00:00:00.038817
INFO:  Run 2, Statement 2: 00:00:00.084092

INFO:  Run 3, Statement 1: 00:00:00.041911
INFO:  Run 3, Statement 2: 00:00:00.078062

INFO:  Run 4, Statement 1: 00:00:00.039367
INFO:  Run 4, Statement 2: 00:00:00.081752

INFO:  Run 5, Statement 1: 00:00:00.039983
INFO:  Run 5, Statement 2: 00:00:00.081227

Query 1 fails:

INFO:  Run 1, Statement 1: 00:00:00.075469
INFO:  Run 1, Statement 2: 00:00:00.081766

INFO:  Run 2, Statement 1: 00:00:00.058276
INFO:  Run 2, Statement 2: 00:00:00.079613

INFO:  Run 3, Statement 1: 00:00:00.060492
INFO:  Run 3, Statement 2: 00:00:00.080672

INFO:  Run 4, Statement 1: 00:00:00.05877
INFO:  Run 4, Statement 2: 00:00:00.07936

INFO:  Run 5, Statement 1: 00:00:00.057584
INFO:  Run 5, Statement 2: 00:00:00.085798

Oracle

In Oracle, I couldn’t find any difference in execution speed (see below). The plan of a combined query also contains an element that prevents the execution of the second subquery. In this case, I’m using the /*+GATHER_PLAN_STATISTICS*/ hint to make sure we get actual execution values / times in our execution plan:

WITH r AS (
  SELECT * FROM film WHERE length = 120
)
SELECT /*+GATHER_PLAN_STATISTICS*/ * FROM r
UNION ALL
SELECT * FROM film
WHERE length = 130
AND NOT EXISTS (
  SELECT * FROM r
);

SELECT p.*
FROM (
  SELECT *
  FROM v$sql
  WHERE upper(sql_text) LIKE '%LENGTH = 120%'
  ORDER BY last_active_time DESC
  FETCH NEXT 1 ROW ONLY
) s 
CROSS APPLY TABLE(dbms_xplan.display_cursor(
  sql_id => s.sql_id, 
  format => 'ALLSTATS LAST'
)) p;
---------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |
---------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      9 |
|   1 |  UNION-ALL          |      |      1 |        |      9 |
|*  2 |   TABLE ACCESS FULL | FILM |      1 |      7 |      9 |
|*  3 |   FILTER            |      |      1 |        |      0 |
|*  4 |    TABLE ACCESS FULL| FILM |      0 |      7 |      0 |
|*  5 |    TABLE ACCESS FULL| FILM |      1 |      2 |      1 |
---------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("LENGTH"=120)
   3 - filter( IS NULL)
   4 - filter("LENGTH"=130)
   5 - filter("LENGTH"=120)

While the estimates are off just as in PostgreSQL (an error that can propagate, see conclusion), the actual row count for the second subquery is zero, and the second subquery is run zero times (see “Starts”), because we don’t have to access it at all. Excellent. Exactly what we expected!

Here, I’ve finally created a benchmark that anonymises the results properly by normalising them, in order to comply with Oracle’s prohibition on publishing benchmark results. The fastest execution time is simply 1, and the other execution times are multiples of that value:

SET SERVEROUTPUT ON
CREATE TABLE results (
  run     NUMBER(2),
  stmt    NUMBER(2),
  elapsed NUMBER
);

DECLARE
  v_ts TIMESTAMP WITH TIME ZONE;
  v_repeat CONSTANT NUMBER := 2000;
BEGIN

  -- Repeat benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := SYSTIMESTAMP;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT * FROM film WHERE length = 120
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
  
    INSERT INTO results VALUES (r, 1, 
      SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    v_ts := SYSTIMESTAMP;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        WITH r AS (
          SELECT * FROM film WHERE length = 120
        )
        SELECT * FROM r
        UNION ALL
        SELECT * FROM film
        WHERE length = 130
        AND NOT EXISTS (
          SELECT * FROM r
        )
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
      
    INSERT INTO results VALUES (r, 2, 
      SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
  END LOOP;
  
  FOR rec IN (
    SELECT 
      run, stmt, 
      CAST(elapsed / MIN(elapsed) OVER() AS NUMBER(5, 4)) ratio 
    FROM results
  )
  LOOP
    dbms_output.put_line('Run ' || rec.run || 
      ', Statement ' || rec.stmt || 
      ' : ' || rec.ratio);
  END LOOP;
END;
/

DROP TABLE results;

The result being (query 1 succeeds, no index):

Run 1, Statement 1 : 1
Run 1, Statement 2 : 1.26901

Run 2, Statement 1 : 1.10218
Run 2, Statement 2 : 1.08792

Run 3, Statement 1 : 1.26038
Run 3, Statement 2 : 1.09426

Run 4, Statement 1 : 1.2245
Run 4, Statement 2 : 1.10829

Run 5, Statement 1 : 1.07164
Run 5, Statement 2 : 1.18562

Or in the inverse case (query 1 fails, no index):

Run 1, Statement 1 : 1
Run 1, Statement 2 : 1.17871

Run 2, Statement 1 : 1.07377
Run 2, Statement 2 : 1.12489

Run 3, Statement 1 : 1.05745
Run 3, Statement 2 : 1.13711

Run 4, Statement 1 : 1.11118
Run 4, Statement 2 : 1.23508

Run 5, Statement 1 : 1.08535
Run 5, Statement 2 : 1.11271

Adding an index doesn’t change much (query 1 succeeds):

Run 1, Statement 1 : 1.20699
Run 1, Statement 2 : 1.28221

Run 2, Statement 1 : 1
Run 2, Statement 2 : 1.21174

Run 3, Statement 1 : 1.0054
Run 3, Statement 2 : 1.2643

Run 4, Statement 1 : 1.0491
Run 4, Statement 2 : 1.31103

Run 5, Statement 1 : 1.02547
Run 5, Statement 2 : 1.23192

Yet, when query 1 fails:

Run 1, Statement 1 : 1.56287
Run 1, Statement 2 : 1.09471

Run 2, Statement 1 : 1.22219
Run 2, Statement 2 : 1.11227

Run 3, Statement 1 : 1.19739
Run 3, Statement 2 : 1.03929

Run 4, Statement 1 : 1.13503
Run 4, Statement 2 : 1

Run 5, Statement 1 : 1.14289
Run 5, Statement 2 : 1.01919

This time, the combined query is a bit faster!

As can be seen, both queries are executed in roughly the same time on Oracle 12c, although again the combined single query seems to be a little bit slower, but not always. Which is an important reminder to do benchmarking properly! Meaning:

  • Repeat benchmarks several times
  • Beware of warmup penalties (the first run is often the slowest)
  • Beware of excessive caching effects in benchmarks
  • Don’t trust performance differences that aren’t significant
  • Don’t compile any Scala code or chat on Slack while benchmarking. Your system should otherwise be idle
  • Remember to benchmark the right data set. We only have 1000 films in this table. What would happen with 60 million films?

SQL Server

Same exercise again:

DECLARE @ts DATETIME;
DECLARE @repeat INT = 2000;
DECLARE @r INT;
DECLARE @i INT;
DECLARE @dummy VARCHAR;

DECLARE @s1 CURSOR;
DECLARE @s2 CURSOR;

DECLARE @results TABLE (
  run     INT,
  stmt    INT,
  elapsed DECIMAL
);

SET @r = 0;
WHILE @r < 5
BEGIN
  SET @r = @r + 1

  SET @s1 = CURSOR FOR 
    SELECT title FROM film WHERE length = 120;

  SET @s2 = CURSOR FOR 
    WITH r AS (
      SELECT * FROM film WHERE length = 120
    )
    SELECT title FROM r
    UNION ALL
    SELECT title FROM film
    WHERE length = 130
    AND NOT EXISTS (
      SELECT * FROM r
    );

  SET @ts = current_timestamp;
  SET @i = 0;
  WHILE @i < @repeat
  BEGIN
    SET @i = @i + 1

    OPEN @s1;
    FETCH NEXT FROM @s1 INTO @dummy;
    WHILE @@FETCH_STATUS = 0
    BEGIN
      FETCH NEXT FROM @s1 INTO @dummy;
    END;

    CLOSE @s1;
  END;

  DEALLOCATE @s1;
  INSERT INTO @results VALUES (@r, 1, DATEDIFF(ms, @ts, current_timestamp));

  SET @ts = current_timestamp;
  SET @i = 0;
  WHILE @i < @repeat
  BEGIN
    SET @i = @i + 1

    OPEN @s2;
    FETCH NEXT FROM @s2 INTO @dummy;
    WHILE @@FETCH_STATUS = 0
    BEGIN
      FETCH NEXT FROM @s2 INTO @dummy;
    END;

    CLOSE @s2;
  END;

  DEALLOCATE @s2;
  INSERT INTO @results VALUES (@r, 2, DATEDIFF(ms, @ts, current_timestamp));
END;

SELECT 'Run ' + CAST(run AS VARCHAR) + 
  ', Statement ' + CAST(stmt AS VARCHAR) + 
  ': ' + CAST(CAST(elapsed / MIN(elapsed) OVER() AS DECIMAL(10, 5)) AS VARCHAR)
FROM @results;

The result, this time, is more drastic (no index, query 1 succeeds):

Run 1, Statement 1: 1.07292
Run 1, Statement 2: 1.35000

Run 2, Statement 1: 1.07604
Run 2, Statement 2: 1.40625

Run 3, Statement 1: 1.08333
Run 3, Statement 2: 1.40208

Run 4, Statement 1: 1.09375
Run 4, Statement 2: 1.34375

Run 5, Statement 1: 1.00000
Run 5, Statement 2: 1.46458

There is a 30% – 40% overhead for the CTE solution over the two query solution. If we don’t find any rows in the first query (no index):

Run 1, Statement 1: 1.08256
Run 1, Statement 2: 1.27546

Run 2, Statement 1: 1.16512
Run 2, Statement 2: 1.27778

Run 3, Statement 1: 1.00000
Run 3, Statement 2: 1.26235

Run 4, Statement 1: 1.04167
Run 4, Statement 2: 1.26003

Run 5, Statement 1: 1.05401
Run 5, Statement 2: 1.34259

… then the difference is slightly less drastic, but still clear. The reason here is that SQL Server doesn’t avoid the unnecessary subquery.

Too bad! (Note: I was using SQL Server 2014. Perhaps SQL Server 2016 implements this optimisation.)

Note, you can trust me that adding an index doesn’t change much in this case.

Conclusion

We’ve seen that we can easily solve the original problem with SQL only: Select some data from a table using predicate A, and if we don’t find any data for predicate A, then try finding data using predicate B from the same table.

Oracle and PostgreSQL can both optimise away the unnecessary query 2 by inserting a “probe” in their execution plans that knows whether the query 2 needs to be executed or not. In Oracle, we’ve even seen a situation where the combined query outperforms two individual queries. SQL Server 2014 surprisingly does not have such an optimisation.

While the performance impact was negligible in all benchmarks (even in SQL Server), we should be careful with these kinds of queries and not entirely rely on the optimiser to “get it right”. In all three databases, the cardinality estimates were off. We’re working with small data sets, but if data sets grow larger, and queries like the above are embedded in more complex queries, then the wrong cardinality estimates can easily produce wrong execution plans (e.g. favouring hash join over nested loop joins because of a high number of estimated rows). An example of this was given in a previous blog post.

Nevertheless, we can get quite far with SQL without resorting to procedural client languages. And if I had conducted my benchmark with a JDBC client instead of procedural blocks directly inside of the database, perhaps the single query would have outperformed the double query case – at least in those cases where query 1 yielded no rows and query 2 had to be executed from a remote client. Probably in Oracle.

Ultimately, I can only repeat myself. Measure! Measure! Measure! There’s no point in guessing. Truth can only be found by measuring actual executions.

When to Use Bind Values, and When to Use Inline Values in SQL

Users of jOOQ, PL/SQL, T-SQL are spoiled as they hardly ever need to worry about bind values. Consider the following statements:

Using jOOQ

public int countActors(String firstName, String lastName) {
    return ctx.selectCount()
              .from(ACTOR)
              .where(ACTOR.FIRST_NAME.eq(firstName))
              .and(ACTOR.LAST_NAME.eq(lastName))
              .fetchOneInto(int.class);
}

The method parameters firstName and lastName will be automatically mapped to bind values in the generated SQL statement. Here’s the debug log output when running the above, where the first statement is sent to the JDBC driver and then to the database, whereas the second statement is generated for debugging purposes only:

Executing query          : 
    select count(*)
    from "SAKILA"."ACTOR"
    where (
      "SAKILA"."ACTOR"."FIRST_NAME" = ?
      and "SAKILA"."ACTOR"."LAST_NAME" = ?
    )
-> with bind values      : 
    select count(*)
    from "SAKILA"."ACTOR"
    where (
      "SAKILA"."ACTOR"."FIRST_NAME" = 'SUSAN'
      and "SAKILA"."ACTOR"."LAST_NAME" = 'DAVIS'
    )
Fetched result           : +-----+
                         : |count|
                         : +-----+
                         : |    2|
                         : +-----+

With this API design, users don’t have to worry about binding variables explicitly at all, nor about remembering bind variable indexes or the data type of a bind value. This works because the overloaded Field<T>.eq(T) method (as well as pretty much every other method that works in a similar way) internally delegates to the more generic Field<T>.eq(Field<T>) method by wrapping the argument value in an explicit bind variable expression DSL.val(T) in the jOOQ SQL expression tree.

Using PL/SQL

The same is true when you’re using PL/SQL (or any other stored procedure language of another database), for instance:

SET SERVEROUTPUT ON
DECLARE
  
  FUNCTION count_actors (
    p_first_name VARCHAR2, 
    p_last_name VARCHAR2
  ) RETURN NUMBER IS 
    v_result NUMBER(10);
  BEGIN
    SELECT count(*)
    INTO v_result
    FROM actor
    WHERE first_name = p_first_name
    AND last_name = p_last_name;
    
    RETURN v_result;
  END count_actors;
  
BEGIN
  dbms_output.put_line(count_actors('SUSAN', 'DAVIS'));
END;
/

To be sure what happened, let’s consider the execution plan:

SELECT p.*
FROM (
  SELECT *
  FROM v$sql
  WHERE upper(sql_text) LIKE 'SELECT COUNT(*) FROM ACTOR%'
  ORDER BY last_active_time DESC
  FETCH NEXT 1 ROW ONLY
) s 
CROSS APPLY TABLE(dbms_xplan.display_cursor(sql_id => s.sql_id)) p;

As you can see below, bind variables were generated for the SQL query that was embedded in the PL/SQL function:

SQL_ID  9dgammbskx5tx, child number 0
-------------------------------------
SELECT COUNT(*) FROM ACTOR WHERE FIRST_NAME = :B2 AND LAST_NAME = :B1
 
Plan hash value: 3384208144
 
----------------------------------------------------
| Id  | Operation         | Name           | Rows  |
----------------------------------------------------
|   0 | SELECT STATEMENT  |                |       |
|   1 |  SORT AGGREGATE   |                |     1 |
|*  2 |   INDEX RANGE SCAN| IDX_ACTOR_NAME |     1 |
----------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("LAST_NAME"=:B1 AND "FIRST_NAME"=:B2)

Why is it important? 1: SQL injection

I’ve noticed this time and again when talking to Java developers: Many developers are aware of the risk of SQL injection when not using bind variables. This can happen when we would write dynamic SQL like this, e.g. using JDBC or jOOQ’s plain SQL API:

String sql = 
  "SELECT count(*) "
+ "FROM actor "
+ "WHERE 1 = 1 "
+ (firstName != null ? "AND first_name = '" + firstName + "' " : "")
+ (lastName != null ? "AND last_name = '" + lastName + "' " : "");

// JDBC
try (Statement s = connection.createStatement();
     ResultSet rs = s.executeQuery(sql)) {

    ...
}

// jOOQ
Result<?> result = ctx.fetch(sql);

Seriously. Don’t do this. Ever! Always use bind values for user input. There’s hardly any reason at all why you should inline the values. I mean, of course, you could if you always remember to manually escape all strings, e.g.:

public static String escape(String string) {
    // TODO: Handle MySQL's non-standard backslash escaping, too
    return string == null ? null : string.replace("'", "''");
}

And then:

String sql = 
  "SELECT count(*) "
+ "FROM actor "
+ "WHERE 1 = 1 "
+ (firstName != null ? "AND first_name = '" + escape(firstName) + "' " : "")
+ (lastName != null ? "AND last_name = '" + escape(lastName) + "' " : "");

Notice that there’s still a vulnerability risk in MySQL, which doesn’t necessarily conform to standard SQL string literal escaping. A very unfortunate MySQL “feature”, which is handled correctly by jOOQ:
https://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sqlmode_no_backslash_escapes

But why run the risk? It’s so much easier with bind values.

Why is it important? 2: Performance!

So, most Java developers, luckily, are aware of SQL injection vulnerabilities and mostly get this right. But there’s another thing that few Java developers are aware of, unfortunately. And that’s the performance aspect of using bind values. Let’s assume we’re not using bind values for the above query. Check this out:

SELECT count(*)
FROM actor
WHERE first_name = 'SUSAN'
AND last_name = 'DAVIS';

SELECT count(*)
FROM actor
WHERE first_name = 'NICK'
AND last_name = 'WAHLBERG';

And now, let’s find execution plans:

SELECT p.*
FROM (
  SELECT *
  FROM v$sql
  WHERE upper(sql_text) LIKE 'SELECT COUNT(*) FROM ACTOR%'
  ORDER BY last_active_time DESC
  FETCH NEXT 2 ROWS ONLY
) s 
CROSS APPLY TABLE(dbms_xplan.display_cursor(sql_id => s.sql_id)) p;

Result:

SQL_ID  12r8afykqkzbd, child number 0
-------------------------------------
SELECT count(*) FROM actor WHERE first_name = 'NICK' AND last_name = 'WAHLBERG'
 
Plan hash value: 3384208144
 
----------------------------------------------------
| Id  | Operation         | Name           | Rows  |
----------------------------------------------------
|   0 | SELECT STATEMENT  |                |       |
|   1 |  SORT AGGREGATE   |                |     1 |
|*  2 |   INDEX RANGE SCAN| IDX_ACTOR_NAME |     1 |
----------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("LAST_NAME"='WAHLBERG' AND "FIRST_NAME"='NICK')
 
SQL_ID  gfppqr9gpjqws, child number 0
-------------------------------------
SELECT count(*) FROM actor WHERE first_name = 'SUSAN' AND last_name = 'DAVIS'
 
Plan hash value: 3384208144
 
----------------------------------------------------
| Id  | Operation         | Name           | Rows  |
----------------------------------------------------
|   0 | SELECT STATEMENT  |                |       |
|   1 |  SORT AGGREGATE   |                |     1 |
|*  2 |   INDEX RANGE SCAN| IDX_ACTOR_NAME |     1 |
----------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("LAST_NAME"='DAVIS' AND "FIRST_NAME"='SUSAN')

As you can see, the two almost identical queries produced two times the exact same execution plan (same plan hash value), but they are stored under two distinct SQL_ID values. Yes, they’re distinct SQL statements!

Why?

Most sophisticated databases (including Oracle, SQL Server, DB2 and others) implement an execution plan cache. The reason for this is simple. Calculating an execution plan with cost based optimisation is expensive. Not for these trivial statements, but imagine you have dozens of JOINs, semi joins, unions, and what not. There are thousands of candidate execution plans, and the database needs to find the best one for you. That can be tons of work and we don’t want to let the database do that work every time.

So, the database will run a soft-parse (Oracle speak) to identify a SQL query and translate its SQL string to a SQL_ID. It will then check if there’s already a suitable plan available for that particular SQL_ID, and if so, it will avoid the so-called hard-parse (Oracle speak) to calculate a new plan.
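We can observe this in the cursor cache ourselves, e.g. with a query along the lines of the following sketch against the v$sql view used earlier in this post. With bind variables, repeated executions show up as a single SQL_ID whose execution count increases, rather than as ever new SQL_ID entries:

SELECT sql_id, executions, parse_calls, sql_text
FROM v$sql
WHERE upper(sql_text) LIKE 'SELECT COUNT(*) FROM ACTOR%';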

Let me repeat this one more time: Making best use of this plan cache (Oracle speak: Cursor cache) is extremely important. You can severely overload a system up to the point of bringing it down, if you’re not using bind variables.

There’s a workaround to use CURSOR_SHARING=FORCE, which will transform inline values to bind values, but I’m not even going to explain how it works, because most Oracle experts advise you not to use that feature due to the significant drawbacks it will bring.

What if I want to inline values?

As we’ve seen above, when using jOOQ or PL/SQL, the above problems are really non-discussions, because it is quite unlikely that you run into a situation of accidentally inlining your bind values:

  • In jOOQ, you’d have to use plain SQL
  • In PL/SQL, you’d have to use dynamic SQL using DBMS_SQL

Anyway, in rare cases, users want to inline their bind variables for them to appear directly in the resulting SQL statement. This is never the case for user input, for ID values, or for ordinary search values. But it can be the case when you query for discriminators or “enum” values (enforced through a CHECK constraint, for instance), or when you run a report only once a year (plan is never available from the cache).

In these cases, it can be of advantage to not use bind values specifically to prevent the database from re-using a cached execution plan, because you know that the cached plan, which works well for 90% of the queries, won’t work well for this particular bind value (or in the case of the once-per-year report, you might get a slightly better plan by giving the database more information).

An example, let’s insert 1×1 and 99999×0 into a table:

CREATE TABLE skewed (
  v NUMBER(1)
);

INSERT INTO skewed
SELECT decode(level, 1, 1, 0)
FROM dual
CONNECT BY level <= 100000;

CREATE INDEX skewed_i ON skewed(v);

EXEC dbms_stats.gather_table_stats('TEST', 'SKEWED');

Now, clearly we see that when querying the table SKEWED for values V = 0, the index is useless, whereas it is very useful for values V = 1. Let’s run this statement here:

SET SERVEROUTPUT ON
DECLARE
  v_bind NUMBER(1);
  v_result NUMBER(10);
BEGIN
  v_bind := 0;
  
  SELECT count(*)
  INTO v_result
  FROM skewed
  WHERE v = v_bind;
  
  FOR rec IN (
    SELECT * FROM TABLE(dbms_xplan.display_cursor)
  ) LOOP
    dbms_output.put_line(rec.plan_table_output);
  END LOOP;

  v_bind := 1;
  
  SELECT count(*)
  INTO v_result
  FROM skewed
  WHERE v = v_bind;
  
  FOR rec IN (
    SELECT * FROM TABLE(dbms_xplan.display_cursor)
  ) LOOP
    dbms_output.put_line(rec.plan_table_output);
  END LOOP;
  
  SELECT count(*)
  INTO v_result
  FROM skewed
  WHERE v = 1;
  
  FOR rec IN (
    SELECT * FROM TABLE(dbms_xplan.display_cursor)
  ) LOOP
    dbms_output.put_line(rec.plan_table_output);
  END LOOP;
END;

The above block runs the exact same statement three times:

  1. With a bind variable of 0 (can’t really use the index)
  2. With a bind variable of 1 (should be using the index)
  3. With an inline value of 1 (should be using the index)

Here’s the result:

SQL_ID  1q0qjm8za06w3, child number 0
-------------------------------------
SELECT COUNT(*) FROM SKEWED WHERE V = :B1
 
Plan hash value: 4055318479
 
--------------------------------------------------
| Id  | Operation             | Name     | Rows  |
--------------------------------------------------
|   0 | SELECT STATEMENT      |          |       |
|   1 |  SORT AGGREGATE       |          |     1 |
|*  2 |   INDEX FAST FULL SCAN| SKEWED_I | 99999 |
--------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("V"=:B1)
 
SQL_ID  1q0qjm8za06w3, child number 0
-------------------------------------
SELECT COUNT(*) FROM SKEWED WHERE V = :B1
 
Plan hash value: 4055318479
 
--------------------------------------------------
| Id  | Operation             | Name     | Rows  |
--------------------------------------------------
|   0 | SELECT STATEMENT      |          |       |
|   1 |  SORT AGGREGATE       |          |     1 |
|*  2 |   INDEX FAST FULL SCAN| SKEWED_I | 99999 |
--------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("V"=:B1)
 
SQL_ID  bdpjxzqpg2416, child number 0
-------------------------------------
SELECT COUNT(*) FROM SKEWED WHERE V = 1
 
Plan hash value: 276090370
 
----------------------------------------------
| Id  | Operation         | Name     | Rows  |
----------------------------------------------
|   0 | SELECT STATEMENT  |          |       |
|   1 |  SORT AGGREGATE   |          |     1 |
|*  2 |   INDEX RANGE SCAN| SKEWED_I |     1 |
----------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("V"=1)

Observe the estimated row counts in the plans above (the Rows column):

  1. In the first case, the estimate is correct. With V = 0, we’ll get 99999 rows, so we might just as well scan the entire index to calculate the count value
  2. In the second case, we’ve inherited the cached execution plan from the first run, including the estimate of 99999 rows, which is clearly wrong in this case. The database should’ve estimated 1 row here
  3. In the third case, the estimate is again correct (note the different SQL_ID), and we get an optimal plan

We’re just out of luck. If we had reversed the order of the queries, we would have gotten the right estimate for V = 1, but a wrong estimate for V = 0.
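
By the way, if you want to repeat the experiment with the queries in the opposite order, you first have to get rid of the cached plan, for instance (on a test system only, never in production!) by flushing the shared pool:

-- Careful: this throws away all cached plans, not just ours.
-- Only do this on a test system!
ALTER SYSTEM FLUSH SHARED_POOL;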

Adaptive Cursor Sharing

Oracle has some features to remedy the above problems. Oracle 11g introduced adaptive cursor sharing, which means that if we re-execute the above statements, the database will have already figured out that in this particular case, the plans should depend on the actual bind variable, because in hindsight, the shared plan was wrong for V = 1.

On a second execution of the previous block, we’ll see:

SQL_ID  1q0qjm8za06w3, child number 1
-------------------------------------
SELECT COUNT(*) FROM SKEWED WHERE V = :B1
 
Plan hash value: 4055318479
 
--------------------------------------------------
| Id  | Operation             | Name     | Rows  |
--------------------------------------------------
|   0 | SELECT STATEMENT      |          |       |
|   1 |  SORT AGGREGATE       |          |     1 |
|*  2 |   INDEX FAST FULL SCAN| SKEWED_I | 99999 |
--------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("V"=:B1)
 
SQL_ID  1q0qjm8za06w3, child number 2
-------------------------------------
SELECT COUNT(*) FROM SKEWED WHERE V = :B1
 
Plan hash value: 276090370
 
----------------------------------------------
| Id  | Operation         | Name     | Rows  |
----------------------------------------------
|   0 | SELECT STATEMENT  |          |       |
|   1 |  SORT AGGREGATE   |          |     1 |
|*  2 |   INDEX RANGE SCAN| SKEWED_I |     1 |
----------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("V"=:B1)
 
SQL_ID  bdpjxzqpg2416, child number 0
-------------------------------------
SELECT COUNT(*) FROM SKEWED WHERE V = 1
 
Plan hash value: 276090370
 
----------------------------------------------
| Id  | Operation         | Name     | Rows  |
----------------------------------------------
|   0 | SELECT STATEMENT  |          |       |
|   1 |  SORT AGGREGATE   |          |     1 |
|*  2 |   INDEX RANGE SCAN| SKEWED_I |     1 |
----------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("V"=1)

As you can see above, both bound executions now have new “child number” values. The first execution (V = 0) still gets the fast full scan, but the second execution (V = 1) now gets the better range scan plan. In other words, for any given SQL_ID, there can now be several candidate plan hash values, and the one that is chosen depends on the actual bind variable input.
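
If you want to see this for yourself, and assuming you have the privileges to query V$SQL, you can inspect the child cursors and their bind sensitivity flags directly (a sketch, using the SQL_ID from the output above):

-- One row per child cursor of our statement
SELECT child_number,
       plan_hash_value,
       executions,
       is_bind_sensitive,
       is_bind_aware
FROM v$sql
WHERE sql_id = '1q0qjm8za06w3';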

This often works well, but with the following caveats:

  • It only works after the plan was wrong at least once. This translates to at least one poor user who suffers from a slow query
  • It only works if the plans remain in the cache (which isn’t the case for rarely executed queries). Once the plans are purged from the cache, we have to run the query again at least twice for the alternative “child numbers” to appear
  • It stops working well when your query is much more complex, i.e. when there are complicated correlations between the bind variables, each of which should produce a different plan

So, if in doubt, in this particular case, it’s not a bad idea to simply use an inline value / constant literal in the query directly, to help the optimiser make the right choice.

Another interesting article on this topic describes a case where the optimiser will always be fooled by bind variables, and where you should always use inline values: When querying discriminators from views.

Adaptive Execution Plans

Oracle 12c took adaptive cursor sharing one step further and now supports adaptive execution plans: certain execution plans are known to be “shaky” in advance, and the optimiser pre-calculates a fallback plan that applies if the estimates turn out to be wrong for a given execution. In that case, the plan can be changed “in flight”, and an alternative plan, which is better for that particular execution, is used.

This feature is still not too stable in Oracle 12cR1, which is why some Oracle experts generally recommend turning it off.
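
If you want to follow that advice, the behaviour can be controlled via an init parameter. A sketch for 12cR1 (note that later releases replace this parameter with OPTIMIZER_ADAPTIVE_PLANS and OPTIMIZER_ADAPTIVE_STATISTICS, so check your version’s documentation first):

-- Disable the 12cR1 adaptive features for the whole instance...
ALTER SYSTEM SET optimizer_adaptive_features = FALSE;

-- ... or just for the current session, to experiment first
ALTER SESSION SET optimizer_adaptive_features = FALSE;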

Conclusion

You should always use bind values by default. In 99% of all cases, they’re the right choice for two reasons:

  • SQL injection prevention (obvious)
  • Statement caching optimisation (less obvious)

The latter reason is one that is not really well known among developers, because it’s not a problem that appears in development environments. It’s a production-only problem that happens under heavy load. Yet, you should be aware of this problem, and always remember to avoid generating too many distinct SQL strings (see also this related article about IN lists).
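
To give a rough idea of what “too many distinct SQL strings” means: each of the following statements gets its own entry in the cursor cache, even though they express the same logical query (a sketch; the linked article discusses IN lists in much more detail):

-- Each distinct SQL string is parsed and cached separately:
SELECT * FROM actor WHERE actor_id IN (?);
SELECT * FROM actor WHERE actor_id IN (?, ?);
SELECT * FROM actor WHERE actor_id IN (?, ?, ?);

-- A common mitigation is to pad the IN list to a fixed number of markers
-- (repeating the last value), so that far fewer distinct strings are generated:
SELECT * FROM actor WHERE actor_id IN (?, ?, ?, ?);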

In rare cases, it is better to use inline values / literals, as this will help the optimiser make a much better choice in a predictable manner. These cases include:

  • Querying skewed data (unless adaptive features can be expected to kick in)
  • Querying discriminators (in this case, it’s always advisable to use inline values)

Using languages like PL/SQL, T-SQL, PL/pgSQL, or APIs like jOOQ definitely helps you get this right, because you no longer have to think about this problem. You’ll get it right automatically.

Side-note: Hibernate’s Criteria Query

Like jOOQ, Hibernate supports a type-safe DSL for constructing JPQL queries, which to some extent covers basic SQL functionality when querying entities. Hibernate historically chose quite interesting defaults (as of version 5.2.10):

  • String values are always transformed to bind values, regardless of whether you’re using implicit values, explicit parameters, or explicit literals. The goal of this is to prevent SQL injection (because currently, Hibernate doesn’t auto-escape inline string literals)
  • Implicit integer values are always inlined
  • Explicit parameters are always sent as parameters
  • Explicit literals are usually sent as literals (unless they’re strings)

The above defaults are unfortunate, as we’ve seen before, for these reasons:

  • Bind values should always be the default, especially when using ID values, which are integers, and as such, often inlined by the current implementation. Luckily, this has been recognised as a bug and will be fixed soon: https://hibernate.atlassian.net/browse/HHH-9576
  • Literals should be sent to the server as literals, because when users need literals (e.g. in the above edge cases), they don’t want the API to override this behaviour through bind values. This might also be solved soon: https://hibernate.atlassian.net/browse/HHH-11778

I’m currently working with the team to remedy these problems, such that the criteria API won’t inhibit users from a performance perspective.

How to Generate at Least One Row in SQL

There are some situations where you would like to have at least one (empty) row in your result set in SQL.

Imagine the following situation. We’re querying the Sakila database for actors and their films:

SELECT first_name, last_name, title
FROM actor
JOIN film_actor USING (actor_id)
JOIN film USING (film_id)
ORDER BY 1, 2, 3

yielding something like:

+------------+-----------+---------------------+
| FIRST_NAME | LAST_NAME | TITLE               |
+------------+-----------+---------------------+
| ...        | ...       | ...                 |
| ADAM       | GRANT     | SEABISCUIT PUNK     |
| ADAM       | GRANT     | SPLENDOR PATTON     |
| ADAM       | GRANT     | TADPOLE PARK        |
| ADAM       | GRANT     | TWISTED PIRATES     |
| ADAM       | GRANT     | WANDA CHAMBER       |
| ADAM       | HOPPER    | BLINDNESS GUN       |
| ADAM       | HOPPER    | BLOOD ARGONAUTS     |
| ADAM       | HOPPER    | CHAMBER ITALIAN     |
| ...        | ...       | ...                 |
+------------+-----------+---------------------+

Now, let’s find actors called SUSAN, and in fact, let’s not care if they played in any films (I’ve added them to the Sakila database for the sake of the example):

SELECT actor_id, first_name, last_name, title
FROM actor
LEFT JOIN film_actor USING (actor_id)
LEFT JOIN film USING (film_id)
WHERE first_name = 'SUSAN'
ORDER BY 1, 2, 3

Interesting, there are now two actors without any films:

+----------+------------+-----------+-----------------+
| ACTOR_ID | FIRST_NAME | LAST_NAME | TITLE           |
+----------+------------+-----------+-----------------+
| 110      | SUSAN      | DAVIS     | TROJAN TOMORROW |
| 110      | SUSAN      | DAVIS     | WASH HEAVENLY   |
| 110      | SUSAN      | DAVIS     | WORDS HUNTER    |
| 201      | SUSAN      | DAVIS     |                 |
| 202      | SUSAN      | SMITH     |                 |
+----------+------------+-----------+-----------------+

This worked because I changed the JOIN type from INNER JOIN to LEFT JOIN. That’s neat. But what if we hadn’t found any actor called SUSAN? What if we were looking for SUSANNE instead?

SELECT actor_id, first_name, last_name, title
FROM actor
LEFT JOIN film_actor USING (actor_id)
LEFT JOIN film USING (film_id)
WHERE first_name = 'SUSANNE'
ORDER BY 1, 2, 3

Empty. Void. Nothing:

+----------+------------+-----------+-----------------+
| ACTOR_ID | FIRST_NAME | LAST_NAME | TITLE           |
+----------+------------+-----------+-----------------+

That’s fine in most cases, because the way we wrote this query, we’re expecting:

All actors called Susanne and their films, if any

But what if we wanted to have the same behaviour for the actors as we got for the films through LEFT JOIN? I.e. what if we wanted this instead (a result with 1..N cardinality rather than 0..N):

+----------+------------+-----------+-----------------+
| ACTOR_ID | FIRST_NAME | LAST_NAME | TITLE           |
|          |            |           |                 |
+----------+------------+-----------+-----------------+

What would you need this for? Well, sometimes, we can simply not handle the depressing sadness of emptiness.

How to do this? We need another LEFT JOIN prepended to the ACTOR table, but not just to the ACTOR table itself, we need to prepend it to everything. E.g. like this:

SELECT actor_id, first_name, last_name, title

-- This dummy table will always generate exactly one row
FROM (
  SELECT 1 a
) a

-- This is the original query, including the WHERE clause
LEFT JOIN (
  SELECT *
  FROM actor
  LEFT JOIN film_actor USING (actor_id)
  LEFT JOIN film USING (film_id)
  WHERE first_name = 'SUSANNE'
) b ON 1 = 1
ORDER BY 1, 2, 3

The above query is guaranteed to produce at least one row, because the left side of the LEFT JOIN always produces exactly one row, which is joined to every row on the right side, if there is any row on the right side.

Caveats:

  • The WHERE clause (and potentially other clauses, like GROUP BY) must now go inside of the new derived table B. Otherwise, we’d be removing that single row from A again using WHERE. (This is because of the order of SQL operations. We must ensure WHERE “happens-before” LEFT JOIN)
  • The LEFT JOIN between A and B needs an ON clause for syntactic reasons, even if we don’t really need it here. Just put something that is always true (like TRUE in PostgreSQL; see the sketch after this list).
  • Our result now has an additional, useless column A, which might bother us, e.g. when using SELECT *
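
In PostgreSQL, for instance, the dummy predicate can literally be TRUE. A sketch of the same query as above:

SELECT actor_id, first_name, last_name, title
FROM (SELECT 1 AS a) a
LEFT JOIN (
  SELECT *
  FROM actor
  LEFT JOIN film_actor USING (actor_id)
  LEFT JOIN film USING (film_id)
  WHERE first_name = 'SUSANNE'
) b ON TRUE -- no dummy comparison needed
ORDER BY 1, 2, 3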

Alternative: OUTER APPLY

If you’re using SQL Server or Oracle 12c, there’s an even more elegant solution using OUTER APPLY:

SQL Server

SELECT actor_id, first_name, last_name, title
FROM (SELECT 1 a) a
OUTER APPLY (
  SELECT a.actor_id, a.first_name, a.last_name, f.title
  FROM actor a
  LEFT JOIN film_actor fa ON a.actor_id = fa.actor_id
  LEFT JOIN film f ON fa.film_id = f.film_id
  WHERE first_name = 'SUSANNE'
) b
ORDER BY 1, 2, 3

Oracle 12c

SELECT actor_id, first_name, last_name, title
FROM (SELECT 1 a FROM dual) a
OUTER APPLY (
  SELECT *
  FROM actor
  LEFT JOIN film_actor USING (actor_id)
  LEFT JOIN film USING (film_id)
  WHERE first_name = 'SUSANNE'
) b
ORDER BY 1, 2, 3

While we don’t actually need the full power of the APPLY feature here, it allows us to omit the ON clause and still get LEFT OUTER JOIN semantics. Neat, eh?

Geek bonus

And if you really want to geek out on this functionality, consider using the dee table from the dum/dee PostgreSQL example. Remember, the dee table is a table with exactly one row and no columns! This means we can use SELECT * without getting that useless dummy column!
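
If you don’t have such a table yet, it can be created in PostgreSQL like this (a minimal sketch; the dum/dee example covers the details):

-- PostgreSQL allows tables without any columns
CREATE TABLE dee();

-- ... and we insert a single, completely empty row into it
INSERT INTO dee DEFAULT VALUES;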

SELECT *

-- This dummy table will always generate exactly one row
FROM dee

-- This is the original query, including the WHERE clause
LEFT JOIN (
  SELECT *
  FROM actor
  LEFT JOIN film_actor USING (actor_id)
  LEFT JOIN film USING (film_id)
  WHERE first_name = 'SUSANNE'
) b ON 1 = 1
ORDER BY 1, 2, 3

Ahh. Beautiful SQL!