How to Avoid Excessive Sorts in Window Functions

Usually, this blog is 100% pro window functions and advocates using them at every occasion. But like any tool, window functions come at a price, and we must carefully evaluate whether that's a price we're willing to pay. That price can be a sort operation. And as we all know, sort operations are expensive: they follow O(n log n) complexity, which should be avoided at all costs for large data sets.

In a previous post, I’ve described how to calculate a running total with window functions (among other ways). In this post, we’re going to calculate the cumulative revenue at each payment in our Sakila database.

SELECT
  customer_id,
  payment_date,
  amount,
  SUM(amount) OVER (
    PARTITION BY customer_id
    ORDER BY payment_date, payment_id
  ) cumulative_amount
FROM payment
ORDER BY customer_id, payment_date, payment_id;

The above will yield something like this:

customer_id |payment_date        |amount |cumulative_amount 
------------|--------------------|-------|------------------
1           |2005-05-25 11:30:37 |2.99   |2.99              
1           |2005-05-28 10:35:23 |0.99   |3.98              
1           |2005-06-15 00:54:12 |5.99   |9.97              
1           |2005-06-15 18:02:53 |0.99   |10.96             
1           |2005-06-15 21:08:46 |9.99   |20.95             
1           |2005-06-16 15:18:57 |4.99   |25.94             
...

As can be seen, in spreadsheet notation, cumulative_amount[N] = cumulative_amount[N-1] + amount[N].

Reusing this calculation in several queries

As in any other language, we don’t want to repeat ourselves, so the SQL way of doing DRY is to create a view or a table valued function. Let’s create a view, first. Something like this:

CREATE VIEW payment_with_revenue AS
SELECT
  customer_id,
  payment_date,
  amount,
  SUM(amount) OVER (
    PARTITION BY customer_id
    ORDER BY payment_date, payment_id
  ) cumulative_amount
FROM payment

Now, we can do nice things like this:

SELECT 
  customer_id,
  payment_date,
  amount,
  cumulative_amount
FROM payment_with_revenue
WHERE customer_id IN (1, 2, 3)
AND payment_date 
  BETWEEN DATE '2005-05-25'
  AND     DATE '2005-05-29'
ORDER BY customer_id, payment_date

yielding:

customer_id |payment_date        |amount |cumulative_amount 
------------|--------------------|-------|------------------
1           |2005-05-25 11:30:37 |2.99   |2.99              
1           |2005-05-28 10:35:23 |0.99   |3.98              
2           |2005-05-27 00:09:24 |4.99   |4.99              
3           |2005-05-27 17:17:09 |1.99   |1.99              

What about performance?

Now, if we have an index on (CUSTOMER_ID, PAYMENT_DATE), we'd expect to be able to use it, right? Because our predicate seems quite selective, as this query shows:

SELECT 
  count(*),
  count(*) FILTER (
    WHERE customer_id IN (1, 2, 3)
  ),
  count(*) FILTER (
    WHERE customer_id IN (1, 2, 3)
    AND payment_date < DATE '2005-05-29'
  ) 
FROM payment;

yielding:

count |count |count 
------|------|-----
16049 |85    |4     

(To learn more about the cool FILTER clause, read this article here)
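
The composite index referenced above doesn't ship with Sakila out of the box (the plans below also show a pre-existing foreign key index on CUSTOMER_ID only). Assuming the name that appears as IDX_PAYMENT_I1 in the execution plans further down, it could be created with something like this:

CREATE INDEX idx_payment_i1 ON payment (customer_id, payment_date);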

How could we best use the index? Let's look again at our original query, but this time with the view inlined as a derived table (aliased "inlined"):

SELECT 
  customer_id,
  payment_date,
  amount,
  cumulative_amount
FROM (
  SELECT
    customer_id,
    payment_date,
    amount,
    SUM(amount) OVER (
      PARTITION BY customer_id
      ORDER BY payment_date, payment_id
    ) cumulative_amount
  FROM payment
) inlined
WHERE customer_id IN (1, 2, 3)
AND payment_date 
  BETWEEN DATE '2005-05-25'
  AND     DATE '2005-05-29'
ORDER BY customer_id, payment_date;

We should be able to apply two transformations that would allow the index to be used:

CUSTOMER_ID IN (1, 2, 3) predicate

The CUSTOMER_ID IN (1, 2, 3) predicate should be pushed down into the view, "past" the window function, because it does not affect the window function calculation, which partitions the data set by CUSTOMER_ID. By pushing it "past" the window function, I mean applying it before the window function is calculated – window functions are evaluated late in the logical order of SELECT clause operations.

This means that our original query should be equivalent to this one:

SELECT 
  customer_id,
  payment_date,
  amount,
  cumulative_amount
FROM (
  SELECT
    customer_id,
    payment_date,
    amount,
    SUM(amount) OVER (
      PARTITION BY customer_id
      ORDER BY payment_date, payment_id
    ) cumulative_amount
  FROM payment
  WHERE customer_id IN (1, 2, 3) -- Pushed down
) inlined
WHERE payment_date 
  BETWEEN DATE '2005-05-25'
  AND     DATE '2005-05-29'
ORDER BY customer_id, payment_date;

The PAYMENT_DATE predicate

The PAYMENT_DATE predicate is a bit trickier. It cannot be pushed "past" the window function completely, because that would alter the semantics of the window function, which calculates the cumulative amount over the frame RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (the default frame, if we do not specify one).
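
Spelled out explicitly, the window specification from the view is therefore equivalent to this (a redundant but clarifying formulation of the same default):

  SUM(amount) OVER (
    PARTITION BY customer_id
    ORDER BY payment_date, payment_id
    RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) cumulative_amount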

But intuitively (and if you want to spend the time: formally as well), we can show that we can at least push the upper bound of our range predicate into the view, like this:

SELECT 
  customer_id,
  payment_date,
  amount,
  cumulative_amount
FROM (
  SELECT
    customer_id,
    payment_date,
    amount,
    SUM(amount) OVER (
      PARTITION BY customer_id
      ORDER BY payment_date, payment_id
    ) cumulative_amount
  FROM payment
  WHERE customer_id IN (1, 2, 3)
  AND payment_date <= DATE '2005-05-29' -- Pushed down
) inlined
WHERE payment_date >= DATE '2005-05-25'
ORDER BY customer_id, payment_date;

And now, we can profit from the index very easily! But is this transformation being done by any database? Unfortunately not. Some databases manage to push down the “more obvious” CUSTOMER_ID predicate past the window function, but none can do the same with the “less obvious” range predicate on PAYMENT_DATE:

DB2 LUW 10.5

The CUSTOMER_ID predicate is pushed down into the view, which generates an index scan (blue) on the pre-existing foreign key index (which doesn't contain the PAYMENT_DATE column), but the PAYMENT_DATE predicate itself is only applied much later, using an in-memory filter (red):

Explain Plan                                                       
-------------------------------------------------------------------
ID | Operation                       |                  Rows | Cost
 1 | RETURN                          |                       |   40
 2 |  FILTER                         |     4 of 80 (  5.00%) |   40
 3 |   TBSCAN                        |    80 of 80 (100.00%) |   40
 4 |    SORT                         |    80 of 80 (100.00%) |   40
 5 |     NLJOIN                      |               80 of 3 |   40
 6 |      TBSCAN GENROW              |      3 of 3 (100.00%) |    0
 7 |      FETCH PAYMENT              |    27 of 27 (100.00%) |   13
 8 |       IXSCAN IDX_FK_CUSTOMER_ID | 27 of 16049 (   .17%) |    6
                                                                   
Predicate Information                                              
 2 - RESID (Q5.PAYMENT_DATE <= '2005-05-29')                       
     RESID ('2005-05-25' <= Q5.PAYMENT_DATE)                       
 5 - JOIN (Q3.CUSTOMER_ID = Q2.$C0)                                
 8 - START (Q3.CUSTOMER_ID = Q2.$C0)                               
      STOP (Q3.CUSTOMER_ID = Q2.$C0)                               

Compare this with the plan of the manually optimised query:

Explain Plan                                                  
--------------------------------------------------------------
ID | Operation                   |                 Rows | Cost
 1 | RETURN                      |                      |   40
 2 |  FILTER                     |     4 of 4 (100.00%) |   40
 3 |   TBSCAN                    |     4 of 4 (100.00%) |   40
 4 |    SORT                     |     4 of 4 (100.00%) |   40
 5 |     NLJOIN                  |               4 of 1 |   40
 6 |      TBSCAN GENROW          |     3 of 3 (100.00%) |    0
 7 |      FETCH PAYMENT          |     1 of 1 (100.00%) |   13
 8 |       IXSCAN IDX_PAYMENT_I1 | 1 of 16049 (   .01%) |    6
                                                              
Predicate Information                                         
 2 - RESID ('2005-05-25' <= Q5.PAYMENT_DATE)                  
 5 - JOIN (Q3.CUSTOMER_ID = Q2.$C0)                           
 8 - START (Q3.CUSTOMER_ID = Q2.$C0)                          
      STOP (Q3.CUSTOMER_ID = Q2.$C0)                          
      STOP (Q3.PAYMENT_DATE <= '2005-05-29')                  

This is certainly a better plan.

MySQL 8.0.2

MySQL, very regrettably, doesn't seem to make any effort at all to optimise this. We're accessing the entire payment table to get this result.

id   table        type  rows    filtered    Extra
-----------------------------------------------------------------------
1    <derived2>   ALL   16086    3.33       Using where
2    payment      ALL   16086  100.00       Using filesort

Here’s the manually optimised plan:

id   table        type  key             rows  filtered    Extra
-------------------------------------------------------------------------------
1    <derived2>   ALL                   4     3.33        Using where
2    payment      range idx_payment_i1  4      100.00     Using index condition

Oracle 12.2.0.1

Oracle also cannot do this beyond pushing the more obvious CUSTOMER_ID predicate into the view:

-------------------------------------------------------------------------------
| Id  | Operation                              | Name                 | Rows  |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                      |       |
|*  1 |  VIEW                                  | PAYMENT_WITH_REVENUE |    80 |
|   2 |   WINDOW SORT                          |                      |    80 |
|   3 |    INLIST ITERATOR                     |                      |       |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| PAYMENT              |    80 |
|*  5 |      INDEX RANGE SCAN                  | IDX_FK_CUSTOMER_ID   |    80 |
-------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(("PAYMENT_DATE">=TO_DATE('2005-05-25 00:00:00') AND 
              "PAYMENT_DATE"<=TO_DATE('2005-05-29 00:00:00')))
   5 - access(("CUSTOMER_ID"=1 OR "CUSTOMER_ID"=2 OR "CUSTOMER_ID"=3))

The manually optimised plan looks better:

-------------------------------------------------------------------------
| Id  | Operation                              | Name           | Rows  |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                |       |
|*  1 |  VIEW                                  |                |     1 |
|   2 |   WINDOW SORT                          |                |     1 |
|   3 |    INLIST ITERATOR                     |                |       |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| PAYMENT        |     1 |
|*  5 |      INDEX RANGE SCAN                  | IDX_PAYMENT_I1 |     1 |
-------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("PAYMENT_DATE">=TO_DATE('2005-05-25 00:00:00'))
   5 - access(("CUSTOMER_ID" IN (1, 2, 3)) AND 
              "PAYMENT_DATE"<=TO_DATE('2005-05-29 00:00:00'))

Much better cardinality estimates!

PostgreSQL 10

PostgreSQL’s version of the Sakila database uses a partitioned payment table, but that should be irrelevant for this analysis. The CUSTOMER_ID predicate could be pushed down…

QUERY PLAN                                                                                          
---------------------------------------------------------------------------------------------------
Subquery Scan on payment_with_revenue  (cost=117.06..124.45 rows=8 width=52)                       
  Filter: ((payment_date >= '2005-05-25') AND (payment_date <= '2005-05-29'))
-> WindowAgg  (cost=117.06..121.49 rows=197 width=56)                                               
   -> Sort  (cost=117.06..117.55 rows=197 width=24)                                              
      Sort Key: payment.customer_id, payment.payment_date, payment.payment_id                  
      -> Result  (cost=0.29..109.55 rows=197 width=24)                                        
         -> Append  (cost=0.29..107.58 rows=197 width=24)                                  
            -> Index Scan using idx_fk.. on payment  (cost=0.29..18.21 rows=77 width=20)
               Index Cond: (customer_id = ANY ('{1,2,3}'::integer[]))
            -> Bitmap Heap Scan on payment_p2007_01  (cost=4.62..14.90 rows=20 width=26)
               Recheck Cond: (customer_id = ANY ('{1,2,3}'::integer[]))
               -> Bitmap Index Scan on idx_fk..  (cost=0.00..4.61 rows=20 width=0)
                  Index Cond: (customer_id = ANY ('{1,2,3}'::integer[]))
            -> Bitmap Heap Scan on payment_p2007_02  (cost=4.62..14.90 rows=20 width=26)
               Recheck Cond: (customer_id = ANY ('{1,2,3}'::integer[]))
               -> Bitmap Index Scan on idx_fk..  (cost=0.00..4.61 rows=20 width=0)
                  Index Cond: (customer_id = ANY ('{1,2,3}'::integer[]))
            ...

But manual optimisation is required to get better behaviour for the date range:

QUERY PLAN                                                                                           
-----------------------------------------------------------------------------------------------------
Subquery Scan on inlined  (cost=18.46..18.56 rows=3 width=48)                                        
  Filter: (inlined.payment_date >= '2005-05-25'::date)                    
-> WindowAgg  (cost=18.46..18.52 rows=3 width=52)                                                 
   -> Sort  (cost=18.46..18.46 rows=3 width=20)                                                
      Sort Key: payment.customer_id, payment.payment_date, payment.payment_id                
      -> Result  (cost=0.29..18.43 rows=3 width=20)                                         
         -> Append  (cost=0.29..18.40 rows=3 width=20)                                   
            -> Index Scan using idx_fk.. on payment  (cost=0.29..18.40 rows=3 width=20)
                Index Cond: (customer_id = ANY ('{1,2,3}'::integer[]))
                Filter: (payment_date <= '2005-05-29'::date)

Interestingly, the index still isn't used optimally on both columns, which has nothing to do with the current discussion on window functions. PostgreSQL seems unable to treat the IN predicate as an equality predicate. See also this article about other optimisations (such as predicate merging) that are not possible (yet) in PostgreSQL.

But still, this is much better, as it brings down the estimated cardinalities (in case this query is a subquery in a more sophisticated context), and more importantly, it filters out many, many rows prior to calculating the window function.

SQL Server 2014

Another database that cannot push down this predicate past the window function optimally. Only the “obvious” part is pushed down:

|--Sort(ORDER BY:([payment_date] ASC))
   |--Filter(WHERE:([payment_date]>='2005-05-25' AND [payment_date]<='2005-05-29'))
      |--Compute Scalar(DEFINE:([Expr1003]=CASE WHEN [Expr1004]=(0) THEN NULL ELSE [Expr1005] END))
         |--Stream Aggregate(GROUP BY:([WindowCount1009]) DEFINE:(..))
            |--Window Spool(RANGE BETWEEN:(UNBOUNDED, [[payment_date], [payment_id]]))
               |--Segment
                  |--Segment
                     |--Sort(ORDER BY:([customer_id] ASC, [payment_date] ASC, [payment_id] ASC))
                        |--Table Scan(OBJECT:([payment]), WHERE:([customer_id] IN (1, 2, 3)))

Interestingly, this doesn’t even use the index at all, but at least the data is filtered out prior to the calculation that relies on sorting. With the manual optimisation, again the same, much better effect:

|--Filter(WHERE:([payment_date]>='2005-05-25'))
   |--Compute Scalar(DEFINE:([Expr1003]=CASE WHEN [Expr1004]=(0) THEN NULL ELSE [Expr1005] END))
      |--Stream Aggregate(GROUP BY:([WindowCount1011]) DEFINE:(..))
         |--Window Spool(RANGE BETWEEN:(UNBOUNDED, [[payment_date], [payment_id]]))
            |--Segment
               |--Segment
                  |--Sort(ORDER BY:([payment_date] ASC, [payment_id] ASC))
                     |--Nested Loops(Inner Join, OUTER REFERENCES:([Bmk1000]))
                        |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1007], [Expr1008], [Expr1006]))
                        |  |--Compute Scalar(DEFINE:(([Expr1007],[Expr1008],[Expr1006])=GetRangeWithMismatchedTypes(NULL,'2005-05-29',(42))))
                        |  |  |--Constant Scan
                        |  |--Index Seek(OBJECT:([idx_payment_i1]), SEEK:([customer_id] IN (1, 2, 3) AND [payment_date] > [Expr1007] AND [payment_date] < [Expr1008]))
                        |--RID Lookup(OBJECT:([payment]))

Certainly, this is a bit cryptic to read but it really means the same thing as always: The manual optimisation worked and we got a better plan.

Meh, does it matter?

I hope so! Let’s benchmark these things against each other! Some info about our benchmarking technique in our previous post and on this page here. Specifically, we don’t publish actual execution times, only relative times within the benchmark as we do not want to compare databases against each other but only against themselves.

DB2 LUW 10.5

RUN |STMT |RATIO  |
----|-----|-------|
1   |1    |3.0890 |
1   |2    |1.2272 |
2   |1    |3.0624 |
2   |2    |1.0100 |
3   |1    |3.0389 |
3   |2    |1.0000 |
4   |1    |3.1566 |
4   |2    |1.0948 |
5   |1    |3.1817 |
5   |2    |1.0905 |

The manually optimised statement is 3x faster in our benchmark. Do bear in mind that we're operating on a rather small data set of only a few thousand rows! This gets worse with larger data sets.

MySQL 8.0.2

The difference is devastating in MySQL 8.0.2, which just recently introduced window functions. Surely, the MySQL team will be able to apply some further optimisations prior to GA – I’ve filed an issue for review:

RUN |STMT |RATIO    |
----|-----|---------|
0   |1    |431.1905 |
0   |2    |  1.0000 |
1   |1    |372.4286 |
1   |2    |  1.0000 |
2   |1    |413.4762 |
2   |2    |  1.0000 |
3   |1    |381.2857 |
3   |2    |  1.0000 |
4   |1    |400.1429 |
4   |2    |  1.2857 |

Oracle 12.2.0.1

Another factor 4x can be observed in Oracle:

Run 1, Statement 1 : 4.58751
Run 1, Statement 2 : 1.37639
Run 2, Statement 1 : 4.71833
Run 2, Statement 2 : 1.03693
Run 3, Statement 1 : 4.05729
Run 3, Statement 2 : 1.04719
Run 4, Statement 1 : 3.86653
Run 4, Statement 2 : 1
Run 5, Statement 1 : 3.99603
Run 5, Statement 2 : 1.0212

PostgreSQL 10

PostgreSQL is quite bad here, too. A factor of 7x can be observed:

RUN 1, Statement 1: 7.23373
RUN 1, Statement 2: 1.01438
RUN 2, Statement 1: 6.62028
RUN 2, Statement 2: 1.26183
RUN 3, Statement 1: 8.40322
RUN 3, Statement 2: 1.04074
RUN 4, Statement 1: 6.33401
RUN 4, Statement 2: 1.06750
RUN 5, Statement 1: 6.41649
RUN 5, Statement 2: 1.00000

SQL Server 2014

Another very significant penalty in SQL Server for the unoptimised version:

Run 1, Statement 1: 29.50000
Run 1, Statement 2: 1.07500
Run 2, Statement 1: 28.15000
Run 2, Statement 2: 1.00000
Run 3, Statement 1: 28.00000
Run 3, Statement 2: 1.00000
Run 4, Statement 1: 28.00000
Run 4, Statement 2: 1.00000
Run 5, Statement 1: 31.07500
Run 5, Statement 2: 1.00000

Bad news for views. Is there a better solution?

This is rather bad news for window functions inside of reusable views. None of the databases, not even DB2 or Oracle, can push down range predicates past a derived table's window function if the column in the range predicate doesn't correspond to the window function's PARTITION BY clause.

The problem described above can be easily fixed when the query is written manually, expanding all possible views into their calling SQL, but that kind of sucks – we’d love to make our code reusable. There’s one solution in databases that support inline table valued functions. Among the tested databases, these include:

  • DB2
  • PostgreSQL
  • SQL Server

MySQL doesn’t have table valued functions, and Oracle’s (very regrettably) are not inlineable because they have to be written in PL/SQL.

Here’s how to write these functions:

DB2

Function definition:

CREATE OR REPLACE FUNCTION f_payment_with_revenue (
  p_customer_id BIGINT,
  p_from_date DATE,
  p_to_date DATE
)
RETURNS TABLE (
  customer_id BIGINT,
  payment_date DATE,
  amount DECIMAL(10, 2),
  cumulative_amount DECIMAL(10, 2)
)
LANGUAGE SQL
RETURN
SELECT *
FROM (
  SELECT
    customer_id,
    payment_date,
    amount,
    SUM(amount) OVER (
      PARTITION BY customer_id
      ORDER BY payment_date, payment_id
    ) cumulative_amount
  FROM payment
  WHERE customer_id = p_customer_id
  AND payment_date <= p_to_date
) t
WHERE payment_date >= p_from_date;

Function call:

SELECT 
  payment_date,
  amount,
  cumulative_amount
FROM (
  SELECT customer_id FROM customer WHERE customer_id IN (1, 2, 3)
) c(customer_id),
TABLE(sakila.f_payment_with_revenue(
  c.customer_id,
  CAST('2005-05-25' AS DATE),
  CAST('2005-05-29' AS DATE)
))
ORDER BY payment_date;

Execution plan:

Explain Plan                                                    
----------------------------------------------------------------
ID | Operation                     |                 Rows | Cost
 1 | RETURN                        |                      |   33
 2 |  TBSCAN                       |     4 of 4 (100.00%) |   33
 3 |   SORT                        |     4 of 4 (100.00%) |   33
 4 |    NLJOIN                     |               4 of 1 |   33
 5 |     NLJOIN                    |               3 of 1 |   20
 6 |      TBSCAN GENROW            |     3 of 3 (100.00%) |    0
 7 |      IXSCAN PK_CUSTOMER       |   1 of 599 (   .17%) |    6
 8 |     FILTER                    |     1 of 1 (100.00%) |   13
 9 |      TBSCAN                   |     1 of 1 (100.00%) |   13
10 |       SORT                    |     1 of 1 (100.00%) |   13
11 |        FETCH PAYMENT          |     1 of 1 (100.00%) |   13
12 |         IXSCAN IDX_PAYMENT_I1 | 1 of 16049 (   .01%) |    6
                                                                
Predicate Information                                           
  5 - JOIN (Q3.CUSTOMER_ID = Q2.$C0)                            
  7 - START (Q3.CUSTOMER_ID = Q2.$C0)                           
       STOP (Q3.CUSTOMER_ID = Q2.$C0)                           
  8 - RESID ('2005-05-25' <= Q6.PAYMENT_DATE)                   
 12 - START (Q4.CUSTOMER_ID = Q3.CUSTOMER_ID)                   
       STOP (Q4.CUSTOMER_ID = Q3.CUSTOMER_ID)                   
       STOP (Q4.PAYMENT_DATE <= '2005-05-29')                   

Much better!

Benchmark result (Statement 1 = function call, Statement 2 = manually optimised):

RUN |STMT |RATIO  |
----|-----|-------|
1   |1    |1.5945 |
1   |2    |1.0080 |
2   |1    |1.6310 |
2   |2    |1.0768 |
3   |1    |1.5827 |
3   |2    |1.0090 |
4   |1    |1.5486 |
4   |2    |1.0084 |
5   |1    |1.5569 |
5   |2    |1.0000 |

Definitely a huge improvement. The comparison might not be entirely fair, because:

  • CROSS APPLY / LATERAL unnesting tends to generate nested loops that could be written more optimally with a classic join
  • We have an additional auxiliary customer table access (which could probably be tuned away with another rewrite)

PostgreSQL

Function definition:

CREATE OR REPLACE FUNCTION f_payment_with_revenue (
  p_customer_id BIGINT,
  p_from_date DATE,
  p_to_date DATE
)
RETURNS TABLE (
  customer_id SMALLINT,
  payment_date TIMESTAMP,
  amount DECIMAL(10, 2),
  cumulative_amount DECIMAL(10, 2)
)
AS $$
SELECT *
FROM (
  SELECT
    customer_id,
    payment_date,
    amount,
    SUM(amount) OVER (
      PARTITION BY customer_id
      ORDER BY payment_date, payment_id
    ) cumulative_amount
  FROM payment
  WHERE customer_id = p_customer_id
  AND payment_date <= p_to_date
) t
WHERE payment_date >= p_from_date
$$ LANGUAGE SQL;

Function call:

SELECT 
  payment_date,
  amount,
  cumulative_amount
FROM (
  SELECT customer_id FROM customer WHERE customer_id IN (1, 2, 3)
) c(customer_id)
CROSS JOIN LATERAL f_payment_with_revenue(
  c.customer_id,
  CAST('2005-05-25' AS DATE),
  CAST('2005-05-29' AS DATE)
)
ORDER BY payment_date;

Execution plan:

QUERY PLAN                                                                                    
----------------------------------------------------------------------------------------------
Sort  (cost=250.39..257.89 rows=3000 width=72)                                                
  Sort Key: f_payment_with_revenue.payment_date                                               
  ->  Nested Loop  (cost=0.53..77.13 rows=3000 width=72)                                      
        ->  Index Only Scan using customer_pkey on customer  (cost=0.28..16.88 rows=3 width=4)
              Index Cond: (customer_id = ANY ('{1,2,3}'::integer[]))                          
        ->  Function Scan on f_payment_with_revenue  (cost=0.25..10.25 rows=1000 width=72)    

Oops, no unnesting of the function is happening. The cardinality defaults to 1000. That’s bad news!

Benchmark result (Statement 1 = function call, Statement 2 = manually optimised):

RUN 1, Statement 1: 25.77538
RUN 1, Statement 2: 1.00000
RUN 2, Statement 1: 27.55197
RUN 2, Statement 2: 1.11581
RUN 3, Statement 1: 27.99331
RUN 3, Statement 2: 1.16463
RUN 4, Statement 1: 29.11022
RUN 4, Statement 2: 1.01159
RUN 5, Statement 1: 26.65781
RUN 5, Statement 2: 1.01654

Rats. This has gotten much worse than with the view. Not surprising, though. Table valued functions are not that good of an idea when they cannot be inlined! Oracle would have had a similar result if I weren't too lazy to translate my function to an ordinary PL/SQL table valued function, or a pipelined function.
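
For completeness, such an Oracle pipelined table function might look roughly like this – an untested sketch, with made-up object and collection type names, and, being PL/SQL, it would not be inlined either:

-- A sketch only: type and function bodies are illustrative, not benchmarked here
CREATE TYPE payment_with_revenue_t AS OBJECT (
  customer_id       NUMBER,
  payment_date      DATE,
  amount            NUMBER(10, 2),
  cumulative_amount NUMBER(10, 2)
);
/

CREATE TYPE payment_with_revenue_tt AS TABLE OF payment_with_revenue_t;
/

CREATE OR REPLACE FUNCTION f_payment_with_revenue (
  p_customer_id NUMBER,
  p_from_date   DATE,
  p_to_date     DATE
)
RETURN payment_with_revenue_tt PIPELINED
IS
BEGIN
  FOR r IN (
    SELECT customer_id, payment_date, amount, cumulative_amount
    FROM (
      SELECT
        customer_id,
        payment_date,
        amount,
        SUM(amount) OVER (
          PARTITION BY customer_id
          ORDER BY payment_date, payment_id
        ) cumulative_amount
      FROM payment
      WHERE customer_id = p_customer_id
      AND payment_date <= p_to_date
    )
    WHERE payment_date >= p_from_date
  ) LOOP
    -- Pipe each pre-filtered row back to the caller
    PIPE ROW (payment_with_revenue_t(
      r.customer_id, r.payment_date, r.amount, r.cumulative_amount
    ));
  END LOOP;
  RETURN;
END;
/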

SQL Server

Function definition:

CREATE FUNCTION f_payment_with_revenue (
  @customer_id BIGINT,
  @from_date DATE,
  @to_date DATE
)
RETURNS TABLE
AS RETURN
SELECT *
FROM (
  SELECT
    customer_id,
    payment_date,
    amount,
    SUM(amount) OVER (
      PARTITION BY customer_id
      ORDER BY payment_date, payment_id
    ) cumulative_amount
  FROM payment
  WHERE customer_id = @customer_id
  AND payment_date <= @to_date
) t
WHERE payment_date >= @from_date;

Function call:

SELECT 
  payment_date,
  amount,
  cumulative_amount
FROM (
  SELECT customer_id FROM customer WHERE customer_id IN (1, 2, 3)
) AS c(customer_id)
CROSS APPLY f_payment_with_revenue(
  c.customer_id,
  CAST('2005-05-25' AS DATE),
  CAST('2005-05-29' AS DATE)
)
ORDER BY payment_date;

Execution plan

|--Sort(ORDER BY:([payment_date] ASC))
   |--Nested Loops(Inner Join, OUTER REFERENCES:([customer_id]))
      |--Index Seek(OBJECT:([PK__customer__CD65CB84E826462D]), SEEK:([customer_id] IN (1, 2, 3))
      |--Filter(WHERE:([payment_date]>='2005-05-25'))
         |--Compute Scalar(DEFINE:([Expr1006]=CASE WHEN [Expr1007]=(0) THEN NULL ELSE [Expr1008] END))
            |--Stream Aggregate(GROUP BY:([WindowCount1014]) DEFINE:(..)))
               |--Window Spool(RANGE BETWEEN:(UNBOUNDED, [[payment_date], [payment_id]]))
                  |--Segment
                     |--Segment
                        |--Sort(ORDER BY:([payment_date] ASC, [payment_id] ASC))
                           |--Nested Loops(Inner Join, OUTER REFERENCES:([Bmk1003]))
                              |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1010], [Expr1011], [Expr1009]))
                              |  |--Compute Scalar(DEFINE:(([Expr1010],[Expr1011],[Expr1009])=GetRangeWithMismatchedTypes(NULL,'2005-05-29',(42))))
                              |  |  |--Constant Scan
                              |  |--Index Seek(OBJECT:([idx_payment_i1]), SEEK:([customer_id]=CONVERT_IMPLICIT(bigint,[customer_id],0) AND [payment_date] > [Expr1010] AND [payment_date] < [Expr1011]))
                              |--RID Lookup(OBJECT:([payment]), SEEK:([Bmk1003]=[Bmk1003]))

Again, super unreadable IMO, but after looking a bit more closely, we can see that the plan is almost the same as the manually optimised one, and the predicate is applied early on, where it belongs.

Benchmark result (Statement 1 = function call, Statement 2 = manually optimised):

Run 1, Statement 1: 2.50000
Run 1, Statement 2: 1.27778
Run 2, Statement 1: 2.11111
Run 2, Statement 2: 1.27778
Run 3, Statement 1: 2.11111
Run 3, Statement 2: 1.00000
Run 4, Statement 1: 2.22222
Run 4, Statement 2: 1.11111
Run 5, Statement 1: 2.02778
Run 5, Statement 2: 1.19444

Conclusion

Window functions are super cool and powerful. But they come at a price. They sort your data. Normally, when we write complex queries and reuse parts in views, we can profit from predicate push down operations into derived tables and views, which is something that most databases support (see also our previous blog post about such optimisations).

But when it comes to using window functions, they act like a "fence", past which only few predicates can be pushed automatically. It's not that it wouldn't be possible; it simply isn't done very well by most databases (and in the case of MySQL, not at all as of 8.0.2).

Inline table valued functions can be a remedy to avoid manual building of complex queries, such that at least some parts of your logic can be reused among queries. Unfortunately, they rely on CROSS APPLY or LATERAL JOIN, which can also cause performance issues in more complex setups. Besides, among the databases covered in this article, only DB2 and SQL Server support inline table valued functions. Oracle doesn’t support SQL functions at all, and PostgreSQL’s SQL functions are not inlinable (yet), which means that in these databases, in order to tune such queries, you might not be able to reuse the parts that use window functions in views or stored functions.

However, as always, do measure. Perhaps, a 4x waste of performance for a particular query is OK.

10 Cool SQL Optimisations That do not Depend on the Cost Model

Cost Based Optimisation is the de facto standard way to optimise SQL queries in most modern databases. It is the reason why it is really, really hard to implement a complex, hand-written algorithm in a 3GL (third generation programming language) such as Java that outperforms a dynamically calculated database execution plan generated by a modern optimiser. I've recently delivered a talk about that topic.

Today, we don’t want to talk about cost based optimisation, i.e. optimisations that depend on a database’s cost model. We’ll look into much simpler optimisations that can be implemented purely based on meta data (e.g. constraints) and the query itself. They’re usually no-brainers for a database to optimise, because the optimisation will always lead to a better execution plan, independently of whether there are any indexes, or how much data you have, or how skewed your data distribution is.

So, they’re not no-brainers in the sense whether they’re easy for the optimiser teams to implement, but they’re no-brainers in the sense whether they should be done.

These optimisations remove needless, optional work (as opposed to needless, mandatory work, which I've blogged about before).

Where do these optimisations apply?

Most of these optimisations are applied to:

  • Fix mistakes in queries
  • Allow for reusing complex views without actually executing the entire logic from the view

In the first case, you could claim: “Well, then fix the stupid SQL already”, but then again, who never makes any mistakes, right?

Specifically, the second case is really cool, as these optimisations allow us to build complex libraries of views and table valued functions, which we can reuse in several layers.

Databases being used

This post will evaluate 10 SQL optimisations on the 5 most popular RDBMS (according to the db-engines ranking):

  • Oracle 12.2
  • MySQL 8.0.2
  • SQL Server 2014
  • PostgreSQL 9.6
  • DB2 LUW 10.5

Throughout this article, I will be using queries against the Sakila database – as always.

Sakila database

These will be the 10 optimisation types:

  1. Transitive Closure
  2. Impossible Predicates and Unneeded Table Accesses
  3. JOIN Elimination
  4. Removing “Silly” Predicates
  5. Projections in EXISTS Subqueries
  6. Predicate Merging
  7. Provably Empty Sets
  8. CHECK Constraints
  9. Unneeded Self JOIN
  10. Predicate Pushdown

One final note before you move on: Many of the following examples might be too simple. Some databases (e.g. SQL Server) might not apply a specific optimisation on a query that is “too trivial”. See also the comments for details.

1. Transitive Closure

Let’s start with something simple: transitive closure. It’s a really trivial concept that applies to a variety of maths operations, e.g. to the equality operator. It can be said that if:

  • A = B and…
  • B = C

then:

  • A = C

Duh, right? But this has some nice implications on SQL optimisers.

Let’s look at an example. Let’s get all films for ACTOR_ID = 1:

SELECT first_name, last_name, film_id
FROM actor a
JOIN film_actor fa ON a.actor_id = fa.actor_id
WHERE a.actor_id = 1;

The result being:

FIRST_NAME      LAST_NAME  FILM_ID
PENELOPE        GUINESS    1
PENELOPE        GUINESS    23
PENELOPE        GUINESS    25
PENELOPE        GUINESS    106
PENELOPE        GUINESS    140
PENELOPE        GUINESS    166
...

Now, observe the execution plan if we run this query in Oracle:

--------------------------------------------------------------
| Id  | Operation                    | Name          | Rows  |
--------------------------------------------------------------
|   0 | SELECT STATEMENT             |               |       |
|   1 |  NESTED LOOPS                |               |    19 |
|   2 |   TABLE ACCESS BY INDEX ROWID| ACTOR         |     1 |
|*  3 |    INDEX UNIQUE SCAN         | PK_ACTOR      |     1 |
|*  4 |   INDEX RANGE SCAN           | PK_FILM_ACTOR |    19 |
--------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("A"."ACTOR_ID"=1)
   4 - access("FA"."ACTOR_ID"=1)

Specifically the predicate section is really interesting. The predicate ACTOR_ID = 1 is applied to both the ACTOR and FILM_ACTOR tables because of transitive closure. If:

  • A.ACTOR_ID = 1 (from the WHERE predicate) and…
  • A.ACTOR_ID = FA.ACTOR_ID (from the ON predicate)

Then:

  • FA.ACTOR_ID = 1

In other words, the query is rewritten to this:

SELECT first_name, last_name, film_id
FROM actor a
JOIN film_actor fa ON a.actor_id = fa.actor_id
WHERE a.actor_id = 1
AND fa.actor_id = 1;

Or in this particular case, even this, as the A.ACTOR_ID = 1 predicate ensures a single row from the ACTOR table, so a cross join might do as well (at least that's what the plan indicates):

SELECT first_name, last_name, film_id
FROM actor a
JOIN film_actor fa ON fa.actor_id = 1
WHERE a.actor_id = 1;

This has a few nice effects on more complex queries. In particular, the cardinality estimates will be much more precise this way, as we can pick the estimate based on a concrete, constant predicate value, rather than e.g. the average number of films per actor as in this query (which returns the same result):

SELECT first_name, last_name, film_id
FROM actor a
JOIN film_actor fa ON a.actor_id = fa.actor_id
WHERE first_name = 'PENELOPE'
AND last_name = 'GUINESS'

The plan being:

----------------------------------------------------------------------------
| Id  | Operation                            | Name                | Rows  |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                     |       |
|   1 |  NESTED LOOPS                        |                     |     2 |
|*  2 |   TABLE ACCESS BY INDEX ROWID BATCHED| ACTOR               |     1 |
|*  3 |    INDEX RANGE SCAN                  | IDX_ACTOR_LAST_NAME |     3 |
|*  4 |   INDEX RANGE SCAN                   | PK_FILM_ACTOR       |    27 |
----------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("A"."FIRST_NAME"='PENELOPE')
   3 - access("A"."LAST_NAME"='GUINESS')
   4 - access("A"."ACTOR_ID"="FA"."ACTOR_ID")

As you can see, the estimate for the number of FILM_ACTOR rows is too high, and the estimate for the NESTED LOOP result is too low. Here are some interesting numbers:

SELECT count(*) FROM film_actor WHERE actor_id = 1;

SELECT avg(c) FROM (
  SELECT count(*) c FROM film_actor GROUP BY actor_id
);

Resulting in:

19
27.315

That’s where those estimates come from. If the database knows we’re dealing with ACTOR_ID = 1, it can pick the statistics on the number of films for that actor. If it doesn’t know this (because our standard statistics don’t correlate FIRST_NAME / LAST_NAME with ACTOR_ID), then we get the average number of films for any actor. Simple, insignificant error in this particular case, but when that error propagates in a complex query, it can add up and lead to the wrong choice of JOIN down the line (or up the plan).

So, when you can, always design JOIN and ordinary predicates to profit from transitive closure.

What other databases support this?

DB2

Yes

Explain Plan                                               
-----------------------------------------------------------
ID | Operation              |                 Rows | Cost  
 1 | RETURN                 |                      |   13  
 2 |  NLJOIN                |              27 of 1 |   13  
 3 |   FETCH ACTOR          |     1 of 1 (100.00%) |    6  
 4 |    IXSCAN PK_ACTOR     |   1 of 200 (   .50%) |    0  
 5 |   IXSCAN PK_FILM_ACTOR | 27 of 5462 (   .49%) |    6  
                                                           
Predicate Information                                      
 4 - START (Q2.ACTOR_ID = 1)                               
      STOP (Q2.ACTOR_ID = 1)                               
 5 - START (1 = Q1.ACTOR_ID)                               
      STOP (1 = Q1.ACTOR_ID)                               

Btw, want cool execution plans like the above on DB2 LUW? Go visit Markus Winand’s script:
http://use-the-index-luke.com/s/last_explained

MySQL

Unfortunately, MySQL explain plans are not very useful for such analyses. We don’t really see the predicate itself in this output:

ID  SELECT TYPE  TABLE  TYPE   REF    ROWS
------------------------------------------
1   SIMPLE       a      const  const  1 
1   SIMPLE       fa     ref    const  19

But the fact that the REF column is two times “const” indicates that we’re scanning for a constant value in both tables. Conversely, the plan that queries for FIRST_NAME / LAST_NAME looks like this:

ID  SELECT TYPE  TABLE  TYPE   REF         ROWS
-----------------------------------------------
1   SIMPLE       a      ref    const       3 
1   SIMPLE       fa     ref    a.actor_id  27

And you can see that REF has now switched to a column reference from the JOIN predicate. The cardinality estimate is now almost the same as in Oracle.

So, yes, MySQL supports transitive closure, too.

PostgreSQL

Yes

QUERY PLAN                                                                          
------------------------------------------------------------------------------------
Nested Loop  (cost=4.49..40.24 rows=27 width=15)                                    
  ->  Seq Scan on actor a  (cost=0.00..4.50 rows=1 width=17)                        
        Filter: (actor_id = 1)                                                      
  ->  Bitmap Heap Scan on film_actor fa  (cost=4.49..35.47 rows=27 width=4)         
        Recheck Cond: (actor_id = 1)                                                
        ->  Bitmap Index Scan on film_actor_pkey  (cost=0.00..4.48 rows=27 width=0) 
              Index Cond: (actor_id = 1)                                            

SQL Server

Yes

  |--Nested Loops(Inner Join)
       |--Nested Loops(Inner Join)
       |    |--Index Seek (SEEK:([a].[actor_id]=(1)))
       |    |--RID Lookup
       |--Index Seek (SEEK:([fa].[actor_id]=(1)))

Summary

All databases can do transitive closure:

Database        |Transitive closure
----------------|------------------
DB2 LUW 10.5    |Yep
MySQL 8.0.2     |Yep
Oracle 12.2.0.1 |Yep
PostgreSQL 9.6  |Yep
SQL Server 2014 |Yep

Stay tuned, though, for #6. There are more complex cases of transitive closure, where not all databases get it right.

2. Impossible Predicates and Unneeded Table Accesses

This optimisation is really silly, but hey, why not. If users write impossible predicates, then why even execute them? Here are some examples:

-- "Obvious"
SELECT * FROM actor WHERE 1 = 0

-- "Subtle"
SELECT * FROM actor WHERE NULL = NULL

The first query should obviously never return any results, but the same is true for the second one, because while NULL IS NULL always yields TRUE, NULL = NULL evaluates to NULL, which has the same effect as FALSE according to three-valued logic.
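
A quick way to convince yourself of the NULL = NULL behaviour is a query along these lines (Oracle syntax here; adapt the dummy table for the other dialects, as in the EXISTS examples further down):

SELECT
  CASE WHEN NULL = NULL  THEN 'TRUE' ELSE 'FALSE or NULL' END AS null_eq_null,
  CASE WHEN NULL IS NULL THEN 'TRUE' ELSE 'FALSE or NULL' END AS null_is_null
FROM dual;

-- null_eq_null: FALSE or NULL
-- null_is_null: TRUE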

This doesn’t need much explanation, so let’s immediately jump to see which databases optimise this:

DB2

Yes

Explain Plan                       
-----------------------------------
ID | Operation      |   Rows | Cost
 1 | RETURN         |        |    0
 2 |  TBSCAN GENROW | 0 of 0 |    0

As you can see, the table access to the ACTOR table is completely eliminated from the plan. There’s only a GENROW operation, which generates zero rows. Perfect.

MySQL

Yes

ID  SELECT TYPE  TABLE   EXTRAS
-----------------------------------------
1   SIMPLE         Impossible WHERE

This time, MySQL has been so kind as to indicate that the WHERE clause is impossible. Thanks, that's helpful when analysing – more so than the other databases.

Oracle

Yes

---------------------------------------------------------------
| Id  | Operation          | Name  | Starts | E-Rows | A-Rows |
---------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |      1 |        |      0 |
|*  1 |  FILTER            |       |      1 |        |      0 |
|   2 |   TABLE ACCESS FULL| ACTOR |      0 |    200 |      0 |
---------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(NULL IS NOT NULL)

Now, observe that the plan still shows the table access to the ACTOR table, and the estimated number of rows is still 200, but there’s a FILTER operation on Id=1, which can never be true. Because Oracle really really doesn’t like the SQL standard BOOLEAN type, they display NULL IS NOT NULL in the plan, rather than simply FALSE. Oh well 🙂

But seriously, do check for this predicate. I’ve been debugging 1000-line-long execution plan subtrees with super high costs before noticing that the entire subtree was “cut off” by NULL IS NOT NULL. A bit misleading, if you ask me.

PostgreSQL

Yes

QUERY PLAN                                 
-------------------------------------------
Result  (cost=0.00..0.00 rows=0 width=228) 
  One-Time Filter: false                   

That’s nicer. No noisy ACTOR access and a nice little FALSE predicate.

SQL Server

Yes

  |--Constant Scan

SQL Server calls this a “constant scan”, i.e. a scan where nothing happens – just like DB2.

All databases can eliminate impossible predicates:

Database        |Impossible predicates |Unneeded table access
----------------|----------------------|---------------------
DB2 LUW 10.5    |Yep                   |Yep
MySQL 8.0.2     |Yep                   |Yep
Oracle 12.2.0.1 |Yep                   |Yep
PostgreSQL 9.6  |Yep                   |Yep
SQL Server 2014 |Yep                   |Yep

3. JOIN Elimination

In the previous section, we’ve seen unneeded table access for single table queries. But what happens if one out of several table accesses is unneeded in a JOIN?

I’ve already blogged about JOIN elimination in a previous blog post. SQL engines can determine, based on the way a query is written, and based on the presence of PRIMARY KEYs and FOREIGN KEYs, whether any given JOIN is really required in a query, or whether it could be eliminated without affecting the semantics of the query.

In all of the following three examples, the JOIN is unnecessary:

to-one INNER JOINs can be removed if there’s a NOT NULL FOREIGN KEY

Instead of this:

SELECT first_name, last_name
FROM customer c
JOIN address a ON c.address_id = a.address_id

The database can run this:

SELECT first_name, last_name
FROM customer c
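
For this rewrite to be legal, the database needs to know that every CUSTOMER row references exactly one ADDRESS row. In the Sakila database, that guarantee comes from constraints roughly like the following (a sketch; the actual constraint names may differ, and CUSTOMER.ADDRESS_ID is declared NOT NULL):

-- ADDRESS_ID is the primary key of the referenced table
ALTER TABLE address
  ADD CONSTRAINT pk_address PRIMARY KEY (address_id);

-- Every CUSTOMER.ADDRESS_ID value must exist in ADDRESS
ALTER TABLE customer
  ADD CONSTRAINT fk_customer_address
  FOREIGN KEY (address_id) REFERENCES address (address_id);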

to-one INNER JOINs can be replaced if there’s a nullable FOREIGN KEY

The above works if there’s also a NOT NULL constraint on the FOREIGN KEY. If there isn’t, e.g. as in this query:

SELECT title
FROM film f
JOIN language l ON f.original_language_id = l.language_id

The JOIN can still be eliminated, but there needs to be a replacement NOT NULL predicate, as such:

SELECT title
FROM film
WHERE original_language_id IS NOT NULL

to-one OUTER JOINs can be removed if there’s a UNIQUE KEY

Instead of this:

SELECT first_name, last_name
FROM customer c
LEFT JOIN address a ON c.address_id = a.address_id

The database can again run this:

SELECT first_name, last_name
FROM customer c

… even if there is no FOREIGN KEY on CUSTOMER.ADDRESS_ID.

to-many DISTINCT OUTER JOINs can be removed

Instead of this:

SELECT DISTINCT first_name, last_name
FROM actor a
LEFT JOIN film_actor fa ON a.actor_id = fa.actor_id

The database can run this:

SELECT DISTINCT first_name, last_name
FROM actor a

All of these examples are explained in detail in the previous article, so I'm not going to repeat them here. Here's a summary of what each database can eliminate:

Database        |INNER JOIN to-one |INNER JOIN nullable to-one |OUTER JOIN to-one |OUTER JOIN DISTINCT to-many
----------------|------------------|---------------------------|------------------|----------------------------
DB2 LUW 10.5    |Yep               |Yep                        |Yep               |Yep
MySQL 8.0.2     |Nope              |Nope                       |Nope              |Nope
Oracle 12.2.0.1 |Yep               |Yep                        |Yep               |Nope
PostgreSQL 9.6  |Nope              |Nope                       |Yep               |Nope
SQL Server 2014 |Yep               |Nope                       |Yep               |Yep

Unfortunately, not all databases can eliminate all joins. DB2 and SQL Server are the clear winners here!

4. Removing “Silly” Predicates

Equally silly are predicates that are (almost) always true. As you can imagine, if you search for

SELECT * FROM actor WHERE 1 = 1;

Then, databases will not actually evaluate the predicate but ignore it. This was a recent Stack Overflow question that I’ve answered, which actually gave me the idea to write this blog post.

I’ll leave it to you to check this, but what happens if the predicate is just slightly less silly, e.g.:

SELECT * FROM film WHERE release_year = release_year;

Do we actually have to compare the value with itself on each row? No, there’s no value where this can be FALSE, right? Right. But we still have to do a check. While the predicate can never be FALSE, it can totally be NULL, again because of three valued logic. The RELEASE_YEAR column is a nullable column, and if RELEASE_YEAR IS NULL for any given row, then NULL = NULL yields NULL, and the row must be excluded.

So, the query is transformed into this:

SELECT * FROM film WHERE release_year IS NOT NULL;

Which databases do this?

DB2

Yes

Explain Plan                                     
-------------------------------------------------
ID | Operation    |                   Rows | Cost
 1 | RETURN       |                        |   49
 2 |  TBSCAN FILM | 1000 of 1000 (100.00%) |   49
                                                 
Predicate Information                            
 2 - SARG Q1.RELEASE_YEAR IS NOT NULL            

MySQL

Very regrettably, again, because MySQL doesn't display predicates in its execution plans, it's a bit hard to find out whether MySQL performs this particular optimisation. We could benchmark things and see whether some really big string comparisons are executed or not. Or, we add an index:

CREATE INDEX i_release_year ON film (release_year);

And get the plans for these queries instead:

SELECT * FROM film WHERE release_year = release_year;
SELECT * FROM film WHERE release_year IS NOT NULL;

If the optimisation works, then both queries should produce exactly the same plan. But they don’t in this case:

ID  TABLE  POSSIBLE_KEYS   ROWS  FILTERED  EXTRA
------------------------------------------------------
1   film             1000  10.00           Using where

ID  TABLE  POSSIBLE_KEYS   ROWS  FILTERED  EXTRA
------------------------------------------------------
1   film   i_release_year  1000  100.00    Using where

As you can see, the two queries differ substantially in that the POSSIBLE_KEYS and FILTERED columns yield different values. I'm making an educated guess and saying that MySQL does not optimise this.
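
As an alternative to guessing from the tabular output, MySQL's EXPLAIN FORMAT=JSON includes an attached_condition element that shows the predicate as attached to the table access – something along these lines (a suggestion for how to check, not output I've verified for this case):

EXPLAIN FORMAT=JSON
SELECT * FROM film WHERE release_year = release_year;

If MySQL applied the rewrite, we'd expect the attached_condition to read release_year IS NOT NULL rather than the original self-comparison.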

Oracle

Yes

----------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows |
----------------------------------------------------
|   0 | SELECT STATEMENT  |      |      1 |        |
|*  1 |  TABLE ACCESS FULL| FILM |      1 |   1000 |
----------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("RELEASE_YEAR" IS NOT NULL)

PostgreSQL

Disappointingly, no!

QUERY PLAN                                                    
--------------------------------------------------------------
Seq Scan on film  (cost=0.00..67.50 rows=5 width=386)         
  Filter: ((release_year)::integer = (release_year)::integer) 

The plans and the costs are different. Specifically, observe the cardinality estimate, which is totally off, whereas this predicate:

SELECT * FROM film WHERE release_year IS NOT NULL;

… yields much better results

QUERY PLAN                                               
---------------------------------------------------------
Seq Scan on film  (cost=0.00..65.00 rows=1000 width=386) 
  Filter: (release_year IS NOT NULL)                     

Bummer!

SQL Server

Surprisingly, also SQL Server doesn’t seem to do this:

  |--Table Scan(OBJECT:([film]), WHERE:([release_year]=[release_year]))

However, the cardinality estimate is correct when looking at the visual plan, and the costs are all correct as well. From what I've seen in the past on SQL Server, though, I'm going to say that in this case, the optimisation is not taking place, because SQL Server would display the actually executed predicate in the plan (look at the CHECK constraint examples below to see why).

What about “silly” predicates on NOT NULL columns?

The above transformation was only needed, because RELEASE_YEAR is a nullable column. What if we did the same silly query with e.g. FILM_ID?

SELECT * FROM film WHERE film_id = film_id

This is now the same as not putting a predicate at all. Or at least it should be. Is it, though?

DB2

Yes!

Explain Plan                                     
-------------------------------------------------
ID | Operation    |                   Rows | Cost
 1 | RETURN       |                        |   49
 2 |  TBSCAN FILM | 1000 of 1000 (100.00%) |   49

No predicate is applied at all, and we’re selecting all the films.

MySQL

Yes (educated guess, again)

ID  TABLE  POSSIBLE_KEYS   ROWS  FILTERED  EXTRA
------------------------------------------------------
1   film                   1000  100.00

Observe how now the EXTRA column is empty as if we didn’t have any WHERE clause!

Oracle

Yes

----------------------------------------------------
| Id  | Operation         | Name | Starts | E-Rows |
----------------------------------------------------
|   0 | SELECT STATEMENT  |      |      1 |        |
|   1 |  TABLE ACCESS FULL| FILM |      1 |   1000 |
----------------------------------------------------

Again, no predicates are applied.

PostgreSQL

Gee, still no!

QUERY PLAN                                            
------------------------------------------------------
Seq Scan on film  (cost=0.00..67.50 rows=5 width=386) 
  Filter: (film_id = film_id)                         

The filter is applied and the cardinality estimate is still 5. Bummer!

SQL Server

Also, still no!

  |--Table Scan(OBJECT:([film]), WHERE:([film_id]=[film_id]))

Summary

This appears to be a simple optimisation, but it is not applied in all databases – surprisingly, not even in SQL Server!

Database        |Silly but needed predicates (NULL semantics) |Silly unneeded predicates (no NULL semantics)
----------------|---------------------------------------------|----------------------------------------------
DB2 LUW 10.5    |Yep                                          |Yep
MySQL 8.0.2     |Nope                                         |Yep
Oracle 12.2.0.1 |Yep                                          |Yep
PostgreSQL 9.6  |Nope                                         |Nope
SQL Server 2014 |Nope                                         |Nope

5. Projections in EXISTS Subqueries

Interestingly, I get asked about this one all the time in my SQL Masterclass, where I advocate that SELECT * is mostly bad.

The question then is, is it OK to use SELECT * in an EXISTS subquery? For instance, if we wanted to find actors who have played in films:

SELECT first_name, last_name
FROM actor a
WHERE EXISTS (
  SELECT * -- Is this OK?
  FROM film_actor fa
  WHERE a.actor_id = fa.actor_id
)

And the answer is: Yes it is OK. The asterisk has no impact on the query. How can we “prove” this? Consider the following query:

-- DB2
SELECT 1 / 0 FROM sysibm.dual

-- Oracle
SELECT 1 / 0 FROM dual

-- PostgreSQL, SQL Server
SELECT 1 / 0

-- MySQL
SELECT pow(-1, 0.5);

All databases report a division by zero error. Note that interestingly, in MySQL, dividing by zero yields NULL, not an error, so we’re doing something else that’s illegal.

Now, what happens if we do this, instead?

-- DB2
SELECT CASE WHEN EXISTS (
  SELECT 1 / 0 FROM sysibm.dual
) THEN 1 ELSE 0 END
FROM sysibm.dual

-- Oracle
SELECT CASE WHEN EXISTS (
  SELECT 1 / 0 FROM dual
) THEN 1 ELSE 0 END
FROM dual

-- PostgreSQL
SELECT EXISTS (SELECT 1 / 0)

-- SQL Server
SELECT CASE WHEN EXISTS (
  SELECT 1 / 0
) THEN 1 ELSE 0 END

-- MySQL
SELECT EXISTS (SELECT pow(-1, 0.5));

Now, none of the databases fail the query. All of them return TRUE or 1. This means that none of the databases actually evaluated the projection (i.e. the SELECT clause) of the EXISTS subquery.

SQL Server, for instance, shows the following plan:

  |--Constant Scan(VALUES:((CASE WHEN (1) THEN (1) ELSE (0) END)))

As you can see, the CASE expression was transformed to a constant, the subquery has been eliminated. Other databases still have the subquery in their plan and don’t mention anything about a projection, so let’s again look at the original query’s plan in Oracle:

SELECT first_name, last_name
FROM actor a
WHERE EXISTS (
  SELECT *
  FROM film_actor fa
  WHERE a.actor_id = fa.actor_id
)

The plan for the above is:

------------------------------------------------------------------
| Id  | Operation             | Name                    | E-Rows |
------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                         |        |
|*  1 |  HASH JOIN SEMI       |                         |    200 |
|   2 |   TABLE ACCESS FULL   | ACTOR                   |    200 |
|   3 |   INDEX FAST FULL SCAN| IDX_FK_FILM_ACTOR_ACTOR |   5462 |
------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - access("A"."ACTOR_ID"="FA"."ACTOR_ID")
 
Column Projection Information (identified by operation id):
-----------------------------------------------------------
 
   1 - (#keys=1) LAST_NAME, FIRST_NAME
   2 - (rowset=256) A.ACTOR_ID, FIRST_NAME, LAST_NAME
   3 - FA.ACTOR_ID

Observe the projection information on the FILM_ACTOR access in Id=3. In fact, we're not even accessing the FILM_ACTOR table, because we don't have to. The EXISTS predicate can be executed using the foreign key index on the ACTOR_ID column only – that's all we need for this query, despite our having written SELECT *.

Summary

Luckily, all databases can remove the projection in EXISTS subqueries:

Database        |EXISTS projection
----------------|-----------------
DB2 LUW 10.5    |Yep
MySQL 8.0.2     |Yep
Oracle 12.2.0.1 |Yep
PostgreSQL 9.6  |Yep
SQL Server 2014 |Yep

6. Predicate Merging

This one is interesting and has bitten me in the past when I erroneously assumed that a given database could do it.

Consider the following query:

SELECT * 
FROM actor
WHERE actor_id IN (2, 3, 4)
AND actor_id IN (1, 2, 3);

Obviously, the two predicates overlap and can be merged. I would expect the database to transform the above into:

SELECT * 
FROM actor
WHERE actor_id IN (2, 3);

Looks obvious, right? It is a more sophisticated case of transitive closure. Another case would be merging two ranges. When running the following query:

SELECT * 
FROM film
WHERE film_id BETWEEN 1 AND 100
AND film_id BETWEEN 99 AND 200

We’d hope for the database to rewrite the query to this:

SELECT * 
FROM film
WHERE film_id BETWEEN 99 AND 100

The cardinality of the latter predicate is 2 rows, but with the original, combined ranges, the estimate might not reflect that, and the database might choose a full table scan when it should use the index.

Which database can do these optimisations?

DB2

Merging IN predicates

Yes

Explain Plan                                      
--------------------------------------------------
ID | Operation         |               Rows | Cost
 1 | RETURN            |                    |   11
 2 |  FETCH ACTOR      |   2 of 2 (100.00%) |   11
 3 |   IXSCAN PK_ACTOR | 2 of 200 (  1.00%) |    0
                                                  
Predicate Information                             
 3 - SARG Q3.ACTOR_ID IN (2, 3)                   

Merging range predicates

Yes (but don’t be fooled by the plan!)

Explain Plan                                      
--------------------------------------------------
ID | Operation        |                Rows | Cost
 1 | RETURN           |                     |   13
 2 |  FETCH FILM      |    2 of 2 (100.00%) |   13
 3 |   IXSCAN PK_FILM | 2 of 1000 (   .20%) |    6
                                                  
Predicate Information                             
 3 - START (99 <= Q1.FILM_ID)                     
      STOP (Q1.FILM_ID <= 100)                    
      SARG (Q1.FILM_ID <= 200)                    
      SARG (1 <= Q1.FILM_ID)                      

As you can see, the predicate was not optimised away entirely. There’s still a filter (SARG) that checks for the overall upper and lower bounds of the combined range, but the important bits are the START and STOP operations, which indicate fast index access. Besides, the cardinality is also correct.

If you want to be sure, just run this impossible predicate here:

SELECT * 
FROM film
WHERE film_id BETWEEN 1 AND 2
AND film_id BETWEEN 199 AND 200;

… which yields the correct plan:

Explain Plan                       
-----------------------------------
ID | Operation      |   Rows | Cost
 1 | RETURN         |        |    0
 2 |  TBSCAN GENROW | 0 of 0 |    0
                                   
Predicate Information              
 2 - RESID (1 = 0)                 

MySQL

Merging IN predicates

Again, unfortunately, MySQL doesn’t display the predicate information very nicely. We get the same plan for both queries:

ID  TABLE  TYPE   KEY      ROWS  FILTERED  EXTRA
------------------------------------------------------
1   actor  range  PRIMARY  2     100.00    Using where

Both plans show the same cardinalities and the same “Using where”, with no indication of what exactly is being done inside that “where”, but given the cardinality, we can assume that the transformation happened correctly. We can look at it from a different angle. Let’s try this query:

SELECT * FROM actor
WHERE actor_id IN (3, 4, 5)
AND actor_id IN (1, 2, 3);

Which should be transformed into this one:

SELECT * FROM actor
WHERE actor_id = 3;

And indeed, it happens:

ID  TABLE  TYPE   KEY      ROWS  FILTERED  EXTRA
------------------------------------------------------
1   actor  const  PRIMARY  1     100.00

Observe how TYPE=range changed to TYPE=const.

So, we can conclude that yes, MySQL implements this optimisation.

Merging range predicates

Again, the plan is not helpful at all:

ID  TABLE  TYPE   KEY      ROWS  FILTERED  EXTRA
------------------------------------------------------
1   film   range  PRIMARY  2     100.00    Using where

But we can again prove that the optimisation is being done by creating an “impossible” predicate, like this:

SELECT * 
FROM film
WHERE film_id BETWEEN 1 AND 2
AND film_id BETWEEN 199 AND 200

… in which case the plan switches to:

ID  TABLE  EXTRA
-----------------------------------------
1          no matching row in const table

So, again, good news for MySQL.

Oracle

Merging IN predicates

Yes

----------------------------------------------------------
| Id  | Operation                    | Name     | E-Rows |
----------------------------------------------------------
|   0 | SELECT STATEMENT             |          |        |
|   1 |  INLIST ITERATOR             |          |        |
|   2 |   TABLE ACCESS BY INDEX ROWID| ACTOR    |      2 |
|*  3 |    INDEX UNIQUE SCAN         | PK_ACTOR |      2 |
----------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access(("ACTOR_ID"=2 OR "ACTOR_ID"=3))

The predicate being applied only includes the values 2 and 3, so the transformation has worked out correctly.

Merging range predicates

Again, yes:

----------------------------------------------------------------
| Id  | Operation                           | Name    | E-Rows |
----------------------------------------------------------------
|   0 | SELECT STATEMENT                    |         |        |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| FILM    |      2 |
|*  2 |   INDEX RANGE SCAN                  | PK_FILM |      2 |
----------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("FILM_ID">=99 AND "FILM_ID"<=100)

PostgreSQL

Merging IN predicates

Regrettably, no, this is not optimised!

QUERY PLAN                                                                                     
-----------------------------------------------------------------------------------------------
Seq Scan on actor  (cost=0.00..5.50 rows=1 width=25)                                           
  Filter: ((actor_id = ANY ('{2,3,4}'::integer[])) AND (actor_id = ANY ('{1,2,3}'::integer[])))

Both predicates are still present in the execution plan, and the cardinality estimate is wrong – it should be 2, not 1. If I manually transform the query, I’m getting this plan instead:

QUERY PLAN                                           
-----------------------------------------------------
Seq Scan on actor  (cost=0.00..4.50 rows=2 width=25) 
  Filter: (actor_id = ANY ('{2,3}'::integer[]))      

In particular, we can see the wrong plan if the two predicates do not overlap, in which case an impossible predicate is formed:

SELECT * 
FROM actor
WHERE actor_id IN (2, 3, 4)
AND actor_id IN (7, 8, 9)

Still, this yields a “wrong” plan:

QUERY PLAN                                                                                     
-----------------------------------------------------------------------------------------------
Seq Scan on actor  (cost=0.00..5.50 rows=1 width=25)                                           
  Filter: ((actor_id = ANY ('{2,3,4}'::integer[])) AND (actor_id = ANY ('{7,8,9}'::integer[])))

Bummer!

Merging range predicates

This doesn’t look any better:

QUERY PLAN                                                                                  
--------------------------------------------------------------------------------------------
Index Scan using film_pkey on film  (cost=0.28..8.30 rows=1 width=386)                      
  Index Cond: ((film_id >= 1) AND (film_id <= 100) AND (film_id >= 99) AND (film_id <= 200))

Now, it’s hard to say whether this worked or not. Ultimately, we have gotten the correct plan with a reasonable cardinality as before, and it might just work out as it did on DB2. But what happens if we again create an impossible predicate?

SELECT * 
FROM film
WHERE film_id BETWEEN 1 AND 2
AND film_id BETWEEN 199 AND 200;

The plan got worse:

QUERY PLAN                                                                                 
-------------------------------------------------------------------------------------------
Index Scan using film_pkey on film  (cost=0.28..8.42 rows=5 width=386)                     
  Index Cond: ((film_id >= 1) AND (film_id <= 2) AND (film_id >= 199) AND (film_id <= 200))

The cardinality increased instead of decreasing! And after all, this query should not have to be run at all. No points for PostgreSQL.

SQL Server

Merging IN predicates

Yes, this works:

  |--Nested Loops(Inner Join)
       |--Index Seek(SEEK:([actor_id]=(2) OR [actor_id]=(3)))
       |--RID Lookup(OBJECT:([actor]))

Merging range predicates

This again looks like the DB2 case:

  |--Nested Loops(Inner Join)
       |--Index Seek(SEEK:([film_id] >= (1) AND [film_id] <= (100)), WHERE:([film_id]>=(99) AND [film_id]<=(200)))
       |--RID Lookup(OBJECT:([film]))

Unfortunately, note the distinction between SEEK and WHERE. We want the range [99, 100] in the SEEK predicate (as DB2 had it), because SEEK is the fast O(log N) index access, whereas WHERE is applied linearly in O(N) time. Bummer!

This looks like a bug to me, because the impossible predicate yields a more reasonable:

  |--Constant Scan

Summary

Note that there are many different kinds of predicates that might be merged in one database but not in the other. If in doubt, do check your execution plans!

Database         Merging IN  Merging ranges
DB2 LUW 10.5     Yep         Yep
MySQL 8.0.2      Yep         Yep
Oracle 12.2.0.1  Yep         Yep
PostgreSQL 9.6   Nope        Nope
SQL Server 2014  Yep         Nope

7. Provably Empty Sets

This one is really cool. We’ve seen Impossible predicates and unneeded table accesses before. What if we do this again, but this time with a JOIN? Can JOIN elimination kick in, too?

We’re trying these queries:

IS NULL on NOT NULL column

The predicate in the WHERE clause cannot be TRUE, because we have a NOT NULL constraint on the FILM_ID column.

SELECT first_name, last_name
FROM actor a
JOIN (
  SELECT *
  FROM film_actor
  WHERE film_id IS NULL
) fa ON a.actor_id = fa.actor_id;

The derived table FA cannot return any rows, because of that NOT NULL constraint on the FA.FILM_ID column, so it is provably empty. Because an INNER JOIN with an empty table cannot produce any rows either, this should save us from accessing the ACTOR table, so the above query should be rewritten to something like this:

SELECT NULL AS first_name, NULL AS last_name
WHERE 1 = 0;

I.e. the predicate is never evaluated and the JOIN is eliminated.

INTERSECT NULL and NOT NULL columns

In principle, this is the same as the previous example, but using a bit more sophisticated syntax:

SELECT *
FROM actor a
JOIN (
  SELECT actor_id, film_id
  FROM film_actor
  INTERSECT
  SELECT NULL, NULL
  FROM dual
) fa ON a.actor_id = fa.actor_id;

Because of the NOT NULL constraints on both FA.ACTOR_ID and FA.FILM_ID, an INTERSECT operation with a (NULL, NULL) tuple should not yield any results, and thus the derived table is provably empty, and thus the INNER JOIN can be eliminated.

Funky, but why not?

Let’s repeat, with EXISTS

Finally, let’s repeat the above type of query, but this time with a SEMI JOIN instead of an INNER JOIN. First, with an impossible predicate:

SELECT *
FROM actor a
WHERE a.actor_id IN (
  SELECT actor_id
  FROM film_actor
  WHERE actor_id IS NULL
);

… then again with an intersection.

SELECT *
FROM actor a
WHERE a.actor_id IN (
  SELECT actor_id
  FROM film_actor
  INTERSECT
  SELECT NULL
  FROM sysibm.dual
)

Let’s go. Which database can do which optimisation?

DB2

Joining a provably empty set (IS NULL predicate):

Explain Plan                       
-----------------------------------
ID | Operation      |   Rows | Cost
 1 | RETURN         |        |    0
 2 |  TBSCAN GENROW | 0 of 0 |    0
                                   
Predicate Information              
 2 - RESID (1 = 0)                 

Joining a provably empty set (INTERSECT):

Explain Plan                       
-----------------------------------
ID | Operation      |   Rows | Cost
 1 | RETURN         |        |    0
 2 |  TBSCAN GENROW | 0 of 0 |    0
                                   
Predicate Information              
 2 - RESID (1 = 0)                 

Semi joining a provably empty set (IS NULL predicate):

Explain Plan                       
-----------------------------------
ID | Operation      |   Rows | Cost
 1 | RETURN         |        |    0
 2 |  TBSCAN GENROW | 0 of 0 |    0
                                   
Predicate Information              
 2 - RESID (1 = 0)                 

Semi joining a provably empty set (INTERSECT):

Explain Plan                       
-----------------------------------
ID | Operation      |   Rows | Cost
 1 | RETURN         |        |    0
 2 |  TBSCAN GENROW | 0 of 0 |    0
                                   
Predicate Information              
 2 - RESID (1 = 0)                 

Wow, cool! Looks like a winner!

MySQL

Joining a provably empty set (IS NULL predicate):

ID  TABLE   EXTRA
----------------------------
1           Impossible WHERE

Cool! I didn’t expect this!

Joining a provably empty set (INTERSECT):

MySQL doesn’t support INTERSECT, regrettably.

Semi joining a provably empty set (IS NULL predicate):

ID  TABLE   EXTRA
----------------------------
1           Impossible WHERE

Semi joining a provably empty set (INTERSECT):

MySQL doesn’t support INTERSECT, regrettably.

But still, that’s a great result for MySQL!

Oracle

Joining a provably empty set (IS NULL predicate):

---------------------------------------------------------------------------
| Id  | Operation              | Name          | Starts | E-Rows | A-Rows |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |               |      1 |        |      0 |
|*  1 |  FILTER                |               |      1 |        |      0 |
|*  2 |   HASH JOIN            |               |      0 |   5462 |      0 |
|   3 |    TABLE ACCESS FULL   | ACTOR         |      0 |    200 |      0 |
|   4 |    INDEX FAST FULL SCAN| PK_FILM_ACTOR |      0 |   5462 |      0 |
---------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(NULL IS NOT NULL)
   2 - access("A"."ACTOR_ID"="FILM_ACTOR"."ACTOR_ID")

Again, a very confusing execution plan in Oracle, but the NULL IS NOT NULL filter is there, and it happens before all the other operations, which are not executed.

Joining a provably empty set (INTERSECT):

---------------------------------------------------------------------------------
| Id  | Operation                    | Name          | Starts | E-Rows | A-Rows |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |               |      1 |        |      0 |
|   1 |  NESTED LOOPS                |               |      1 |      1 |      0 |
|   2 |   NESTED LOOPS               |               |      1 |      1 |      0 |
|   3 |    VIEW                      |               |      1 |      1 |      0 |
|   4 |     INTERSECTION             |               |      1 |        |      0 |
|   5 |      SORT UNIQUE             |               |      1 |   5462 |   5463 |
|   6 |       INDEX FAST FULL SCAN   | PK_FILM_ACTOR |      1 |   5462 |   5463 |
|   7 |      FAST DUAL               |               |      1 |      1 |      1 |
|*  8 |    INDEX UNIQUE SCAN         | PK_ACTOR      |      0 |      1 |      0 |
|   9 |   TABLE ACCESS BY INDEX ROWID| ACTOR         |      0 |      1 |      0 |
---------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   8 - access("A"."ACTOR_ID"="FA"."ACTOR_ID")

Interesting. This plan will indeed access the entire FILM_ACTOR primary key. It can save accesses to the ACTOR table and primary key index, because it evaluates the derived table first (which yields no rows), but still, operations Id=5 and Id=6 should not be there. Bummer!

Semi joining a provably empty set (IS NULL predicate):

This works again:

-------------------------------------------------------------------------------------
| Id  | Operation              | Name                    | Starts | E-Rows | A-Rows |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                         |      1 |        |      0 |
|*  1 |  FILTER                |                         |      1 |        |      0 |
|*  2 |   HASH JOIN SEMI       |                         |      0 |    200 |      0 |
|   3 |    TABLE ACCESS FULL   | ACTOR                   |      0 |    200 |      0 |
|   4 |    INDEX FAST FULL SCAN| IDX_FK_FILM_ACTOR_ACTOR |      0 |   5462 |      0 |
-------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(NULL IS NOT NULL)
   2 - access("A"."ACTOR_ID"="ACTOR_ID")

… with the same confusing plan that keeps around the unexecuted subtree.

Semi joining a provably empty set (INTERSECT):

Again, no optimisation here:

-------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                    | Starts | E-Rows | A-Rows |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                         |      1 |        |      0 |
|   1 |  NESTED LOOPS                |                         |      1 |      1 |      0 |
|   2 |   NESTED LOOPS               |                         |      1 |      1 |      0 |
|   3 |    VIEW                      | VW_NSO_1                |      1 |      1 |      0 |
|   4 |     INTERSECTION             |                         |      1 |        |      0 |
|   5 |      SORT UNIQUE             |                         |      1 |   5462 |    200 |
|   6 |       INDEX FAST FULL SCAN   | IDX_FK_FILM_ACTOR_ACTOR |      1 |   5462 |   5463 |
|   7 |      FAST DUAL               |                         |      1 |      1 |      1 |
|*  8 |    INDEX UNIQUE SCAN         | PK_ACTOR                |      0 |      1 |      0 |
|   9 |   TABLE ACCESS BY INDEX ROWID| ACTOR                   |      0 |      1 |      0 |
-------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   8 - access("A"."ACTOR_ID"="ACTOR_ID")

Not so good!

PostgreSQL

Disappointingly, PostgreSQL doesn’t fare well in this experiment!

Joining a provably empty set (IS NULL predicate):

Nope:

QUERY PLAN                                                                                  
--------------------------------------------------------------------------------------------
Hash Join  (cost=8.31..13.07 rows=1 width=13)                                               
  Hash Cond: (a.actor_id = film_actor.actor_id)                                             
  ->  Seq Scan on actor a  (cost=0.00..4.00 rows=200 width=17)                              
  ->  Hash  (cost=8.30..8.30 rows=1 width=2)                                                
        ->  Index Scan using idx_fk_film_id on film_actor  (cost=0.28..8.30 rows=1 width=2) 
              Index Cond: (film_id IS NULL)                                                 

Joining a provably empty set (INTERSECT):

Even worse:

QUERY PLAN                                                                                         
---------------------------------------------------------------------------------------------------
Hash Join  (cost=166.60..171.36 rows=1 width=29)                                                   
  Hash Cond: (a.actor_id = fa.actor_id)                                                            
  ->  Seq Scan on actor a  (cost=0.00..4.00 rows=200 width=25)                                     
  ->  Hash  (cost=166.59..166.59 rows=1 width=4)                                                   
        ->  Subquery Scan on fa  (cost=0.00..166.59 rows=1 width=4)                                
              ->  HashSetOp Intersect  (cost=0.00..166.58 rows=1 width=8)                          
                    ->  Append  (cost=0.00..139.26 rows=5463 width=8)                              
                          ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..0.02 rows=1 width=8)      
                                ->  Result  (cost=0.00..0.01 rows=1 width=4)                       
                          ->  Subquery Scan on "*SELECT* 1"  (cost=0.00..139.24 rows=5462 width=8) 
                                ->  Seq Scan on film_actor  (cost=0.00..84.62 rows=5462 width=4)   

Semi joining a provably empty set (IS NULL predicate):

Same as inner join:

QUERY PLAN                                                                                       
-------------------------------------------------------------------------------------------------
Hash Semi Join  (cost=6.06..10.60 rows=1 width=25)                                               
  Hash Cond: (a.actor_id = film_actor.actor_id)                                                  
  ->  Seq Scan on actor a  (cost=0.00..4.00 rows=200 width=25)                                   
  ->  Hash  (cost=6.05..6.05 rows=1 width=2)                                                     
        ->  Index Only Scan using film_actor_pkey on film_actor  (cost=0.28..6.05 rows=1 width=2)
              Index Cond: (actor_id IS NULL)                                                     

Semi joining a provably empty set (INTERSECT):

Unsurprisingly:

QUERY PLAN                                                                                        
--------------------------------------------------------------------------------------------------
Hash Semi Join  (cost=152.94..157.48 rows=1 width=25)                                             
  Hash Cond: (a.actor_id = "ANY_subquery".actor_id)                                               
  ->  Seq Scan on actor a  (cost=0.00..4.00 rows=200 width=25)                                    
  ->  Hash  (cost=152.93..152.93 rows=1 width=2)                                                  
        ->  Subquery Scan on "ANY_subquery"  (cost=0.00..152.93 rows=1 width=2)                   
              ->  HashSetOp Intersect  (cost=0.00..152.92 rows=1 width=6)                         
                    ->  Append  (cost=0.00..139.26 rows=5463 width=6)                             
                          ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..0.02 rows=1 width=6)     
                                ->  Result  (cost=0.00..0.01 rows=1 width=2)                      
                          ->  Subquery Scan on "*SELECT* 1"  (cost=0.00..139.24 rows=5462 width=6)
                                ->  Seq Scan on film_actor  (cost=0.00..84.62 rows=5462 width=2)  

SQL Server

SQL Server shines, like DB2:

Joining a provably empty set (IS NULL predicate):

  |--Constant Scan

Joining a provably empty set (INTERSECT):

  |--Constant Scan

Semi joining a provably empty set (IS NULL predicate):

  |--Constant Scan

Semi joining a provably empty set (INTERSECT):

  |--Constant Scan

Summary

Database         JOIN / NULL  JOIN / INTERSECT  SEMI JOIN / NULL  SEMI JOIN / INTERSECT
DB2 LUW 10.5     Yep          Yep               Yep               Yep
MySQL 8.0.2      Yep          Not supported     Yep               Not supported
Oracle 12.2.0.1  Yep          Nope              Yep               Nope
PostgreSQL 9.6   Nope         Nope              Nope              Nope
SQL Server 2014  Yep          Yep               Yep               Yep

On a side note, this could be done in thousands of other ways. Feel free to comment with your own ideas on how to create “provably empty sets” to see if this is optimised by any of the databases.

8. CHECK Constraints

Oh, this is cool! Our Sakila database has a CHECK constraint on the FILM.RATING column:

CREATE TABLE film (
  ..
  RATING varchar(10) DEFAULT 'G',
  ..
  CONSTRAINT check_special_rating 
    CHECK (rating IN ('G','PG','PG-13','R','NC-17')),
  ..
);

Seriously, use CHECK constraints for data integrity. The cost of adding them is very low – much lower than that of other constraints like PRIMARY KEY, UNIQUE, and FOREIGN KEY constraints, as they do not need an index to be enforced, so you get them almost for “free”.
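
If a table doesn’t have such a constraint yet, adding one after the fact is a single DDL statement – a sketch for illustration only, since Sakila already ships with the constraint above:

ALTER TABLE film
  ADD CONSTRAINT check_special_rating
  CHECK (rating IN ('G','PG','PG-13','R','NC-17'));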

But there’s also an interesting optimisation aspect here! Check out these queries:

Impossible predicate

We’ve seen impossible predicates before, even with NOT NULL constraints (which are special types of CHECK constraints, in fact), but this is even more powerful:

SELECT *
FROM film
WHERE rating = 'N/A';

There can be no such film, because the CHECK constraint prevents its insertion (or update). This should again be transformed into a NOOP. Now, what about this?

CREATE INDEX idx_film_rating ON film (rating);

SELECT count(*)
FROM film
WHERE rating NOT IN ('G','PG','PG-13','R');

With the above index, we should probably simply run a quick index scan to count all the films with rating = ‘NC-17’, because that’s the only remaining rating. So the query should be rewritten to this:

SELECT count(*)
FROM film
WHERE rating = 'NC-17';

It should be, regardless of the index, because comparing the column with a single value is faster than comparing it with 4 values.

So, which database can do these things?

DB2

Impossible predicate (rating = ‘N/A’)

Cool!

Explain Plan                       
-----------------------------------
ID | Operation      |   Rows | Cost
 1 | RETURN         |        |    0
 2 |  TBSCAN GENROW | 0 of 0 |    0
                                   
Predicate Information              
 2 - RESID (1 = 0)                 

Inverse predicate (rating = ‘NC-17’)

Nope…

Explain Plan                                                
------------------------------------------------------------
ID | Operation                |                  Rows | Cost
 1 | RETURN                   |                       |   34
 2 |  GRPBY (COMPLETE)        |    1 of 210 (   .48%) |   34
 3 |   IXSCAN IDX_FILM_RATING | 210 of 1000 ( 21.00%) |   34
                                                            
Predicate Information                                       
 3 - SARG  NOT(Q1.RATING IN ('G', 'PG', 'PG-13', 'R'))      

While the index is used on ID=3 and the cardinalities are correct, it is scanned entirely, as we do not have a range predicate but a “SARG” predicate. For more details, see Markus Winand’s overview here.

We can also show this by manually inverting the predicate to get:

Explain Plan                                                
------------------------------------------------------------
ID | Operation                |                  Rows | Cost
 1 | RETURN                   |                       |    7
 2 |  GRPBY (COMPLETE)        |    1 of 210 (   .48%) |    7
 3 |   IXSCAN IDX_FILM_RATING | 210 of 1000 ( 21.00%) |    7
                                                            
Predicate Information                                       
 3 - START (Q1.RATING = 'NC-17')                            
      STOP (Q1.RATING = 'NC-17')                            

Now, we’re getting the desired range predicate.

MySQL

MySQL supports the CHECK constraint syntax but doesn’t enforce it for whatever reason. Try this:

CREATE TABLE x (a INT CHECK (a != 0));
INSERT INTO x VALUES (0);
SELECT * FROM x;

You’ll get:

A
-
0

Zero points for MySQL (really, why not just support CHECK constraints?)

Oracle

Impossible predicate (rating = ‘N/A’)

--------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |
--------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      0 |
|*  1 |  FILTER            |      |      1 |        |      0 |
|*  2 |   TABLE ACCESS FULL| FILM |      0 |     89 |      0 |
--------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(NULL IS NOT NULL)
   2 - filter("RATING"='N/A')

Again, the super confusing NULL IS NOT NULL filter that cuts off the FULL TABLE SCAN, which might as well be removed entirely from the plan. But at least it works!

Inverse predicate (rating = ‘NC-17’)

Ooops:

----------------------------------------------------------------------------
| Id  | Operation             | Name            | Starts | E-Rows | A-Rows |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                 |      1 |        |      1 |
|   1 |  SORT AGGREGATE       |                 |      1 |      1 |      1 |
|*  2 |   INDEX FAST FULL SCAN| IDX_FILM_RATING |      1 |    415 |    210 |
----------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter(("RATING"<>'PG-13' AND "RATING"<>'R' AND "RATING"<>'PG' AND "RATING"<>'G'))

The predicate could not be inverted: we get a cardinality estimate that is way off, an INDEX FAST FULL SCAN instead of an INDEX RANGE SCAN, and a filter predicate rather than an access predicate. Here’s what we should have gotten, e.g. when manually inverting the predicate:

------------------------------------------------------------------------
| Id  | Operation         | Name            | Starts | E-Rows | A-Rows |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                 |      1 |        |      1 |
|   1 |  SORT AGGREGATE   |                 |      1 |      1 |      1 |
|*  2 |   INDEX RANGE SCAN| IDX_FILM_RATING |      1 |    210 |    210 |
------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("RATING"='NC-17')

Bummer!

PostgreSQL

Note that the Sakila database in its PostgreSQL version uses an ENUM type instead of a CHECK constraint on the RATING column. I’ve duplicated the table to use a CHECK constraint instead.
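
The exact DDL isn’t reproduced here, but a minimal sketch might look like this (the FILM2 name matches the plans below; the column list and constraint name are assumptions):

-- Duplicate the table with a plain text rating column instead of the ENUM:
CREATE TABLE film2 AS
SELECT film_id, title, length, rating::text AS rating
FROM film;

ALTER TABLE film2
  ADD CONSTRAINT check_special_rating
  CHECK (rating IN ('G','PG','PG-13','R','NC-17'));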

Impossible predicate (rating = ‘N/A’)

Doesn’t work:

QUERY PLAN                                            
------------------------------------------------------
Seq Scan on film2  (cost=0.00..67.50 rows=1 width=385)
  Filter: ((rating)::text = 'N/A'::text)              

Inverse predicate (rating = ‘NC-17’)

Also nope:

QUERY PLAN                                                        
------------------------------------------------------------------
Aggregate  (cost=70.53..70.54 rows=1 width=8)                     
  ->  Seq Scan on film2  (cost=0.00..70.00 rows=210 width=0)      
        Filter: ((rating)::text <> ALL ('{G,PG,PG-13,R}'::text[]))

Too bad!

NOTE: As was kindly pointed out by David Rowley in the comments, this optimisation can be opted into by specifying:

SET constraint_exclusion TO on;

SQL Server

Impossible predicate (rating = ‘N/A’)

Yes!

  |--Constant Scan

Inverse predicate (rating = ‘NC-17’)

Also yes!

  |--Compute Scalar
       |--Stream Aggregate
            |--Index Seek(OBJECT:([idx_film_rating]), SEEK:([rating]='NC-17'))

Summary

Database         Impossible predicate  Inverse predicate
DB2 LUW 10.5     Yep                   Nope
MySQL 8.0.2      Not supported         Not supported
Oracle 12.2.0.1  Yep                   Nope
PostgreSQL 9.6   Nope                  Nope
SQL Server 2014  Yep                   Yep

9. Unneeded Self JOIN

When your queries get more complex, it might well happen that you’re going to self JOIN a table based on its primary key. Trust me, this is common practice when you build complex views and JOIN them to each other, so a database that notices this is performing a crucial step in optimising complex SQL. I won’t show a complex example here, but a simple one, e.g.

SELECT a1.first_name, a1.last_name
FROM actor a1
JOIN actor a2 ON a1.actor_id = a2.actor_id;

This could be considered a special case of JOIN elimination, as we don’t really need the JOIN of A2 – we can do everything with A1 alone. Now, INNER JOIN elimination normally works in the presence of a FOREIGN KEY only, which we don’t have here. But because of the PRIMARY KEY on ACTOR_ID, we can prove that in fact A1 = A2. In a way, this is transitive closure all over again.
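
For contrast, here’s a quick sketch of the classic, FOREIGN KEY based JOIN elimination case (not part of this self-join experiment): FILM_ACTOR.ACTOR_ID is a NOT NULL foreign key referencing ACTOR’s primary key, so the JOIN neither filters nor duplicates FILM_ACTOR rows and can be dropped as long as no ACTOR columns are projected.

SELECT fa.film_id, fa.actor_id
FROM film_actor fa
JOIN actor a ON fa.actor_id = a.actor_id;

-- ... which a database implementing FK-based JOIN elimination may treat as:
SELECT fa.film_id, fa.actor_id
FROM film_actor fa;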

We can take this one step further and use columns from both A1 and A2:

SELECT a1.first_name, a2.last_name
FROM actor a1
JOIN actor a2 ON a1.actor_id = a2.actor_id;

In the classic JOIN elimination case, we could no longer eliminate the JOIN because we’re projecting from both tables. But since we’ve already proven that A1 = A2, we can use them interchangeably, so the expectation is for this query to be transformed into:

SELECT first_name, last_name
FROM actor;

Who can do this?

DB2

Projecting from A1 only

Yes:

Explain Plan                                    
------------------------------------------------
ID | Operation     |                 Rows | Cost
 1 | RETURN        |                      |   20
 2 |  TBSCAN ACTOR | 200 of 200 (100.00%) |   20

Projecting from A1 and A2

… and yes:

Explain Plan                                    
------------------------------------------------
ID | Operation     |                 Rows | Cost
 1 | RETURN        |                      |   20
 2 |  TBSCAN ACTOR | 200 of 200 (100.00%) |   20

MySQL

Projecting from A1 only

Nope

ID  TABLE  REF          EXTRA
-----------------------------------
1   a1
1   a2     a1.actor_id  Using index

Projecting from A1 and A2

… and nope

ID  TABLE  REF          EXTRA
-----------------------------------
1   a1
1   a2     a1.actor_id  

That’s disappointing…

Oracle

Projecting from A1 only

Yes

--------------------------------------------
| Id  | Operation         | Name  | E-Rows |
--------------------------------------------
|   0 | SELECT STATEMENT  |       |        |
|   1 |  TABLE ACCESS FULL| ACTOR |    200 |
--------------------------------------------

Projecting from A1 and A2

And yes

--------------------------------------------
| Id  | Operation         | Name  | E-Rows |
--------------------------------------------
|   0 | SELECT STATEMENT  |       |        |
|   1 |  TABLE ACCESS FULL| ACTOR |    200 |
--------------------------------------------

PostgreSQL

Projecting from A1 only

Nope:

QUERY PLAN                                                          
--------------------------------------------------------------------
Hash Join  (cost=6.50..13.25 rows=200 width=13)                     
  Hash Cond: (a1.actor_id = a2.actor_id)                            
  ->  Seq Scan on actor a1  (cost=0.00..4.00 rows=200 width=17)     
  ->  Hash  (cost=4.00..4.00 rows=200 width=4)                      
        ->  Seq Scan on actor a2  (cost=0.00..4.00 rows=200 width=4)

Projecting from A1 and A2

And nope:

QUERY PLAN                                                           
---------------------------------------------------------------------
Hash Join  (cost=6.50..13.25 rows=200 width=13)                      
  Hash Cond: (a1.actor_id = a2.actor_id)                             
  ->  Seq Scan on actor a1  (cost=0.00..4.00 rows=200 width=10)      
  ->  Hash  (cost=4.00..4.00 rows=200 width=11)                      
        ->  Seq Scan on actor a2  (cost=0.00..4.00 rows=200 width=11)

SQL Server

Projecting from A1 only

Surprisingly, no! (But remember, this is SQL Server 2014, maybe this got fixed in a more recent version. I should definitely upgrade!)

  |--Merge Join(Inner Join, MERGE:([a2].[actor_id])=([a1].[actor_id]))
       |--Index Scan(OBJECT:([a2]))
       |--Sort(ORDER BY:([a1].[actor_id] ASC))
            |--Table Scan(OBJECT:([a1]))

Projecting from A1 and A2

Also no, and even with a different, worse plan:

  |--Hash Match(Inner Join, HASH:([a1].[actor_id])=([a2].[actor_id]))
       |--Table Scan(OBJECT:([sakila].[dbo].[actor] AS [a1]))
       |--Table Scan(OBJECT:([sakila].[dbo].[actor] AS [a2]))

Summary

I would have frankly expected this to work on all databases, but I was proven very wrong, which is a shame. Along with JOIN elimination, this is one of the most crucial optimisations for enabling the building of huge SQL queries from reusable parts, such as views and table-valued functions. Unfortunately, this is not supported in 3 out of the 5 most popular databases.

Database         Self-join elimination,    Self-join elimination,
                 single table projection   complete projection
DB2 LUW 10.5     Yep                       Yep
MySQL 8.0.2      Nope                      Nope
Oracle 12.2.0.1  Yep                       Yep
PostgreSQL 9.6   Nope                      Nope
SQL Server 2014  Nope                      Nope

10. Predicate Pushdown

This optimisation doesn’t belong here 100%, because it is not entirely true that this transformation is never cost based. But since I cannot think of a single obvious reason why an optimiser should not push down predicates into derived tables, I’m listing it here along with the other, non-cost-based optimisations.

Consider this query:

SELECT *
FROM (
  SELECT *
  FROM actor
) a
WHERE a.actor_id = 1;

The derived table has absolutely no value in this query and it should be eliminated as well, by unnesting it. But let’s ignore that for a moment.

We’d expect the database to perform this query instead:

SELECT *
FROM (
  SELECT *
  FROM actor
  WHERE actor_id = 1
) a;

And then again, possibly, unnest the derived table entirely.

A more sophisticated example would be when using UNION:

SELECT *
FROM (
  SELECT first_name, last_name, 'actor' type
  FROM actor
  UNION ALL
  SELECT first_name, last_name, 'customer' type
  FROM customer
) people
WHERE people.last_name = 'DAVIS';

The result of this query is:

FIRST_NAME  LAST_NAME  TYPE
----------------------------
JENNIFER    DAVIS      actor
SUSAN       DAVIS      actor
SUSAN       DAVIS      actor
JENNIFER    DAVIS      customer

Now, we’d love the database optimiser to run this statement instead:

SELECT *
FROM (
  SELECT first_name, last_name, 'actor' type
  FROM actor
  WHERE last_name = 'DAVIS'
  UNION ALL
  SELECT first_name, last_name, 'customer' type
  FROM customer
  WHERE last_name = 'DAVIS'
) people;

I.e. pushing down the predicate into the derived table, and from there on into the two UNION ALL subqueries, because after all, we have indexes on both ACTOR.LAST_NAME and CUSTOMER.LAST_NAME columns.

Again, this transformation might be motivated based on costs in most databases, but I still think it’s a no-brainer to do anyway, because it’s almost always better to reduce the number of processed tuples as early as possible in any algorithm. If you know a case where this transformation is a bad idea, please comment! I’d be very curious.

So, which databases can do this? (And please, this is so basic, yet important, let the answer be: all)

DB2

Simple derived table

Yes

Explain Plan                                      
--------------------------------------------------
ID | Operation         |               Rows | Cost
 1 | RETURN            |                    |    6
 2 |  FETCH ACTOR      |   1 of 1 (100.00%) |    6
 3 |   IXSCAN PK_ACTOR | 1 of 200 (   .50%) |    0
                                                  
Predicate Information                             
 3 - START (Q1.ACTOR_ID = 1)                      
      STOP (Q1.ACTOR_ID = 1)                      

UNION derived table

Yes, again:

Explain Plan                                                     
-----------------------------------------------------------------
ID | Operation                        |               Rows | Cost
 1 | RETURN                           |                    |   20
 2 |  UNION                           |             2 of 1 |   20
 3 |   FETCH CUSTOMER                 |   1 of 1 (100.00%) |   13
 4 |    IXSCAN IDX_CUSTOMER_LAST_NAME | 1 of 599 (   .17%) |    6
 5 |   FETCH ACTOR                    |   1 of 1 (100.00%) |    6
 6 |    IXSCAN IDX_ACTOR_LAST_NAME    | 1 of 200 (   .50%) |    0
                                                                 
Predicate Information                                            
 4 - START (Q1.LAST_NAME = 'DAVIS')                              
      STOP (Q1.LAST_NAME = 'DAVIS')                              
 6 - START (Q3.LAST_NAME = 'DAVIS')                              
      STOP (Q3.LAST_NAME = 'DAVIS')                              

Also, in both cases, the derived table (view) was removed from the plan as it is not really necessary.

MySQL

Simple derived table

Yes

ID  TABLE  TYPE   KEY      REF    EXTRA
---------------------------------------
1   actor  const  PRIMARY  const

The usual PRIMARY KEY access by a constant value is applied.

UNION derived table

Oops, nope

ID  SELECT_TYPE  TABLE       TYPE  KEY          REF    ROWS  EXTRA
------------------------------------------------------------------
1   PRIMARY     <derived2>   ref   <auto_key0>  const  10
2   DERIVED      actor       ALL                       200
3   UNION        customer    ALL                       599

The manual transformation would yield:

ID  SELECT_TYPE  TABLE       TYPE  KEY                  REF    ROWS  EXTRA
--------------------------------------------------------------------------
1   PRIMARY     <derived2>   ALL                               5
2   DERIVED      actor       ref   idx_actor_last_name  const  3
3   UNION        customer    ref   idx_last_name        const  1

That’s really a problem if you want to nest complex queries in MySQL!

Oracle

Simple derived table

Yes, works

---------------------------------------------------------------------------
| Id  | Operation                   | Name     | Starts | E-Rows | A-Rows |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |      1 |        |      1 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ACTOR    |      1 |      1 |      1 |
|*  2 |   INDEX UNIQUE SCAN         | PK_ACTOR |      1 |      1 |      1 |
---------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("ACTOR"."ACTOR_ID"=1)

The derived table has been unnested, too.

UNION derived table

Works as well:

---------------------------------------------------------------------------------
| Id  | Operation                             | Name                   | E-Rows |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                        |        |
|   1 |  VIEW                                 |                        |      4 |
|   2 |   UNION-ALL                           |                        |        |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| ACTOR                  |      3 |
|*  4 |     INDEX RANGE SCAN                  | IDX_ACTOR_LAST_NAME    |      3 |
|   5 |    TABLE ACCESS BY INDEX ROWID BATCHED| CUSTOMER               |      1 |
|*  6 |     INDEX RANGE SCAN                  | IDX_CUSTOMER_LAST_NAME |      1 |
---------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access("LAST_NAME"='DAVIS')
   6 - access("LAST_NAME"='DAVIS')

However, the derived table was not unnested. The Id=1 “VIEW” operation indicates that it’s still there. This isn’t a problem in this case – just, perhaps, a bit of cosmetic overhead.

PostgreSQL

Simple derived table

Yes, it works:

QUERY PLAN                                          
----------------------------------------------------
Seq Scan on actor  (cost=0.00..4.50 rows=1 width=25)
  Filter: (actor_id = 1)                            

Note, interestingly, PostgreSQL sometimes doesn’t even use the PRIMARY KEY for a single row lookup but scans the entire table. In this case, 200 rows x 25 bytes per row (“width”) fit in a single block, so why bother reading the index and generating more I/O for this small table access?
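
If you want to see the pushed-down predicate as an index condition anyway, you can temporarily discourage sequential scans for the session – a debugging sketch using PostgreSQL’s enable_seqscan planner parameter, not a production recommendation:

SET enable_seqscan = off;

EXPLAIN
SELECT *
FROM (
  SELECT *
  FROM actor
) a
WHERE a.actor_id = 1;

RESET enable_seqscan;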

UNION derived table

Yes, this works as well:

QUERY PLAN                                                                         
-----------------------------------------------------------------------------------
Append  (cost=0.00..12.83 rows=4 width=45)                                         
  ->  Seq Scan on actor  (cost=0.00..4.50 rows=3 width=45)                         
        Filter: ((last_name)::text = 'DAVIS'::text)                                
  ->  Index Scan using idx_last_name on customer  (cost=0.28..8.29 rows=1 width=45)
        Index Cond: ((last_name)::text = 'DAVIS'::text)                            

Again, the index on ACTOR.LAST_NAME is not used, but the one on CUSTOMER.LAST_NAME is, as the CUSTOMER table is quite a bit larger.

SQL Server

Simple derived table

Yep, works

  |--Nested Loops(Inner Join)
       |--Index Seek(SEEK:([actor_id]=(1)))
       |--RID Lookup(OBJECT:([actor]))

UNION derived table

Works as well.

  |--Concatenation
       |--Compute Scalar(DEFINE:([Expr1003]='actor'))
       |    |--Nested Loops(Inner Join)
       |         |--Index Seek(SEEK:([actor].[last_name]='DAVIS'))
       |         |--RID Lookup(OBJECT:([actor]))
       |--Compute Scalar(DEFINE:([Expr1007]='customer'))
            |--Nested Loops(Inner Join)
                 |--Index Seek(SEEK:([customer].[last_name]='DAVIS'))
                 |--RID Lookup(OBJECT:([customer]))

Summary

My hopes were shattered. MySQL 8.0.2 doesn’t support this simple optimisation completely yet. All others do, however:

Database         Simple derived table pushdown  UNION derived table pushdown
DB2 LUW 10.5     Yep                            Yep
MySQL 8.0.2      Yep                            Nope
Oracle 12.2.0.1  Yep                            Yep
PostgreSQL 9.6   Yep                            Yep
SQL Server 2014  Yep                            Yep

Conclusion

The list presented here is far from complete. There are many more of these simple SQL transformations that are (or should be) a no-brainer for a database to implement, even before the cost-based optimiser kicks in. They remove what I call unnecessary, optional work (as opposed to unnecessary, mandatory work). They are essential tools for:

  • Preventing silly mistakes from affecting SQL performance. Everyone makes mistakes, and as projects grow larger and SQL queries grow more complex, these mistakes might accumulate, yet hopefully, without effect
  • Enabling the reuse of complex building blocks, such as views and table-valued functions, which can be inlined into parent SQL queries, transformed, and parts removed or rewritten

These features are essential especially for the second bullet point. Without them, it is very difficult to build 4000 LOC SQL queries that still perform decently, based on a library of reusable SQL components.

Unfortunately for users of PostgreSQL and MySQL, these two popular Open Source databases still lag quite a bit behind their commercial counterparts DB2, Oracle, and SQL Server – with DB2 faring best in this article, and Oracle and SQL Server being roughly on par.

SQL is a wonderful language, because it is declarative and any statement can be rewritten to something simpler or more sophisticated, which performs much better than what the author has written. If you have liked this article, you may also like:

How to Execute a SQL Query Only if Another SQL Query has no Results

I stumbled upon an interesting question on Stack Overflow recently. A user wanted to query a table for a given predicate. If that predicate returns no rows, they wanted to run another query using a different predicate. Preferably in a single query.

Challenge accepted!

Canonical Idea: Use a Common Table Expression

We’re querying the Sakila database and we’re trying to find films of length 120 minutes. If there are no such films, then let’s find films of length 130 minutes. The following query is formally correct and runs without any adaptations on all of Oracle, PostgreSQL and SQL Server (and probably on other DBs too, as it’s pretty standard):

WITH r AS (
  SELECT * FROM film WHERE length = 120
)
SELECT * FROM r
UNION ALL
SELECT * FROM film
WHERE length = 130
AND NOT EXISTS (
  SELECT * FROM r
)

How does it work?

The common table expression (WITH clause) wraps the first query that we want to execute no matter what. We then select from the first query, and use UNION ALL to combine the result with the result of the second query, which we’re executing only if the first query didn’t yield any results (through NOT EXISTS). We’re hoping here that the database will be smart enough to run the existence check on a pre-calculated set from the first subquery, in order to be able to avoid running the second subquery.

Let’s see, which database actually does this.

PostgreSQL

Running EXPLAIN ANALYZE

EXPLAIN ANALYZE
WITH r AS (
  SELECT * FROM film WHERE length = 120
)
SELECT * FROM r
UNION ALL
SELECT * FROM film
WHERE length = 130
AND NOT EXISTS (
  SELECT * FROM r
)

… we can see the following plan:

Append  (cost=68.50..137.26 rows=15 width=561) (actual time=0.052..0.300 rows=9 loops=1)
  CTE r
    ->  Seq Scan on film film_1  (cost=0.00..68.50 rows=9 width=394) (actual time=0.047..0.289 rows=9 loops=1)
          Filter: (length = 120)
          Rows Removed by Filter: 991
  ->  CTE Scan on r  (cost=0.00..0.18 rows=9 width=672) (actual time=0.051..0.297 rows=9 loops=1)
  ->  Result  (cost=0.02..68.52 rows=6 width=394) (actual time=0.002..0.002 rows=0 loops=1)
        One-Time Filter: (NOT $1)
        InitPlan 2 (returns $1)
          ->  CTE Scan on r r_1  (cost=0.00..0.18 rows=9 width=0) (actual time=0.000..0.000 rows=1 loops=1)
        ->  Seq Scan on film  (cost=0.00..68.50 rows=6 width=394) (never executed)
              Filter: (length = 130)
Planning time: 0.952 ms
Execution time: 0.391 ms

So, indeed, the database seems to be smart enough to avoid the second query, because the first one does yield 9 rows.

Can we see this in a benchmark as well? In principle, the complete query should take about as much time in a benchmark as the Common Table Expression alone. Here’s the benchmark logic:

DO $$
DECLARE
  v_ts TIMESTAMP;
  v_repeat CONSTANT INT := 2000;
  rec RECORD;
BEGIN

  -- Repeat benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT * FROM film WHERE length = 120
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 1: %', r, 
      (clock_timestamp() - v_ts); 
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        WITH r AS (
          SELECT * FROM film WHERE length = 120
        )
        SELECT * FROM r
        UNION ALL
        SELECT * FROM film
        WHERE length = 130
        AND NOT EXISTS (
          SELECT * FROM r
        )
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 2: %', r, 
      (clock_timestamp() - v_ts); 
    RAISE INFO '';
  END LOOP;
END$$;

The result is:

INFO:  Run 1, Statement 1: 00:00:00.310325
INFO:  Run 1, Statement 2: 00:00:00.427744

INFO:  Run 2, Statement 1: 00:00:00.303202
INFO:  Run 2, Statement 2: 00:00:00.33568

INFO:  Run 3, Statement 1: 00:00:00.323699
INFO:  Run 3, Statement 2: 00:00:00.339835

INFO:  Run 4, Statement 1: 00:00:00.301084
INFO:  Run 4, Statement 2: 00:00:00.343838

INFO:  Run 5, Statement 1: 00:00:00.356343
INFO:  Run 5, Statement 2: 00:00:00.359891

As you can see, the second statement is consistently slower by around 5% – 10%. So we can safely say that the second subquery looking for length = 130 is not executed, but there’s still some overhead compared to making the decision in a client application to avoid that second subquery entirely. My guess is that this is due to PostgreSQL’s Common Table Expressions (CTE) being “optimisation fences”, i.e. the CTE is materialised every time. See also:
https://blog.2ndquadrant.com/postgresql-ctes-are-optimization-fences/

What about the inverse case?

In the above benchmark, we’ve measured how much time it takes when the first query succeeds (and the second query should be avoided). What about the inverse case, where the first query doesn’t match any rows and we have to run another query?

Benchmark time!

DO $$
DECLARE
  v_ts TIMESTAMP;
  v_repeat CONSTANT INT := 2000;
  rec RECORD;
BEGIN

  -- Repeat benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT * FROM film WHERE length = 1200
      ) LOOP
        NULL;
      END LOOP;
      FOR rec IN (
        SELECT * FROM film WHERE length = 130
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 1: %', r, 
      (clock_timestamp() - v_ts); 
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        WITH r AS (
          SELECT * FROM film WHERE length = 1200
        )
        SELECT * FROM r
        UNION ALL
        SELECT * FROM film
        WHERE length = 130
        AND NOT EXISTS (
          SELECT * FROM r
        )
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 2: %', r, 
      (clock_timestamp() - v_ts); 
    RAISE INFO '';
  END LOOP;
END$$;

The result is roughly the same:

INFO:  Run 1, Statement 1: 00:00:00.680222
INFO:  Run 1, Statement 2: 00:00:00.696036

INFO:  Run 2, Statement 1: 00:00:00.673141
INFO:  Run 2, Statement 2: 00:00:00.709034

INFO:  Run 3, Statement 1: 00:00:00.626873
INFO:  Run 3, Statement 2: 00:00:00.679469

INFO:  Run 4, Statement 1: 00:00:00.619584
INFO:  Run 4, Statement 2: 00:00:00.639092

INFO:  Run 5, Statement 1: 00:00:00.616275
INFO:  Run 5, Statement 2: 00:00:00.675317

Again, a slight overhead in the combined, single-query case.

But what’s this? We didn’t even have an index on the LENGTH column. Let’s add one!
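
For example (the index name is my own choice, not from the original text):

CREATE INDEX idx_film_length ON film (length);
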

Now, the result is very different. Query 1 succeeds:

INFO:  Run 1, Statement 1: 00:00:00.055835
INFO:  Run 1, Statement 2: 00:00:00.093982

INFO:  Run 2, Statement 1: 00:00:00.038817
INFO:  Run 2, Statement 2: 00:00:00.084092

INFO:  Run 3, Statement 1: 00:00:00.041911
INFO:  Run 3, Statement 2: 00:00:00.078062

INFO:  Run 4, Statement 1: 00:00:00.039367
INFO:  Run 4, Statement 2: 00:00:00.081752

INFO:  Run 5, Statement 1: 00:00:00.039983
INFO:  Run 5, Statement 2: 00:00:00.081227

Query 1 fails:

INFO:  Run 1, Statement 1: 00:00:00.075469
INFO:  Run 1, Statement 2: 00:00:00.081766

INFO:  Run 2, Statement 1: 00:00:00.058276
INFO:  Run 2, Statement 2: 00:00:00.079613

INFO:  Run 3, Statement 1: 00:00:00.060492
INFO:  Run 3, Statement 2: 00:00:00.080672

INFO:  Run 4, Statement 1: 00:00:00.05877
INFO:  Run 4, Statement 2: 00:00:00.07936

INFO:  Run 5, Statement 1: 00:00:00.057584
INFO:  Run 5, Statement 2: 00:00:00.085798

Oracle

In Oracle, I couldn’t find any difference in execution speed (see below). The plan of a combined query also contains an element that prevents the execution of the second subquery. In this case, I’m using the /*+GATHER_PLAN_STATISTICS*/ hint to make sure we get actual execution values / times in our execution plan:

WITH r AS (
  SELECT * FROM film WHERE length = 120
)
SELECT /*+GATHER_PLAN_STATISTICS*/ * FROM r
UNION ALL
SELECT * FROM film
WHERE length = 130
AND NOT EXISTS (
  SELECT * FROM r
);

SELECT p.*
FROM (
  SELECT *
  FROM v$sql
  WHERE upper(sql_text) LIKE '%LENGTH = 120%'
  ORDER BY last_active_time DESC
  FETCH NEXT 1 ROW ONLY
) s 
CROSS APPLY TABLE(dbms_xplan.display_cursor(
  sql_id => s.sql_id, 
  format => 'ALLSTATS LAST'
)) p;
---------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |
---------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      9 |
|   1 |  UNION-ALL          |      |      1 |        |      9 |
|*  2 |   TABLE ACCESS FULL | FILM |      1 |      7 |      9 |
|*  3 |   FILTER            |      |      1 |        |      0 |
|*  4 |    TABLE ACCESS FULL| FILM |      0 |      7 |      0 |
|*  5 |    TABLE ACCESS FULL| FILM |      1 |      2 |      1 |
---------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("LENGTH"=120)
   3 - filter( IS NULL)
   4 - filter("LENGTH"=130)
   5 - filter("LENGTH"=120)

While the estimates are off just as in PostgreSQL (an error that can propagate, see conclusion), the actual row count for the second subquery is zero, and the second subquery is run zero times (“Starts”), because we don’t really have to access it at all. Excellent. Exactly what we expected!

Here, I’ve finally created a benchmark that anonymises the results properly by normalising them, in order to comply with Oracle’s licence terms, which forbid publishing benchmark results. The fastest execution time is simply 1, and the other execution times are expressed as multiples of that value:

SET SERVEROUTPUT ON
CREATE TABLE results (
  run     NUMBER(2),
  stmt    NUMBER(2),
  elapsed NUMBER
);

DECLARE
  v_ts TIMESTAMP WITH TIME ZONE;
  v_repeat CONSTANT NUMBER := 2000;
BEGIN

  -- Repeat benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := SYSTIMESTAMP;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT * FROM film WHERE length = 120
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
  
    INSERT INTO results VALUES (r, 1, 
      SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    v_ts := SYSTIMESTAMP;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        WITH r AS (
          SELECT * FROM film WHERE length = 120
        )
        SELECT * FROM r
        UNION ALL
        SELECT * FROM film
        WHERE length = 130
        AND NOT EXISTS (
          SELECT * FROM r
        )
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
      
    INSERT INTO results VALUES (r, 2, 
      SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
  END LOOP;
  
  FOR rec IN (
    SELECT 
      run, stmt, 
      CAST(elapsed / MIN(elapsed) OVER() AS NUMBER(5, 4)) ratio 
    FROM results
  )
  LOOP
    dbms_output.put_line('Run ' || rec.run || 
      ', Statement ' || rec.stmt || 
      ' : ' || rec.ratio);
  END LOOP;
END;
/

DROP TABLE results;

The result being (query 1 succeeds, no index):

Run 1, Statement 1 : 1
Run 1, Statement 2 : 1.26901

Run 2, Statement 1 : 1.10218
Run 2, Statement 2 : 1.08792

Run 3, Statement 1 : 1.26038
Run 3, Statement 2 : 1.09426

Run 4, Statement 1 : 1.2245
Run 4, Statement 2 : 1.10829

Run 5, Statement 1 : 1.07164
Run 5, Statement 2 : 1.18562

Or in the inverse case (query 1 fails, no index):

Run 1, Statement 1 : 1
Run 1, Statement 2 : 1.17871

Run 2, Statement 1 : 1.07377
Run 2, Statement 2 : 1.12489

Run 3, Statement 1 : 1.05745
Run 3, Statement 2 : 1.13711

Run 4, Statement 1 : 1.11118
Run 4, Statement 2 : 1.23508

Run 5, Statement 1 : 1.08535
Run 5, Statement 2 : 1.11271

Adding an index doesn’t change much (query 1 succeeds):

Run 1, Statement 1 : 1.20699
Run 1, Statement 2 : 1.28221

Run 2, Statement 1 : 1
Run 2, Statement 2 : 1.21174

Run 3, Statement 1 : 1.0054
Run 3, Statement 2 : 1.2643

Run 4, Statement 1 : 1.0491
Run 4, Statement 2 : 1.31103

Run 5, Statement 1 : 1.02547
Run 5, Statement 2 : 1.23192

Yet, when query 1 fails:

Run 1, Statement 1 : 1.56287
Run 1, Statement 2 : 1.09471

Run 2, Statement 1 : 1.22219
Run 2, Statement 2 : 1.11227

Run 3, Statement 1 : 1.19739
Run 3, Statement 2 : 1.03929

Run 4, Statement 1 : 1.13503
Run 4, Statement 2 : 1

Run 5, Statement 1 : 1.14289
Run 5, Statement 2 : 1.01919

This time, the combined query is a bit faster!

As can be seen, both queries are executed in roughly the same time on Oracle 12c, although again the combined single-query solution seems to be a little bit slower, but not always. Which is an important reminder to do benchmarking properly! Meaning:

  • Repeat benchmarks several times
  • Beware of warmup penalties (the first run is often the slowest)
  • Beware of excessive caching effects in benchmarks
  • Don’t trust performance differences that aren’t significant
  • Don’t compile any Scala code or chat on Slack while benchmarking. Your system should be otherwise idle
  • Remember to benchmark the right data set. We only have 600 films in this table. What would happen with 60 million films?

SQL Server

Same exercise again:

DECLARE @ts DATETIME;
DECLARE @repeat INT = 2000;
DECLARE @r INT;
DECLARE @i INT;
DECLARE @dummy VARCHAR;

DECLARE @s1 CURSOR;
DECLARE @s2 CURSOR;

DECLARE @results TABLE (
  run     INT,
  stmt    INT,
  elapsed DECIMAL
);

SET @r = 0;
WHILE @r < 5
BEGIN
  SET @r = @r + 1

  SET @s1 = CURSOR FOR 
    SELECT title FROM film WHERE length = 120;

  SET @s2 = CURSOR FOR 
    WITH r AS (
      SELECT * FROM film WHERE length = 120
    )
    SELECT title FROM r
    UNION ALL
    SELECT title FROM film
    WHERE length = 130
    AND NOT EXISTS (
      SELECT * FROM r
    );

  SET @ts = current_timestamp;
  SET @i = 0;
  WHILE @i < @repeat
  BEGIN
    SET @i = @i + 1

    OPEN @s1;
    FETCH NEXT FROM @s1 INTO @dummy;
    WHILE @@FETCH_STATUS = 0
    BEGIN
      FETCH NEXT FROM @s1 INTO @dummy;
    END;

    CLOSE @s1;
  END;

  DEALLOCATE @s1;
  INSERT INTO @results VALUES (@r, 1, DATEDIFF(ms, @ts, current_timestamp));

  SET @ts = current_timestamp;
  SET @i = 0;
  WHILE @i < @repeat
  BEGIN
    SET @i = @i + 1

    OPEN @s2;
    FETCH NEXT FROM @s2 INTO @dummy;
    WHILE @@FETCH_STATUS = 0
    BEGIN
      FETCH NEXT FROM @s2 INTO @dummy;
    END;

    CLOSE @s2;
  END;

  DEALLOCATE @s2;
  INSERT INTO @results VALUES (@r, 2, DATEDIFF(ms, @ts, current_timestamp));
END;

SELECT 'Run ' + CAST(run AS VARCHAR) + 
  ', Statement ' + CAST(stmt AS VARCHAR) + 
  ': ' + CAST(CAST(elapsed / MIN(elapsed) OVER() AS DECIMAL(10, 5)) AS VARCHAR)
FROM @results;

The result, this time, is more drastic (no index, query 1 succeeds):

Run 1, Statement 1: 1.07292
Run 1, Statement 2: 1.35000

Run 2, Statement 1: 1.07604
Run 2, Statement 2: 1.40625

Run 3, Statement 1: 1.08333
Run 3, Statement 2: 1.40208

Run 4, Statement 1: 1.09375
Run 4, Statement 2: 1.34375

Run 5, Statement 1: 1.00000
Run 5, Statement 2: 1.46458

There is a 30% – 40% overhead for the CTE solution over the two query solution. If we don’t find any rows in the first query (no index):

Run 1, Statement 1: 1.08256
Run 1, Statement 2: 1.27546

Run 2, Statement 1: 1.16512
Run 2, Statement 2: 1.27778

Run 3, Statement 1: 1.00000
Run 3, Statement 2: 1.26235

Run 4, Statement 1: 1.04167
Run 4, Statement 2: 1.26003

Run 5, Statement 1: 1.05401
Run 5, Statement 2: 1.34259

… then the difference is slightly less drastic, but still clear. The reason is that SQL Server doesn’t avoid running the unnecessary second subquery.

Too bad! (Note that I was using SQL Server 2014. Perhaps this optimisation is implemented in SQL Server 2016.)

Note, you can trust me that adding an index doesn’t change much in this case.

Conclusion

We’ve seen that we can easily solve the original problem with SQL only: Select some data from a table using predicate A, and if we don’t find any data for predicate A, then try finding data using predicate B from the same table.

Oracle and PostgreSQL can both optimise away the unnecessary query 2 by inserting a “probe” in their execution plans that knows whether the query 2 needs to be executed or not. In Oracle, we’ve even seen a situation where the combined query outperforms two individual queries. SQL Server 2014 surprisingly does not have such an optimisation.

While the performance impact was negligible in all benchmarks (even in SQL Server), we should be careful with these kinds of queries and not entirely rely on the optimiser to “get it right”. In all three databases, the cardinality estimates were off. We’re working with small data sets, but if data sets grow larger, and queries like the above are embedded in more complex queries, then the wrong cardinality estimates can easily produce wrong execution plans (e.g. favouring hash join over nested loop joins because of a high number of estimated rows). An example of this was given in a previous blog post.

Nevertheless, we can get quite far with SQL alone, without resorting to procedural client languages. And had I conducted the benchmark with a JDBC client instead of procedural blocks running directly inside of the database, the single combined query might well have outperformed the two-query approach, at least in those cases where query 1 yielded no rows and query 2 would have had to be executed in an extra round trip from a remote client. Probably in Oracle.

Ultimately, I can only repeat myself. Measure! Measure! Measure! There’s no point in guessing. Truth can only be found by measuring actual executions.

The Difficulty of Tuning Queries Over a Database Link – Or How I Learned to Stop Worrying and Love the DUAL@LINK Table

A large-ish customer in banking (largest tables on that particular system: ~1 billion rows) once decided to separate the OLTP database from the “log database” in order to better use resources and prevent contention on some tables, as the append-only log database is used heavily for analytic querying of all sorts. That seems to make perfect sense. Except that sometimes, joins need to be done between “main database” and “log database” tables. This is when things get really hard to tune in Oracle – and probably in other databases too.

In this article, however, I’d like to focus on a much simpler example. One that seems to cause no trouble to the optimiser because all joined tables are from the “log database” only. Let’s use the following setup:

-- This is the database link
CREATE PUBLIC DATABASE LINK LOOPBACK 
CONNECT TO TEST IDENTIFIED BY TEST 
USING 'ORCLCDB';

-- Just making sure we get all statistics in execution plans
ALTER SESSION SET statistics_level = ALL;

And then, create this schema:

CREATE TABLE t (
  a INT NOT NULL,
  b INT NOT NULL,
  CONSTRAINT pk_t PRIMARY KEY (a)
);
CREATE TABLE u (
  a INT NOT NULL,
  b INT NOT NULL,
  CONSTRAINT pk_u PRIMARY KEY (a)
);

INSERT INTO t
SELECT
  level,
  level
FROM dual
CONNECT BY level <= 500000;

INSERT INTO u
SELECT
  level,
  level
FROM dual
CONNECT BY level <= 500000;

CREATE INDEX i_t ON t(b);

ALTER TABLE u ADD CONSTRAINT fk_u FOREIGN KEY (a) REFERENCES t;

EXEC dbms_stats.gather_table_stats('TEST', 'T');
EXEC dbms_stats.gather_table_stats('TEST', 'U');

It’s a really boring emulation of the real schema, and it doesn’t have nearly as many columns / rows. But the essence is:

  • There are (at least) two tables
  • Both have quite a few rows (that’s important here. I’ll show why, later)
  • We’ll use an index for searching rows
  • We’ll join by a one-to-many relationship

There may be other setups to reproduce the same issue, of course.

Now, let’s consider the following query (not using the database link yet).

SELECT CASE WHEN EXISTS (
  SELECT *
  FROM t 
  JOIN u USING (a)
  WHERE t.b BETWEEN 0 AND 1000
) THEN 1 ELSE 0 END
FROM dual

Unfortunately, Oracle doesn’t support boolean types and always requires a FROM clause. Otherwise, we could be writing this more concise version, as in PostgreSQL:

SELECT EXISTS (
  SELECT *
  FROM t 
  JOIN u USING (a)
  WHERE t.b BETWEEN 0 AND 1000
)

We’re checking for the existence of rows in both tables, given a predicate that runs on the previously created index.

As shown in a previous article, it’s much better to use EXISTS rather than COUNT(*), in pretty much all databases. The algorithm is optimal, because the usage of the EXISTS predicate hints to the optimiser that a SEMI JOIN can be used instead of an INNER JOIN:

--------------------------------------------------------------------------
| Operation                            | Name | Starts | E-Rows | A-Rows |
--------------------------------------------------------------------------
| SELECT STATEMENT                     |      |      1 |        |      1 |
|  NESTED LOOPS SEMI                   |      |      1 |      4 |      1 |
|   TABLE ACCESS BY INDEX ROWID BATCHED| T    |      1 |   1000 |      1 |
|    INDEX RANGE SCAN                  | I_T  |      1 |   1000 |      1 |
|   INDEX UNIQUE SCAN                  | PK_U |      1 |    333K|      1 |
|  FAST DUAL                           |      |      1 |      1 |      1 |
--------------------------------------------------------------------------

Some observations:

  • The optimiser chose a NESTED LOOPS SEMI join, so it can stop scanning as soon as the first match is found
  • Although 1000 rows are estimated for the range scan on I_T, only a single row is actually read (A-Rows = 1)
  • The unique index PK_U is probed exactly once, despite the wildly off estimate of 333K rows

So, this is optimal (until I learn a new trick, of course).
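
In case you want to reproduce the actual-statistics plans shown in this article yourself, with statistics_level = ALL set as above, running something along these lines right after each query should display the last plan including the A-Rows column (standard dbms_xplan usage, nothing specific to this article):

SELECT *
FROM TABLE(dbms_xplan.display_cursor(format => 'ALLSTATS LAST'));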

Let’s bring in database links

Assuming that these two tables are on a remote database, we might naively proceed with writing this query:

SELECT CASE WHEN EXISTS (
  SELECT *
  FROM t@loopback 
  JOIN u@loopback USING (a)
  WHERE t.b BETWEEN 0 AND 1000
) THEN 1 ELSE 0 END
FROM dual;

So now, we’re selecting from T@LOOPBACK and U@LOOPBACK, but the rest is exactly the same. For the sake of simplicity, I’m running this reproduction on the same instance, thus “LOOPBACK”. The logical impact is the same, though.

------------------------------------------------------
| Operation        | Name | Starts | E-Rows | A-Rows |
------------------------------------------------------
| SELECT STATEMENT |      |      1 |        |      1 |
|  REMOTE          |      |      1 |        |      1 |
|  FAST DUAL       |      |      1 |      1 |      1 |
------------------------------------------------------

Interesting. Or rather: Not too interesting. Sure, our own database knows the correct estimate: 1 row that comes out of the EXISTS() predicate. But the interesting thing happens at the remote database. Let’s look at the plan there. The query being executed on the remote database is this:

SQL_ID  80fczs4r1c9yd, child number 0
-------------------------------------

SELECT 0 FROM "T" "A2","U" "A1" WHERE "A2"."B">=0 AND "A2"."B"=0

So, the EXISTS() predicate is not propagated to the remote database. Thus, the plan:

Plan hash value: 165433672
 
--------------------------------------------------------------------------
| Operation                            | Name | Starts | E-Rows | A-Rows |
--------------------------------------------------------------------------
| SELECT STATEMENT                     |      |      1 |        |      1 |
|  HASH JOIN                           |      |      1 |   1000 |      1 |
|   TABLE ACCESS BY INDEX ROWID BATCHED| T    |      1 |   1000 |   1000 |
|    INDEX RANGE SCAN                  | I_T  |      1 |   1000 |   1000 |
|   INDEX FAST FULL SCAN               | PK_U |      1 |    500K|      1 |
--------------------------------------------------------------------------

Oops. Observations:

  • We’re now running a hash join (as expected, given the query that the remote database knows of)
  • We’re materialising the expected 1000 rows from the predicate on T.B
  • But we’re still not fetching all the expected 500,000 rows from the U table because the database that calls this query will abort as soon as it finds a single row

Huh. Bummer. So while we’re not running into a major catastrophe (of materialising all the rows from U), this is still far from optimal. The remote database has no knowledge at all of the fact that we’re going to be selecting 0 or 1 rows only, and that it thus should always run a SEMI JOIN.

You can try adding a /*+FIRST_ROWS(1)*/ hint, but that doesn’t work. It won’t make it to the remote database.
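
For completeness, that attempt would look something like this (it is the same query as before, just with the hint, which, as described, does not make it across the database link):

SELECT /*+FIRST_ROWS(1)*/ CASE WHEN EXISTS (
  SELECT *
  FROM t@loopback 
  JOIN u@loopback USING (a)
  WHERE t.b BETWEEN 0 AND 1000
) THEN 1 ELSE 0 END
FROM dual;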

Arcane DUAL@LINK to the rescue

This is when I had an idea. The problem might just be the fact that Oracle always needs a FROM clause, even if it doesn’t make any sense here. So what if we use DUAL@LOOPBACK instead of DUAL? The plain DUAL table, technically, is a table on our own database, so even if it looks as though the entire query could be run on the remote database, that isn’t true: the local DUAL prevents it. If every table in the query, including DUAL, lives on the remote database, the whole query can be shipped there. So let’s try this:

SELECT CASE WHEN EXISTS (
  SELECT *
  FROM t@loopback 
  JOIN u@loopback USING (a)
  WHERE t.b BETWEEN 0 AND 1000
) THEN 1 ELSE 0 END
FROM dual@loopback; -- Subtle difference here!

As I hoped, this subtle change leads to the EXISTS() predicate being sent to the remote database. The query executed on the remote database is now:

SQL_ID  9bz87xw0zc23c, child number 0
-------------------------------------
SELECT CASE  WHEN  EXISTS (SELECT 0 FROM "T" "A3","U" "A2" WHERE 
"A3"."B">=0 AND "A3"."B"<=1000 AND "A3"."A"="A2"."A") THEN 1 ELSE 0 END 
 FROM "DUAL" "A1"

And the plan, now again including the desired SEMI JOIN:

Plan hash value: 1561559448
 
--------------------------------------------------------------------------
| Operation                            | Name | Starts | E-Rows | A-Rows |
--------------------------------------------------------------------------
| SELECT STATEMENT                     |      |      1 |        |      1 |
|  NESTED LOOPS SEMI                   |      |      1 |      4 |      1 |
|   TABLE ACCESS BY INDEX ROWID BATCHED| T    |      1 |   1000 |      1 |
|    INDEX RANGE SCAN                  | I_T  |      1 |   1000 |      1 |
|   INDEX UNIQUE SCAN                  | PK_U |      1 |    333K|      1 |
|  FAST DUAL                           |      |      1 |      1 |      1 |
--------------------------------------------------------------------------

Excellent!

Benchmark time

Plans and estimates are one thing. What ultimately counts for the business is wall clock time. So, let’s try this again using a benchmark:

SET SERVEROUTPUT ON
DECLARE
  v_ts TIMESTAMP WITH TIME ZONE;
  v_repeat CONSTANT NUMBER := 100;
BEGIN

  -- Repeat benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := SYSTIMESTAMP;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT CASE WHEN EXISTS (
          SELECT *
          FROM t 
          JOIN u USING (a)
          WHERE t.b BETWEEN 0 AND 1000
        ) THEN 1 ELSE 0 END
        FROM dual
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
      
    dbms_output.put_line('Run ' || r ||', Statement 1 : ' 
      || (SYSTIMESTAMP - v_ts));
    v_ts := SYSTIMESTAMP;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT CASE WHEN EXISTS (
          SELECT *
          FROM t@loopback 
          JOIN u@loopback USING (a)
          WHERE t.b BETWEEN 0 AND 1000
        ) THEN 1 ELSE 0 END
        FROM dual
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
      
    dbms_output.put_line('Run ' || r ||', Statement 2 : ' 
      || (SYSTIMESTAMP - v_ts));
    v_ts := SYSTIMESTAMP;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT CASE WHEN EXISTS (
          SELECT *
          FROM t@loopback 
          JOIN u@loopback USING (a)
          WHERE t.b BETWEEN 0 AND 1000
        ) THEN 1 ELSE 0 END
        FROM dual@loopback
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
      
    dbms_output.put_line('Run ' || r ||', Statement 3 : ' 
      || (SYSTIMESTAMP - v_ts));
    dbms_output.put_line('');
  END LOOP;
END;
/

And here are the resulting times:

Run 1, Statement 1 : +000000000 00:00:00.008110000
Run 1, Statement 2 : +000000000 00:00:00.213404000
Run 1, Statement 3 : +000000000 00:00:00.043044000

Run 2, Statement 1 : +000000000 00:00:00.003466000
Run 2, Statement 2 : +000000000 00:00:00.198487000
Run 2, Statement 3 : +000000000 00:00:00.042717000

Run 3, Statement 1 : +000000000 00:00:00.003077000
Run 3, Statement 2 : +000000000 00:00:00.191802000
Run 3, Statement 3 : +000000000 00:00:00.048740000

Run 4, Statement 1 : +000000000 00:00:00.005008000
Run 4, Statement 2 : +000000000 00:00:00.192828000
Run 4, Statement 3 : +000000000 00:00:00.043461000

Run 5, Statement 1 : +000000000 00:00:00.002970000
Run 5, Statement 2 : +000000000 00:00:00.190786000
Run 5, Statement 3 : +000000000 00:00:00.043910000

Clearly, not using the database link is always the fastest, roughly by a factor of 10 compared to the DUAL@LOOPBACK solution. But due to the system design, we don’t have this choice. Nonetheless, you can still see that DUAL@LOOPBACK consistently outperforms DUAL by another factor of around 5 as it still prevents the HASH JOIN!

Caveat: Small data != Big data

There, I said it. “Big Data”. Before, we had a predicate that matched 1,000 rows in a 500,000-row table. Our customer had millions of rows. But what happens if you query small data sets? Let’s reduce the predicate to this:

WHERE t.b BETWEEN 0 AND 10

The benchmark result is now completely different:

Run 1, Statement 1 : +000000000 00:00:00.007093000
Run 1, Statement 2 : +000000000 00:00:00.047539000
Run 1, Statement 3 : +000000000 00:00:00.071546000

Run 2, Statement 1 : +000000000 00:00:00.003023000
Run 2, Statement 2 : +000000000 00:00:00.041259000
Run 2, Statement 3 : +000000000 00:00:00.052132000

Run 3, Statement 1 : +000000000 00:00:00.002767000
Run 3, Statement 2 : +000000000 00:00:00.034190000
Run 3, Statement 3 : +000000000 00:00:00.054023000

Run 4, Statement 1 : +000000000 00:00:00.003468000
Run 4, Statement 2 : +000000000 00:00:00.026141000
Run 4, Statement 3 : +000000000 00:00:00.047415000

Run 5, Statement 1 : +000000000 00:00:00.002818000
Run 5, Statement 2 : +000000000 00:00:00.026100000
Run 5, Statement 3 : +000000000 00:00:00.046875000

And as you can see, the DUAL@LOOPBACK solution actually worsens performance for these queries. The reason for this is that we’re now running, again, a NESTED LOOP JOIN (but not SEMI JOIN) rather than a HASH JOIN on the remote database:

Query on remote database:

SQL_ID  7349t2363uc9m, child number 0
-------------------------------------
SELECT 0 FROM "T" "A2","U" "A1" WHERE "A2"."B">=0 AND "A2"."B"=0

Plan on remote database:

Plan hash value: 2558931407
 
--------------------------------------------------------------------------
| Operation                            | Name | Starts | E-Rows | A-Rows |
--------------------------------------------------------------------------
| SELECT STATEMENT                     |      |      1 |        |      1 |
|  NESTED LOOPS                        |      |      1 |     10 |      1 |
|   TABLE ACCESS BY INDEX ROWID BATCHED| T    |      1 |     10 |      1 |
|    INDEX RANGE SCAN                  | I_T  |      1 |     10 |      1 |
|   INDEX UNIQUE SCAN                  | PK_U |      1 |      1 |      1 |
--------------------------------------------------------------------------

I haven’t analysed what the reason for this difference is, as the difference is not significant enough, compared to the improvement for large data sets.

Conclusion

Tuning queries over database links is hard. Much much harder than tuning “ordinary” queries. Ideally, you’ll simply avoid database links and run all queries on a single instance. But sometimes that’s not possible.

In that case, the best solution is to move the logic to the remote query completely and collect only the result. Ideally, this is done using a stored procedure on the remote database and calculating this 1/0 result completely remotely. I think, hipsters these days call this a Microservice, or better, a Lambda:

CREATE FUNCTION t_u RETURN NUMBER IS
  v_result NUMBER;
BEGIN
  SELECT CASE WHEN EXISTS (
    SELECT *
    FROM t 
    JOIN u USING (a)
    WHERE t.b BETWEEN 0 AND 1000
  ) THEN 1 ELSE 0 END
  INTO v_result
  FROM dual;
  
  RETURN v_result;
END t_u;
/
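
The fourth benchmark statement then simply calls this function over the database link, presumably along these lines (the exact call is an assumption, but remote function calls in SQL use this syntax):

SELECT t_u@loopback
FROM dual;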

Comparing the benchmark call with the other options:

T.B BETWEEN 0 AND 10

Statement 1 : +000000000 00:00:00.003022000 -- Local query
Statement 2 : +000000000 00:00:00.027416000 -- Local DUAL
Statement 3 : +000000000 00:00:00.043823000 -- Remote DUAL
Statement 4 : +000000000 00:00:00.022181000 -- Remote stored procedure

T.B BETWEEN 0 AND 1000

Statement 1 : +000000000 00:00:00.002877000 -- Local query
Statement 2 : +000000000 00:00:00.188588000 -- Local DUAL
Statement 3 : +000000000 00:00:00.050163000 -- Remote DUAL
Statement 4 : +000000000 00:00:00.018736000 -- Remote stored procedure

But that, too, might not be possible as you may not have the required rights to create stored procedures on that database. You could call DBMS_SQL on the remote database and run a PL/SQL block dynamically on the remote database (didn’t try that in my benchmark).

Or, you simply use an occasional DUAL@LINK, which might already do the trick with a minimal change to the original query.

How to Calculate Multiple Aggregate Functions in a Single Query

At a customer site, I’ve recently encountered a report where a programmer needed to count quite a bit of stuff from a single table. The counts all differed in the way they used specific predicates. The report looked roughly like this (as always, I’m using the Sakila database for illustration):

-- Total number of films
SELECT count(*)
FROM film

-- Number of films with a given length
SELECT count(*)
FROM film
WHERE length BETWEEN 120 AND 150

-- Number of films with a given language
SELECT count(*)
FROM film
WHERE language_id = 1

-- Number of films for a given rating
SELECT count(*)
FROM film
WHERE rating = 'PG'

And then, unsurprisingly, combinations of these predicates were needed as well, i.e.

-- Number of films with a given length / language_id
SELECT count(*)
FROM film
WHERE length BETWEEN 120 AND 150
AND language_id = 1

-- Number of films with a given length / rating
SELECT count(*)
FROM film
WHERE length BETWEEN 120 AND 150
AND rating = 'PG'

-- Number of films with a given language_id / rating
SELECT count(*)
FROM film
WHERE language_id = 1
AND rating = 'PG'

-- Number of films with a given length / language_id / rating
SELECT count(*)
FROM film
WHERE length BETWEEN 120 AND 150
AND language_id = 1
AND rating = 'PG'

In the end, there were 32 queries in total (or 8 in my example) with all the possible combinations of predicates. Needless to say that running them all took quite a while, because the table had around 200M records and only one predicate could profit from an index.

But in fact, the improvement is really easy. There are several options to calculate all these counts in a single query:

Simplest solution works in all databases: Filtered aggregate functions (or manual pivot)

This solution allows for calculating all results in a single query by using 8 different, explicit, filtered aggregate functions and no GROUP BY clause (none in this example; more complex cases where a GROUP BY persists are still imaginable).

This is how it works on all databases:

SELECT 
  count(*),
  count(length),
  count(language_id),
  count(rating),
  count(length + language_id),
  count(length + rating),
  count(language_id + rating),
  count(length + language_id + rating)
FROM (
  SELECT
    CASE WHEN length BETWEEN 120 AND 150 THEN 1 END length,
    CASE WHEN language_id = 1            THEN 1 END language_id,
    CASE WHEN rating = 'PG'              THEN 1 END rating
  FROM film
) film

Which yields:

col1  col2  col3  col4  col5  col6  col7  col8
1000  224   1000  194   224   43    194   43

How to read the above query?

Instead of evaluating the three different predicates in a WHERE clause, we pre-calculate them in a derived table (a subquery in the FROM clause) and translate each predicate into an arbitrary non-NULL value (e.g. 1) if TRUE, and into NULL if FALSE. Note that I omitted the ELSE clause from the CASE expressions, which means that we get NULL by default. Running the nested select on its own…

SELECT
  CASE WHEN length BETWEEN 120 AND 150 THEN 1 END length,
  CASE WHEN language_id = 1            THEN 1 END language_id,
  CASE WHEN rating = 'PG'              THEN 1 END rating
FROM film

… yields something along the lines of:

length  language_id  rating
---------------------------
NULL    1            1
NULL    1            NULL
NULL    1            NULL
NULL    1            NULL
1       1            NULL
NULL    1            1
NULL    1            NULL
...

(Note, of course, we could have used actual BOOLEAN types, e.g. in PostgreSQL, but that wouldn’t work on all databases)

Now, in the outer query, we’re using COUNT(*) once, which simply counts all the rows regardless of any predicates in the CASE expressions. The other COUNT(expr) aggregate functions do something that surprisingly few people are aware of (yet a lot of people use this form “by accident”): they count only the rows for which the expression is non-NULL. For instance:

SELECT 
  ...
  count(length),
  ...
FROM (
  SELECT
    CASE WHEN length BETWEEN 120 AND 150 THEN 1 END length,
    ...
  FROM film
) film

Or also:

SELECT 
  count(CASE WHEN length BETWEEN 120 AND 150 THEN 1 END)
FROM
  film

These queries will count those films whose length is BETWEEN 120 AND 150 (because those rows produce the value 1, which is non-NULL, and thus counted), whereas all the other films are not being counted.

Finally, I used a little trick to combine the individual predicates: adding the nullable values yields a non-NULL result only if all of them are non-NULL:

SELECT 
  ...
  count(length + language_id),
  ...
FROM (
  SELECT
    CASE WHEN length BETWEEN 120 AND 150 THEN 1 END length,
    CASE WHEN language_id = 1            THEN 1 END language_id,
    ...
  FROM film
) film

This counts those rows whose length BETWEEN 120 AND 150 and whose language_id = 1, because if either predicate was FALSE, the number would be NULL and thus the sum is NULL as well.
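
If you want to see this NULL propagation and the COUNT(expr) behaviour in isolation, here’s a quick throwaway query (PostgreSQL syntax, not part of the original report):

SELECT
  count(*)     AS all_rows,      -- 3: counts every row
  count(a)     AS a_non_null,    -- 2: counts only rows where a is non-NULL
  count(a + b) AS both_non_null  -- 1: a + b is NULL as soon as either operand is NULL
FROM (VALUES (1, 1), (1, NULL), (NULL, NULL)) AS t(a, b);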

PostgreSQL and HSQLDB variant: FILTER

In PostgreSQL and HSQLDB (and in the SQL standard), there’s a special syntax for this. We can use the FILTER clause instead of encoding values in NULL / non-NULL like this:

SELECT 
  count(*),
  count(*) FILTER (WHERE length IS NOT NULL),
  count(*) FILTER (WHERE language_id IS NOT NULL),
  count(*) FILTER (WHERE rating IS NOT NULL),
  count(*) FILTER (WHERE length + language_id IS NOT NULL),
  count(*) FILTER (WHERE length + rating IS NOT NULL),
  count(*) FILTER (WHERE language_id + rating IS NOT NULL),
  count(*) FILTER (
    WHERE length + language_id + rating IS NOT NULL)
FROM (
  SELECT
    CASE WHEN length BETWEEN 120 AND 150 THEN 1 END length,
    CASE WHEN language_id = 1            THEN 1 END language_id,
    CASE WHEN rating = 'PG'              THEN 1 END rating
  FROM film
) film

Or even, writing out the entire predicates again:

SELECT 
  count(*),
  count(*) FILTER (WHERE length BETWEEN 120 AND 150),
  count(*) FILTER (WHERE language_id = 1),
  count(*) FILTER (WHERE rating = 'PG'),
  count(*) FILTER (
    WHERE length BETWEEN 120 AND 150 AND language_id = 1),
  count(*) FILTER (
    WHERE length BETWEEN 120 AND 150 AND rating = 'PG'),
  count(*) FILTER (
    WHERE language_id = 1 AND rating = 'PG'),
  count(*) FILTER (
    WHERE length BETWEEN 120 AND 150 
    AND language_id = 1 AND rating = 'PG')
FROM film

Usually, the FILTER clause is more convenient, but both approaches are equivalent, and we’re running only a single query!

I also call this “manual PIVOT”, because it really works like a PIVOT table. And the good news is… there is a PIVOT syntax!

A more fancy solution: PIVOT

This solution is vendor-specific and only works in Oracle, and with fewer features in SQL Server. Here’s the Oracle version:

SELECT 
  a + b + c + d + e + f + g + h,
                  e + f + g + h,
          c + d         + g + h,
      b     + d     + f     + h,
                          g + h,
                      f     + h,
              d             + h,
                              h
FROM (
  SELECT
    CASE WHEN length BETWEEN 120 AND 150 
         THEN 1 ELSE 0 END length,
    CASE WHEN language_id = 1            
         THEN 1 ELSE 0 END language_id,
    CASE WHEN rating = 'PG'              
         THEN 1 ELSE 0 END rating
  FROM film
) film
PIVOT (
  count(*) FOR (length, language_id, rating) IN (
    (0, 0, 0) AS a,
    (0, 0, 1) AS b,
    (0, 1, 0) AS c,
    (0, 1, 1) AS d,
    (1, 0, 0) AS e,
    (1, 0, 1) AS f,
    (1, 1, 0) AS g,
    (1, 1, 1) AS h
  )
)

How to read this solution? There are 3 steps:

Step 1: The derived table

As in the previous example, we’re translating the desired predicates for our report into three columns that produce values 1 and 0. That’s understood so I won’t repeat the explanation.

Step 2: The PIVOT clause

The PIVOT clause can be applied to a table expression to “pivot” it, in a similar way to what we know from Microsoft Excel’s powerful pivot tables. It takes three parts:

  • A list of aggregate functions
  • An expression (FOR clause)
  • A list of expected values (IN clause)

The resulting table expression groups the PIVOT’s input table by all the remaining columns (i.e. all the columns that are not part of the FOR clause; in our example, there are none), and computes all the aggregate functions (in our case, only one) for all the values in the IN list.

If we SELECT * from this PIVOT table:

SELECT *
FROM (
  SELECT
    CASE WHEN length BETWEEN 120 AND 150 
         THEN 1 ELSE 0 END length,
    CASE WHEN language_id = 1            
         THEN 1 ELSE 0 END language_id,
    CASE WHEN rating = 'PG'              
         THEN 1 ELSE 0 END rating
  FROM film
) film
PIVOT (
  count(*) FOR (length, language_id, rating) IN (
    (0, 0, 0) AS a,
    (0, 0, 1) AS b,
    (0, 1, 0) AS c,
    (0, 1, 1) AS d,
    (1, 0, 0) AS e,
    (1, 0, 1) AS f,
    (1, 1, 0) AS g,
    (1, 1, 1) AS h
  )
)

… we’ll get these values:

a    b    c    d    e    f    g    h
------------------------------------
0    0  625  151    0    0  181   43

As you can see, the column names are generated from the IN list of expected values and the values contained in these columns are aggregations for the different predicates. These aggregations are not exactly the ones we wanted. For instance, column G is all the films whose length BETWEEN 120 AND 150 and whose language_id = 1 and whose RATING != 'PG'.
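
To double-check a single cell, we can run the corresponding predicates directly. For column G, for instance, something like this should reproduce the pivoted value (assuming there are no NULL ratings in the table, since the CASE expression maps those to 0 as well):

SELECT count(*)
FROM film
WHERE length BETWEEN 120 AND 150
AND language_id = 1
AND rating != 'PG';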

Step 3: Summing the count values

So, in order to get the expected results, we have to sum all the partial counts as such:

SELECT 
  a + b + c + d + e + f + g + h,
                  e + f + g + h,
          c + d         + g + h,
      b     + d     + f     + h,
                          g + h,
                      f     + h,
              d             + h,
                              h
FROM 
  ...

The result is now the same.

A more fancy solution: GROUPING SETS

GROUPING SETS are a SQL standard and they’re supported in at least:

  • DB2
  • HANA
  • Oracle
  • PostgreSQL
  • SQL Server
  • Sybase SQL Anywhere

Simply put, GROUPING SETS allow for grouping a table several times and creating a UNION of all the results. For example, the following two queries are the same, conceptually, although the GROUPING SETS one is usually faster:

-- Grouping once by language_id, then by rating
SELECT language_id, rating, count(*)
FROM film
GROUP BY GROUPING SETS (
  (language_id),
  (rating)
)

-- Grouping first by language_id
SELECT language_id, NULL, count(*)
FROM film
GROUP BY language_id
UNION ALL
SELECT NULL, rating, count(*)
FROM film
GROUP BY rating

Both queries yield:

language_id   rating   count
          1             1000 -- First grouping set / union subquery
              G          178 \
              PG         194  |
              PG-13      223  | Second grouping set / union subquery
              R          195  |
              NC-17      210 /

Clearly, the GROUPING SETS variant is more concise. Let’s imagine, we’d like to add more combinations of grouping columns, e.g.

SELECT language_id, rating, count(*)
FROM film
GROUP BY GROUPING SETS (
  (),
  (language_id),
  (rating),
  (language_id, rating)
)

Now, we’re grouping by all the combinations of columns, and the result is:

language_id   rating   count
                        1000 -- First grouping set: ()
          1             1000 -- Second grouping set: (language_id)
              G          178 \
              PG         194  |
              PG-13      223  | Third grouping set: (rating)
              R          195  |
              NC-17      210 /
          1   G          178 \
          1   PG         194  |
          1   PG-13      223  | Fourth grouping set: (language_id, rating)
          1   R          195  |
          1   NC-17      210 /

Of course, this would all be more impressive if we had more than one language in the system…

So, how do we solve the original problem with GROUPING SETS? Here’s how:

SELECT 
  GROUPING_ID (length, language_id, rating),
  length,
  language_id,
  rating,
  count(*)
FROM (
  SELECT
    CASE WHEN length BETWEEN 120 AND 150 
         THEN 1 ELSE 0 END length,
    CASE WHEN language_id = 1            
         THEN 1 ELSE 0 END language_id,
    CASE WHEN rating = 'PG'              
         THEN 1 ELSE 0 END rating
  FROM film
) film
GROUP BY GROUPING SETS (
  (),
  (length),
  (language_id),
  (rating),
  (length, language_id),
  (length, rating),
  (rating, language_id),
  (length, language_id, rating)
)
HAVING COALESCE (length, 1) != 0 
AND COALESCE (language_id, 1) != 0 
AND COALESCE (rating, 1) != 0
ORDER BY GROUPING_ID (length, language_id, rating) DESC

Wow. How to read this? In 4 steps:

Step 1: Again, the derived table

This time, we’ll encode FALSE as 0, not NULL, because NULL already has a different meaning in GROUPING SETS. It means that for a given GROUPING SET, we didn’t group by that column. We’ll see that in step 3.

Step 2: The GROUPING SETS

In this section, we’re just listing all the possible combinations of GROUP BY columns that we want to use, which produces 8 distinct GROUPING SETS. I’ve already explained this in the previous introduction to GROUPING SETS, so this is no different.

Step 3: Filter out unwanted groupings

Just like in the PIVOT example, we’re also getting results for which the predicates are FALSE, but we don’t want those in the result. So we’re filtering them out in the HAVING clause:

SELECT 
  ...
HAVING COALESCE (length, 1) != 0 
AND COALESCE (language_id, 1) != 0 
AND COALESCE (rating, 1) != 0
...

How to read this? E.g. LENGTH can be any of:

  • 1: The length predicate was TRUE
  • 0: The length predicate was FALSE
  • NULL: The length column is not considered for a given GROUPING SET, e.g. () or (rating, language_id)

So, using COALESCE, we’re making sure that we include only 1 and NULL lengths, not 0 lengths.
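
As a tiny illustration of how the COALESCE trick behaves for the three cases (a throwaway check in PostgreSQL syntax, not part of the solution):

SELECT
  COALESCE(1, 1)    != 0 AS keep_when_true,      -- predicate was TRUE: group is kept
  COALESCE(0, 1)    != 0 AS drop_when_false,     -- predicate was FALSE: group is filtered out
  COALESCE(NULL, 1) != 0 AS keep_when_ungrouped; -- column not part of this grouping set: group is kept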

Step 4: Ordering the results

This is optional, but in order to get the same output order as before, we can use the special GROUPING_ID() (or GROUPING() depending on the DB) function which returns an ID for each GROUPING SET. The output is:

grouping   length   language_id   rating   count
------------------------------------------------
       7     NULL          NULL     NULL    1000
       6     NULL          NULL        1     194
       5     NULL             1     NULL    1000
       4     NULL             1        1     194
       3        1          NULL     NULL     224
       2        1          NULL        1      43
       1        1             1     NULL     224
       0        1             1        1      43

Excellent! And hey, there’s even syntax sugar for “special” GROUPING SETS configurations like ours, where we list all the possible column combinations. In this case, we can use CUBE()!

SELECT 
  GROUPING_ID (length, language_id, rating),
  length,
  language_id,
  rating,
  count(*)
FROM (
  SELECT
    CASE WHEN length BETWEEN 120 AND 150 
         THEN 1 ELSE 0 END length,
    CASE WHEN language_id = 1            
         THEN 1 ELSE 0 END language_id,
    CASE WHEN rating = 'PG'              
         THEN 1 ELSE 0 END rating
  FROM film
) film
GROUP BY CUBE (length, language_id, rating)
HAVING COALESCE(length, 1) != 0 
AND COALESCE(language_id, 1) != 0 
AND COALESCE(rating, 1) != 0
ORDER BY GROUPING_ID (length, language_id, rating) DESC

Performance

Such a comparison blog post wouldn’t be complete without a performance benchmark. This time, I’ll be benchmarking only on Oracle, as PostgreSQL doesn’t support PIVOT and SQL Server’s PIVOT is more limited than Oracle’s.

Here’s the complete benchmark:

SET SERVEROUTPUT ON
DECLARE
  v_ts TIMESTAMP WITH TIME ZONE;
  v_repeat CONSTANT NUMBER := 2000;
BEGIN

  -- Repeat the whole benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP

    -- Individual statements
    v_ts := SYSTIMESTAMP;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT count(*) FROM film
      ) LOOP
        NULL;
      END LOOP;

      FOR rec IN (
        SELECT count(*) FROM film 
        WHERE length BETWEEN 120 AND 150
      ) LOOP
        NULL;
      END LOOP;

      FOR rec IN (
        SELECT count(*) FROM film 
        WHERE language_id = 1
      ) LOOP
        NULL;
      END LOOP;

      FOR rec IN (
        SELECT count(*) FROM film 
        WHERE rating = 'PG'
      ) LOOP
        NULL;
      END LOOP;

      FOR rec IN (
        SELECT count(*) FROM film 
        WHERE length BETWEEN 120 AND 150
        AND language_id = 1
      ) LOOP
        NULL;
      END LOOP;

      FOR rec IN (
        SELECT count(*) FROM film 
        WHERE length BETWEEN 120 AND 150
        AND rating = 'PG'
      ) LOOP
        NULL;
      END LOOP;

      FOR rec IN (
        SELECT count(*) FROM film 
        WHERE language_id = 1
        AND rating = 'PG'
      ) LOOP
        NULL;
      END LOOP;

      FOR rec IN (
        SELECT count(*) FROM film 
        WHERE length BETWEEN 120 AND 150
        AND language_id = 1
        AND rating = 'PG'
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
      
    dbms_output.put_line('Run ' || r ||', Statement 1 : ' || (SYSTIMESTAMP - v_ts));

    -- Manual PIVOT
    v_ts := SYSTIMESTAMP;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT
          count(*),
          count(length),
          count(language_id),
          count(rating),
          count(length + language_id),
          count(length + rating),
          count(language_id + rating),
          count(length + language_id + rating)
        FROM (
          SELECT
            CASE WHEN length BETWEEN 120 AND 150 THEN 1 END length,
            CASE WHEN language_id = 1            THEN 1 END language_id,
            CASE WHEN rating = 'PG'              THEN 1 END rating
          FROM film
        ) film
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
      
    dbms_output.put_line('Run ' || r ||', Statement 2 : ' || (SYSTIMESTAMP - v_ts));
    
    -- PIVOT
    v_ts := SYSTIMESTAMP;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT 
          a + b + c + d + e + f + g + h,
                          e + f + g + h,
                  c + d         + g + h,
              b     + d     + f     + h,
                                  g + h,
                              f     + h,
                      d             + h,
                                      h
        FROM (
          SELECT
            CASE WHEN length BETWEEN 120 AND 150 THEN 1 ELSE 0 END length,
            CASE WHEN language_id = 1            THEN 1 ELSE 0 END language_id,
            CASE WHEN rating = 'PG'              THEN 1 ELSE 0 END rating
          FROM film
        ) film
        PIVOT (
          count(*) FOR (length, language_id, rating) IN (
            (0, 0, 0) AS a,
            (0, 0, 1) AS b,
            (0, 1, 0) AS c,
            (0, 1, 1) AS d,
            (1, 0, 0) AS e,
            (1, 0, 1) AS f,
            (1, 1, 0) AS g,
            (1, 1, 1) AS h
          )
        )
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
      
    dbms_output.put_line('Run ' || r ||', Statement 3 : ' || (SYSTIMESTAMP - v_ts));

    -- GROUPING SETS
    v_ts := SYSTIMESTAMP;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT 
          GROUPING_ID (length, language_id, rating),
          length,
          language_id,
          rating,
          count(*)
        FROM (
          SELECT
            CASE WHEN length BETWEEN 120 AND 150 THEN 1 ELSE 0 END length,
            CASE WHEN language_id = 1            THEN 1 ELSE 0 END language_id,
            CASE WHEN rating = 'PG'              THEN 1 ELSE 0 END rating
          FROM film
        ) film
        GROUP BY CUBE (length, language_id, rating)
        HAVING COALESCE (length, 1) != 0 
        AND COALESCE (language_id, 1) != 0 
        AND COALESCE (rating, 1) != 0
        ORDER BY GROUPING_ID (length, language_id, rating) DESC
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
      
    dbms_output.put_line('Run ' || r ||', Statement 4 : ' || (SYSTIMESTAMP - v_ts));
  END LOOP;
END;
/

And the results:

Run 1, Statement 1 : +000000000 00:00:01.928497000
Run 1, Statement 2 : +000000000 00:00:01.136341000
Run 1, Statement 3 : +000000000 00:00:02.751679000
Run 1, Statement 4 : +000000000 00:00:00.797529000

Run 2, Statement 1 : +000000000 00:00:01.695543000
Run 2, Statement 2 : +000000000 00:00:01.004073000
Run 2, Statement 3 : +000000000 00:00:02.490895000
Run 2, Statement 4 : +000000000 00:00:00.838979000

Run 3, Statement 1 : +000000000 00:00:01.634047000
Run 3, Statement 2 : +000000000 00:00:01.016266000
Run 3, Statement 3 : +000000000 00:00:02.566895000
Run 3, Statement 4 : +000000000 00:00:00.790159000

Run 4, Statement 1 : +000000000 00:00:01.669844000
Run 4, Statement 2 : +000000000 00:00:01.015502000
Run 4, Statement 3 : +000000000 00:00:02.574646000
Run 4, Statement 4 : +000000000 00:00:00.807804000

Run 5, Statement 1 : +000000000 00:00:01.653498000
Run 5, Statement 2 : +000000000 00:00:00.980375000
Run 5, Statement 3 : +000000000 00:00:02.556186000
Run 5, Statement 4 : +000000000 00:00:00.890283000

Very disappointingly, the PIVOT solution is the slowest every time. I’m assuming there’s some substantial temporary object overhead which wouldn’t be as severe if the table were much larger, but clearly, the manual PIVOT solution (COUNT(CASE ...)) and the GROUPING SETS solution heavily outperform the initial attempt, where we calculate 8 counts individually.

To get back to the original report where 32 counts were calculated: the report ran roughly 20x as fast with the manual PIVOT approach on 200M rows. And imagine if you also need to JOIN: you definitely want to avoid those 32 individual queries and calculate everything in one go.

Cheers!

SQL IN Predicate: With IN List or With Array? Which is Faster?

Hah! Got nerd-sniped again:

http://stackoverflow.com/questions/43099226/how-to-make-jooq-to-use-arrays-in-the-in-clause/43102102

A jOOQ user was wondering why jOOQ would generate an IN list for a predicate like this:

Java

COLUMN.in(1, 2, 3, 4)

SQL

COLUMN in (?, ?, ?, ?)

… when in fact there could have been the following predicate being generated, instead:

COLUMN = any(?::int[])

In the second case, there would have been only one single bind variable instead of 4, and the SQL generation and parsing work would have been “much” less (maybe not for the IN list of size 4, but let’s imagine a list of 50 values).

A disclaimer

First off, a disclaimer: In databases that have a cursor cache / plan cache (e.g. Oracle or SQL Server), you should be careful with long IN lists, because they will probably trigger a hard parse every time you run them, as by the time you run the exact same predicate (with 371 elements in the list) again, the execution plan will have been purged from the cache. So, you cannot really profit from the cache.

I’m aware of this problem, and it will be topic of another blog post, soon. Let’s stick to PostgreSQL whose “plan cache” isn’t really that sophisticated.

Measure, don’t guess

The question was about improving the speed of parsing a SQL statement. Parsers are really fast, so parsing shouldn’t be a problem. Generating an execution plan certainly does cost more time, but again, since PostgreSQL’s plan cache isn’t very sophisticated, this won’t play into the issue here. So the question is really:

Is an IN list really that bad in PostgreSQL?

Would an array bind variable be much better?

Since our recent post about benchmarking, we now know that we shall never guess, but always measure. I’m using again the Sakila database to run these two queries:

-- IN list
SELECT * 
FROM film 
JOIN film_actor USING (film_id) 
JOIN actor USING (actor_id) 
WHERE film_id IN (?, ?, ?, ?)

-- Array
SELECT * 
FROM film 
JOIN film_actor USING (film_id) 
JOIN actor USING (actor_id) 
WHERE film_id = ANY(?)
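
When experimenting interactively, i.e. without bind variables, the array variant can also be fed an array literal, for instance (PostgreSQL syntax, the values matching the benchmark below):

SELECT * 
FROM film 
JOIN film_actor USING (film_id) 
JOIN actor USING (actor_id) 
WHERE film_id = ANY(ARRAY[1, 2, 4, 8]);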

Let’s try lists of length 4, first. The benchmark is here:

DO $$
DECLARE
  v_ts TIMESTAMP;
  v_repeat CONSTANT INT := 1000;
  rec RECORD;
  v_e1 INT := 1;
  v_e2 INT := 2;
  v_e3 INT := 4;
  v_e4 INT := 8;
  v_any_arr INT[] := ARRAY[v_e1, v_e2, v_e3, v_e4];
BEGIN
  FOR r IN 1..5 LOOP
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT * 
        FROM film 
        JOIN film_actor USING (film_id) 
        JOIN actor USING (actor_id) 
        WHERE film_id IN (v_e1, v_e2, v_e3, v_e4)
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 1: %', 
      r, (clock_timestamp() - v_ts); 
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT * 
        FROM film 
        JOIN film_actor USING (film_id) 
        JOIN actor USING (actor_id) 
        WHERE film_id = ANY(v_any_arr)
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 2: %', 
      r, (clock_timestamp() - v_ts); 
  END LOOP;
END$$;

The result being:

INFO:  Run 1, Statement 1: 00:00:00.112195
INFO:  Run 1, Statement 2: 00:00:00.450461
INFO:  Run 2, Statement 1: 00:00:00.109792
INFO:  Run 2, Statement 2: 00:00:00.446518
INFO:  Run 3, Statement 1: 00:00:00.105413
INFO:  Run 3, Statement 2: 00:00:00.44298
INFO:  Run 4, Statement 1: 00:00:00.108249
INFO:  Run 4, Statement 2: 00:00:00.476527
INFO:  Run 5, Statement 1: 00:00:00.120229
INFO:  Run 5, Statement 2: 00:00:00.448214

Interesting. So, the IN list outperforms the array bind variable every time by a factor of 4 (which is the size of the array / list!) So, let’s try 8 values, then. Here are the values and the adapted query 1:

-- values
  v_e1 INT := 1;
  v_e2 INT := 2;
  v_e3 INT := 4;
  v_e4 INT := 8;
  v_e5 INT := 16;
  v_e6 INT := 32;
  v_e7 INT := 64;
  v_e8 INT := 128;
  v_any_arr INT[] := ARRAY[v_e1, v_e2, v_e3, v_e4, v_e5, v_e6, v_e7, v_e8];

-- adapted query 1 ...
        WHERE film_id IN (v_e1, v_e2, v_e3, v_e4, v_e5, v_e6, v_e7, v_e8)
-- ...

The result is still impressive:

INFO:  Run 1, Statement 1: 00:00:00.182646
INFO:  Run 1, Statement 2: 00:00:00.63624
INFO:  Run 2, Statement 1: 00:00:00.184814
INFO:  Run 2, Statement 2: 00:00:00.685976
INFO:  Run 3, Statement 1: 00:00:00.188108
INFO:  Run 3, Statement 2: 00:00:00.634903
INFO:  Run 4, Statement 1: 00:00:00.184933
INFO:  Run 4, Statement 2: 00:00:00.626616
INFO:  Run 5, Statement 1: 00:00:00.185879
INFO:  Run 5, Statement 2: 00:00:00.636723

The IN list query now takes almost 2x as long (but not quite 2x), whereas the array query now takes around 1.5x as long. It looks as though arrays become the better choice when their size increases. So, let’s do this! With 32 bind variables in the IN list, or 32 array elements respectively:

INFO:  Run 1, Statement 1: 00:00:00.905064
INFO:  Run 1, Statement 2: 00:00:00.752819
INFO:  Run 2, Statement 1: 00:00:00.760475
INFO:  Run 2, Statement 2: 00:00:00.758247
INFO:  Run 3, Statement 1: 00:00:00.777667
INFO:  Run 3, Statement 2: 00:00:00.895875
INFO:  Run 4, Statement 1: 00:00:01.308167
INFO:  Run 4, Statement 2: 00:00:00.789537
INFO:  Run 5, Statement 1: 00:00:00.788606
INFO:  Run 5, Statement 2: 00:00:00.776159

Both are about equally fast. 64 bind values!

INFO:  Run 1, Statement 1: 00:00:00.915069
INFO:  Run 1, Statement 2: 00:00:01.058966
INFO:  Run 2, Statement 1: 00:00:00.951488
INFO:  Run 2, Statement 2: 00:00:00.906285
INFO:  Run 3, Statement 1: 00:00:00.907489
INFO:  Run 3, Statement 2: 00:00:00.892393
INFO:  Run 4, Statement 1: 00:00:00.900424
INFO:  Run 4, Statement 2: 00:00:00.903447
INFO:  Run 5, Statement 1: 00:00:00.961805
INFO:  Run 5, Statement 2: 00:00:00.951697

Still about the same. OK… INTERN! Get over here. I need you to “generate” 128 bind values on this query.

Yep, as expected. Finally, arrays start to outperform IN lists:

INFO:  Run 1, Statement 1: 00:00:01.122866
INFO:  Run 1, Statement 2: 00:00:01.083816
INFO:  Run 2, Statement 1: 00:00:01.416469
INFO:  Run 2, Statement 2: 00:00:01.134882
INFO:  Run 3, Statement 1: 00:00:01.122723
INFO:  Run 3, Statement 2: 00:00:01.087755
INFO:  Run 4, Statement 1: 00:00:01.143148
INFO:  Run 4, Statement 2: 00:00:01.124902
INFO:  Run 5, Statement 1: 00:00:01.236722
INFO:  Run 5, Statement 2: 00:00:01.113741

Using Oracle

Oracle also has array types (although you have to declare them as nominal types first, but that’s not a problem here).
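
I won’t reproduce the entire Oracle benchmark here, but the nominal type and the array-style predicate might look something like this (the type name and the exact shape of the predicate are assumptions for illustration, not necessarily what my benchmark used):

CREATE TYPE numbers_t AS TABLE OF NUMBER;
/

SELECT *
FROM film
JOIN film_actor USING (film_id)
JOIN actor USING (actor_id)
WHERE film_id IN (
  -- TABLE() unnests the collection; COLUMN_VALUE is the implicit column name
  SELECT column_value
  FROM TABLE(numbers_t(1, 2, 4, 8))
);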

Here are some benchmark results (as always, not actual benchmark results, but anonymised units of measurement. I.e. these aren’t seconds but… Larrys):

4 bind values

Run 1, Statement 1 : 01.911000000
Run 1, Statement 2 : 02.852000000
Run 2, Statement 1 : 01.659000000
Run 2, Statement 2 : 02.680000000
Run 3, Statement 1 : 01.628000000
Run 3, Statement 2 : 02.664000000
Run 4, Statement 1 : 01.629000000
Run 4, Statement 2 : 02.657000000
Run 5, Statement 1 : 01.636000000
Run 5, Statement 2 : 02.688000000

128 bind values

Run 1, Statement 1 : 04.010000000
Run 1, Statement 2 : 06.275000000
Run 2, Statement 1 : 03.749000000
Run 2, Statement 2 : 05.440000000
Run 3, Statement 1 : 03.985000000
Run 3, Statement 2 : 05.387000000
Run 4, Statement 1 : 03.807000000
Run 4, Statement 2 : 05.688000000
Run 5, Statement 1 : 03.782000000
Run 5, Statement 2 : 05.803000000

The number of bind values doesn’t really seem to matter. There’s always a constant overhead of using the array bind variable compared to the IN list, but that might as well be a benchmarking error. For instance, when I added the /*+GATHER_PLAN_STATISTICS*/ hint to both queries, interestingly, the one with the array got significantly faster, whereas the IN list one was not affected… Weird?

Conclusion

This article doesn’t go into why there’s such a big difference for small lists when the benefit is only apparent for quite large lists.

But it has once again shown that we must not optimise prematurely in SQL, but measure, measure, measure things. IN lists in dynamic SQL queries can be a big issue in production when they lead to cursor cache / plan cache saturation and a lot of “hard parsing”. So, the benefit of using the array is much more drastic when the content is big, as we can recycle execution plans much more often than with IN lists.

But chances are that IN lists may be faster for single executions.

In any case: choose carefully when following advice that you find somewhere on the Internet, including this advice. I ran the benchmark on PostgreSQL 9.5 and Oracle 11gR2 XE. Neither is the latest database version. Try to measure things again on your side, to be sure that your “improvement” is really an actual improvement! And if in doubt, don’t optimise until you’re sure you actually have a problem.

How to Benchmark Alternative SQL Queries to Find the Fastest Query

Tuning SQL isn’t always easy, and it takes a lot of practice to recognise how any given query can be optimised. One of the most important slides of my SQL training is the one summarising “how to be fast”:

How to be fast with SQL. Find out with the Data Geekery SQL Training

Some of these bullets were already covered on this blog. For instance, avoiding needless, mandatory work: when client code runs queries or parts of queries that aren’t really necessary (e.g. selecting too many columns: “needless”), but the database cannot prove they’re needless, they become “mandatory” for the database to execute.

But as with many other performance related topics, one key message is not to guess, but to measure! Or, in other words, not to optimise prematurely, but to optimise actual problems.

SQL is full of myths

SQL is a 4GL (Fourth-generation programming language) and as such, has always been a cool, convenient way to express data related constraints and queries. But the declarative nature of the language also often meant that programmers are really looking into a crystal ball. A lot of people have blogged about a lot of half-true discoveries that might have been correct in some context and at some point of time (this blog is no exception).

For instance:

  • Are correlated subqueries slower than their LEFT JOIN equivalents?
  • Are derived tables faster than views or common table expressions?
  • Is COUNT(*) faster than COUNT(1)?

Tons of myths!

Measure your queries

To bust a myth, if you have good reasons to think that a differently written, but semantically equivalent query might be faster (on your database), you should measure. Don’t even trust any execution plan, because ultimately, what really counts is the wall clock time in your production system.

If you can measure your queries in production, that’s perfect. But often, you cannot – but you don’t always have to. One way to compare two queries with each other is to benchmark them by executing each query hundreds or even thousands of times in a row.

As any technique, benchmarking has pros and cons. Here is a non-exhaustive list:

Pros

  • Easy to do (see examples below)
  • Easy to reproduce, also on different environments
  • Easy to quickly get an idea in terms of orders of magnitude difference

Cons

  • Not actually measuring productive situations (no one runs the same query thousands of times in a row, without any other queries in parallel)
  • Queries may profit from unrealistic caching due to heavy repetition
  • “Real query” might be dynamic, so the “same query” might really manifest itself in dozens of different productive queries

But if you’re fine with the cons above, the pros might outweigh, for instance, if you want to find out whether a correlated subquery is slower than its LEFT JOIN equivalent for a given query. Note my using italics here, because even if you find out it’s slower for that given query it might be faster for other queries. Never jump to generalised rules before measuring again! (More info and scripts about benchmarks here)

For instance, consider these two equivalent queries that run on the Sakila database. Both versions try to find those actors whose last name starts with the letter A and counts their corresponding films:

LEFT JOIN

SELECT first_name, last_name, count(fa.actor_id) AS c
FROM actor a
LEFT JOIN film_actor fa
ON a.actor_id = fa.actor_id
WHERE last_name LIKE 'A%'
GROUP BY a.actor_id, first_name, last_name
ORDER BY c DESC

Correlated subquery

SELECT first_name, last_name, (
  SELECT count(*)
  FROM film_actor fa
  WHERE a.actor_id = fa.actor_id
) AS c
FROM actor a
WHERE last_name LIKE 'A%' 
ORDER BY c DESC

The result is always the same for both queries.

The queries have different execution plans on PostgreSQL, Oracle, SQL Server as can be seen below:

PostgreSQL LEFT JOIN

(Plan looks “better”)

PostgreSQL correlated subquery

(Plan looks “worse”)

Oracle LEFT JOIN

(Plan looks “more complicated”)

Oracle correlated subquery

(Plan looks “simpler”)

SQL Server LEFT JOIN

(Plan looks “reasonable”)

SQL Server correlated subquery

(Plan looks… geez, where’s my correlated subquery? It’s been transformed to a LEFT JOIN!)

Huh, as you can see, in SQL Server, both queries produce the exact same plan (as they should, because the queries are really equivalent). But not all databases recognise this and/or optimise this. At least, that’s what the estimated plans suggest.
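
If you want to look at those estimated plans yourself, obtaining them is cheap. Here is a minimal sketch for PostgreSQL (assuming the Sakila schema is installed; Oracle offers EXPLAIN PLAN FOR together with DBMS_XPLAN, and SQL Server Management Studio can display an estimated plan for any query):

-- Estimated plan only, the query is not executed
EXPLAIN
SELECT first_name, last_name, count(fa.actor_id) AS c
FROM actor a
LEFT JOIN film_actor fa
ON a.actor_id = fa.actor_id
WHERE last_name LIKE 'A%'
GROUP BY a.actor_id, first_name, last_name
ORDER BY c DESC;

-- Actual plan including run time statistics, the query is executed
EXPLAIN ANALYZE
SELECT first_name, last_name, (
  SELECT count(*)
  FROM film_actor fa
  WHERE a.actor_id = fa.actor_id
) AS c
FROM actor a
WHERE last_name LIKE 'A%'
ORDER BY c DESC;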

Also, don’t jump to the conclusion that if the cost of one plan is lower then it’s a better plan than an alternative. Costs can only really be compared when comparing alternative plans for the same query, e.g. in the Oracle example, we had both HASH JOIN and NESTED LOOP JOIN in a single plan, because Oracle 12c may collect runtime statistics and switch plans in flight thanks to the Oracle 12c Adaptive Query Optimization features.

But let’s ignore all of this and look at actual execution times, instead:

Benchmarking the alternatives

As always, disclaimer: Some commercial databases do not allow for publishing benchmark results without prior written consent. As I never ask for permission, but always ask for forgiveness, I do not have consent, and I’m thus not publishing actual benchmark results.

I have anonymized the benchmark results by introducing hypothetical, non-comparable units of measurement, so you cannot see that PostgreSQL is totally slower than Oracle and/or SQL Server. And you cannot see that SQL Server’s procedural language is totally uglier than PostgreSQL’s and/or Oracle’s.

Legal people.

Solving problems we wouldn’t have without legal people, in the first place

Enough ranting. Some important considerations:

  • Ideally, you’ll run benchmarks directly in the database using a procedural language, rather than e.g. over JDBC, to avoid the network latency incurred by JDBC calls, and other undesired side-effects.
  • Repeat the benchmarks several times to account for warmup effects and other random noise, as your OS / file system may be busy with accidental Scala compilation, or Slack UI refreshes
  • Be sure to actually consume the entire result set of each query in a loop, rather than just executing the query. Some databases may optimise for lazy cursor consumption (and possibly early cursor abortion). It would be unfair not to consume the entire result set

PostgreSQL

DO $$
DECLARE
  v_ts TIMESTAMP;
  v_repeat CONSTANT INT := 10000;
  rec RECORD;
BEGIN

  -- Repeat the whole benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT first_name, last_name, count(fa.actor_id) AS c
        FROM actor a
        LEFT JOIN film_actor fa
        ON a.actor_id = fa.actor_id
        WHERE last_name LIKE 'A%'
        GROUP BY a.actor_id, first_name, last_name
        ORDER BY c DESC
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 1: %', r, (clock_timestamp() - v_ts);
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT first_name, last_name, (
          SELECT count(*)
          FROM film_actor fa
          WHERE a.actor_id = fa.actor_id
        ) AS c
        FROM actor a
        WHERE last_name LIKE 'A%' 
        ORDER BY c DESC
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 2: %', r, (clock_timestamp() - v_ts);
  END LOOP;
END$$;

The result is:

INFO:  Run 1, Statement 1: 00:00:01.708257
INFO:  Run 1, Statement 2: 00:00:01.252012
INFO:  Run 2, Statement 1: 00:00:02.33151  -- Slack message received here
INFO:  Run 2, Statement 2: 00:00:01.064007
INFO:  Run 3, Statement 1: 00:00:01.638518
INFO:  Run 3, Statement 2: 00:00:01.149005
INFO:  Run 4, Statement 1: 00:00:01.670045
INFO:  Run 4, Statement 2: 00:00:01.230755
INFO:  Run 5, Statement 1: 00:00:01.81718
INFO:  Run 5, Statement 2: 00:00:01.166089

As you can see, in all 5 benchmark runs, the version with the correlated subquery outperformed the version with the LEFT JOIN in this case: ignoring the Slack-interrupted run 2, the LEFT JOIN version took roughly 1.4 – 1.6 times as long. As this is PostgreSQL and open source, benchmark results are in actual seconds for 10000 query executions. Neat. Let’s move on to…

Oracle

SET SERVEROUTPUT ON
DECLARE
  v_ts TIMESTAMP WITH TIME ZONE;
  v_repeat CONSTANT NUMBER := 10000;
BEGIN

  -- Repeat the whole benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := SYSTIMESTAMP;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT first_name, last_name, count(fa.actor_id) AS c
        FROM actor a
        LEFT JOIN film_actor fa
        ON a.actor_id = fa.actor_id
        WHERE last_name LIKE 'A%'
        GROUP BY a.actor_id, first_name, last_name
        ORDER BY c DESC
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
      
    dbms_output.put_line('Run ' || r || ', Statement 1 : ' || (SYSTIMESTAMP - v_ts));
    v_ts := SYSTIMESTAMP;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT first_name, last_name, (
          SELECT count(*)
          FROM film_actor fa
          WHERE a.actor_id = fa.actor_id
        ) AS c
        FROM actor a
        WHERE last_name LIKE 'A%' 
        ORDER BY c DESC
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
      
    dbms_output.put_line('Run ' || r || ', Statement 2 : ' || (SYSTIMESTAMP - v_ts));
  END LOOP;
END;
/

Gee, check out the difference now (and remember, these are totally not seconds, but a hypothetical unit of measurement, let’s call them Newtons. Or Larrys. Let’s call them Larrys (great idea, Axel)):

Run 1, Statement 1 : 07.721731000
Run 1, Statement 2 : 00.622992000
Run 2, Statement 1 : 08.077535000
Run 2, Statement 2 : 00.666481000
Run 3, Statement 1 : 07.756182000
Run 3, Statement 2 : 00.640541000
Run 4, Statement 1 : 07.495021000
Run 4, Statement 2 : 00.731321000
Run 5, Statement 1 : 07.809564000
Run 5, Statement 2 : 00.632615000

Wow, the correlated subquery totally outperformed the LEFT JOIN query by an order of magnitude. This is totally insane. Now, check out…

SQL Server

… beautiful procedural language in SQL Server: Transact-SQL. With nice features like:

  • Needing to cast INT values to VARCHAR when concatenating them (but see the aside after this list)
  • No indexed FOR loop, only WHILE loops
  • No implicit cursor loops (instead: DEALLOCATE!)
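
As a hedged aside: on SQL Server 2012 and later, CONCAT() performs these conversions implicitly, so the casting noise could be avoided, roughly like this (using the @r and @ts variables declared in the benchmark below):

-- CONCAT() implicitly converts its non-string arguments (SQL Server 2012+)
PRINT CONCAT('Run ', @r, ', Statement 1: ', DATEDIFF(ms, @ts, current_timestamp), 'ms');

The benchmark below sticks to CAST, which works on older versions, too.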

Oh well. It’s just for a benchmark. So here goes:

DECLARE @ts DATETIME;
DECLARE @repeat INT = 10000;
DECLARE @r INT;
DECLARE @i INT;
DECLARE @dummy1 VARCHAR;
DECLARE @dummy2 VARCHAR;
DECLARE @dummy3 INT;

DECLARE @s1 CURSOR;
DECLARE @s2 CURSOR;

SET @r = 0;
WHILE @r < 5
BEGIN
  SET @r = @r + 1

  SET @s1 = CURSOR FOR 
    SELECT first_name, last_name, count(fa.actor_id) AS c
    FROM actor a
    LEFT JOIN film_actor fa
    ON a.actor_id = fa.actor_id
    WHERE last_name LIKE 'A%'
    GROUP BY a.actor_id, first_name, last_name
    ORDER BY c DESC

  SET @s2 = CURSOR FOR 
    SELECT first_name, last_name, (
      SELECT count(*)
      FROM film_actor fa
      WHERE a.actor_id = fa.actor_id
    ) AS c
    FROM actor a
    WHERE last_name LIKE 'A%' 
    ORDER BY c DESC

  SET @ts = current_timestamp;
  SET @i = 0;
  WHILE @i < @repeat
  BEGIN
    SET @i = @i + 1

    OPEN @s1;
    FETCH NEXT FROM @s1 INTO @dummy1, @dummy2, @dummy3;
    WHILE @@FETCH_STATUS = 0
    BEGIN
      FETCH NEXT FROM @s1 INTO @dummy1, @dummy2, @dummy3;
    END;

    CLOSE @s1;
  END;

  DEALLOCATE @s1;
  PRINT 'Run ' + CAST(@r AS VARCHAR) + ', Statement 1: ' + CAST(DATEDIFF(ms, @ts, current_timestamp) AS VARCHAR) + 'ms';

  SET @ts = current_timestamp;
  SET @i = 0;
  WHILE @i < @repeat
  BEGIN
    SET @i = @i + 1

    OPEN @s2;
    FETCH NEXT FROM @s2 INTO @dummy1, @dummy2, @dummy3;
    WHILE @@FETCH_STATUS = 0
    BEGIN
      FETCH NEXT FROM @s2 INTO @dummy1, @dummy2, @dummy3;
    END;

    CLOSE @s2;
  END;

  DEALLOCATE @s2;
  PRINT 'Run ' + CAST(@r AS VARCHAR) + ', Statement 2: ' + CAST(DATEDIFF(ms, @ts, current_timestamp) AS VARCHAR) + 'ms';
END;

And again, remember, these aren’t seconds. Really. They’re … Kilowatts. Yeah, let’s settle with kilowatts.

Run 1, Statement 1:  2626
Run 1, Statement 2: 20340
Run 2, Statement 1:  2450
Run 2, Statement 2: 17910
Run 3, Statement 1:  2706
Run 3, Statement 2: 18396
Run 4, Statement 1:  2696
Run 4, Statement 2: 19103
Run 5, Statement 1:  2716
Run 5, Statement 2: 20453

Oh my… Wait a second. Now suddenly, the correlated subquery is roughly a factor of 7… more energy consuming (remember: kilowatts). Who would have thought?

Conclusion

This article won’t explain the differences in execution time between the different databases. There are a lot of reasons why a given execution plan will outperform another. There are also a lot of reasons why the same plan (or at least what looks like the same plan) really isn’t the same, because a plan is only a description of an algorithm, and each plan operation can still contain other operations that might differ.

In summary, we can say that in this case (I can’t stress this enough: this isn’t a general rule, it only explains what happens in this case. Don’t create the next SQL myth!), the correlated subquery and the LEFT JOIN performed in the same order of magnitude on PostgreSQL (the subquery being a bit faster), the correlated subquery drastically outperformed the LEFT JOIN in Oracle, whereas the LEFT JOIN drastically outperformed the correlated subquery in SQL Server (despite the estimated plans having been the same!)

This means:

  • Don’t trust your initial judgment
  • Don’t trust any historic blog posts saying A) is faster than B)
  • Don’t trust execution plans
  • Don’t trust this blog post here, because it is using non-comparable units of measurement (seconds vs. Larrys vs. kilowatts)
  • Don’t fully trust your own benchmarks, because you’re not measuring things as they happen in production

And sadly:

  • Even for such a simple query, there’s no optimal query for all databases

(and I haven’t even included MySQL in the benchmarks)

BUT

by measuring two alternative, equivalent queries, you may just get an idea of what might perform better on your system, in case you do have a slow query somewhere. Perhaps this helps.

And now that you’re all hot on the subject, go book our 2 day SQL training, where we have tons of other interesting, myth busting content!