How to Avoid Excessive Sorts in Window Functions

Usually, this blog is 100% pro window functions and advocates using them on any occasion. But like any tool, window functions come at a price, and we must carefully evaluate whether that’s a price we’re willing to pay. That price can be a sort operation. And as we all know, sort operations are expensive: they have O(n log n) complexity, which we want to avoid wherever possible on large data sets.

In a previous post, I’ve described how to calculate a running total with window functions (among other ways). In this post, we’re going to calculate the cumulative revenue at each payment in our Sakila database.

SELECT
  customer_id,
  payment_date,
  amount,
  SUM(amount) OVER (
    PARTITION BY customer_id
    ORDER BY payment_date, payment_id
  ) cumulative_amount
FROM payment
ORDER BY customer_id, payment_date, payment_id;

The above will yield something like this:

customer_id |payment_date        |amount |cumulative_amount 
------------|--------------------|-------|------------------
1           |2005-05-25 11:30:37 |2.99   |2.99              
1           |2005-05-28 10:35:23 |0.99   |3.98              
1           |2005-06-15 00:54:12 |5.99   |9.97              
1           |2005-06-15 18:02:53 |0.99   |10.96             
1           |2005-06-15 21:08:46 |9.99   |20.95             
1           |2005-06-16 15:18:57 |4.99   |25.94             
...

As can be seen, in spreadsheet notation, cumulative_amount[N] = cumulative_amount[N-1] + amount[N] (within each customer).
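The recurrence can be checked outside the database. The following is only a minimal Python sketch, not part of the original queries: it copies the six amounts of customer 1 from the result above and accumulates them with the standard library.

```python
from decimal import Decimal
from itertools import accumulate

# Amounts of customer 1's first six payments, copied from the result above.
amounts = [Decimal(a) for a in ("2.99", "0.99", "5.99", "0.99", "9.99", "4.99")]

# accumulate() implements cumulative[n] = cumulative[n-1] + amount[n].
cumulative = list(accumulate(amounts))

for amount, total in zip(amounts, cumulative):
    print(f"{amount:>6} | {total:>6}")
```

Decimal is used instead of float so that the sums match the NUMERIC arithmetic of the database exactly.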

Reusing this calculation in several queries

As in any other language, we don’t want to repeat ourselves, so the SQL way of staying DRY is to create a view or a table-valued function. Let’s create a view first. Something like this:

CREATE VIEW payment_with_revenue AS
SELECT
  customer_id,
  payment_date,
  amount,
  SUM(amount) OVER (
    PARTITION BY customer_id
    ORDER BY payment_date, payment_id
  ) cumulative_amount
FROM payment;

Now, we can do nice things like this:

SELECT 
  customer_id,
  payment_date,
  amount,
  cumulative_amount
FROM payment_with_revenue
WHERE customer_id IN (1, 2, 3)
AND payment_date 
  BETWEEN DATE '2005-05-25'
  AND     DATE '2005-05-29'
ORDER BY customer_id, payment_date;

yielding:

customer_id |payment_date        |amount |cumulative_amount 
------------|--------------------|-------|------------------
1           |2005-05-25 11:30:37 |2.99   |2.99              
1           |2005-05-28 10:35:23 |0.99   |3.98              
2           |2005-05-27 00:09:24 |4.99   |4.99              
3           |2005-05-27 17:17:09 |1.99   |1.99              

What about performance?

Now, if we have an index on (CUSTOMER_ID, PAYMENT_DATE), we’d expect to be able to use it, right? After all, our predicates are very selective, as the following counts show:

SELECT 
  count(*),
  count(*) FILTER (
    WHERE customer_id IN (1, 2, 3)
  ),
  count(*) FILTER (
    WHERE customer_id IN (1, 2, 3)
    AND payment_date < DATE '2005-05-29'
  ) 
FROM payment;

yielding:

count |count |count 
------|------|-----
16049 |85    |4     

(To learn more about the cool FILTER clause, read this article here)

How could we best use the index? Let’s look again at our original query, but this time, with an inlined view (“inlined”):

SELECT 
  customer_id,
  payment_date,
  amount,
  cumulative_amount
FROM (
  SELECT
    customer_id,
    payment_date,
    amount,
    SUM(amount) OVER (
      PARTITION BY customer_id
      ORDER BY payment_date, payment_id
    ) cumulative_amount
  FROM payment
) inlined
WHERE customer_id IN (1, 2, 3)
AND payment_date 
  BETWEEN DATE '2005-05-25'
  AND     DATE '2005-05-29'
ORDER BY customer_id, payment_date;

We should be able to apply two transformations that help the query benefit from the index:

CUSTOMER_ID IN (1, 2, 3) predicate

The CUSTOMER_ID IN (1, 2, 3) predicate should be pushed down into the view, “past” the window function, because it does not affect the window function calculation, which partitions the data set by CUSTOMER_ID: removing entire partitions does not change the values calculated in the remaining ones. By “past” the window function, I mean that window functions are calculated late in the logical order of SELECT clauses (after WHERE), so a predicate in the outer query is normally applied only to their result.

This means that our original query should be equivalent to this one:

SELECT 
  customer_id,
  payment_date,
  amount,
  cumulative_amount
FROM (
  SELECT
    customer_id,
    payment_date,
    amount,
    SUM(amount) OVER (
      PARTITION BY customer_id
      ORDER BY payment_date, payment_id
    ) cumulative_amount
  FROM payment
  WHERE customer_id IN (1, 2, 3) -- Pushed down
) inlined
WHERE payment_date 
  BETWEEN DATE '2005-05-25'
  AND     DATE '2005-05-29'
ORDER BY customer_id, payment_date;
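The equivalence can be checked on a toy data set. The following is only a sketch under stated assumptions: it uses Python’s built-in sqlite3 module (with an SQLite version that supports window functions, 3.25 or later) rather than any of the databases discussed here, and a hypothetical six-row payment table with amounts stored in cents.

```python
import sqlite3

# Hypothetical miniature of the payment table; amounts are in cents.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payment ("
    " payment_id INTEGER PRIMARY KEY, customer_id INTEGER,"
    " payment_date TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO payment VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2005-05-25", 299),
        (2, 1, "2005-05-28", 99),
        (3, 1, "2005-06-15", 599),
        (4, 2, "2005-05-27", 499),
        (5, 3, "2005-05-27", 199),
        (6, 4, "2005-05-26", 899),
    ],
)

# The same derived table as in the article, with two slots for predicates:
# {inner} is applied before the window function, {outer} after it.
QUERY = """
SELECT customer_id, payment_date, amount, cumulative_amount
FROM (
  SELECT customer_id, payment_date, amount,
         SUM(amount) OVER (
           PARTITION BY customer_id
           ORDER BY payment_date, payment_id
         ) AS cumulative_amount
  FROM payment
  WHERE {inner}
) inlined
WHERE {outer}
ORDER BY customer_id, payment_date
"""

CUSTOMERS = "customer_id IN (1, 2, 3)"
DATES = "payment_date BETWEEN '2005-05-25' AND '2005-05-29'"

# Both predicates outside the derived table (the original query).
original = conn.execute(
    QUERY.format(inner="1 = 1", outer=f"{CUSTOMERS} AND {DATES}")
).fetchall()

# CUSTOMER_ID predicate pushed down past the window function.
pushed_down = conn.execute(
    QUERY.format(inner=CUSTOMERS, outer=DATES)
).fetchall()

print(original == pushed_down)
```

Both variants return the same rows because the predicate only removes whole partitions, never individual rows inside a partition.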

The PAYMENT_DATE predicate

The PAYMENT_DATE predicate is a bit more tricky. It cannot be pushed “past” the window function completely, because that would alter the semantics of the window function, which calculates the cumulative amount in the RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW range (which is the default, if we do not specify it).

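But intuitively (and if you want to spend the time: formally as well), we can show that we can at least push the upper bound of our range predicate into the view, because rows after the upper bound never contribute to the cumulative amount of any row at or before it. The lower bound must stay outside, as earlier payments do contribute to the sums we want to display. The result looks like this: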

SELECT 
  customer_id,
  payment_date,
  amount,
  cumulative_amount
FROM (
  SELECT
    customer_id,
    payment_date,
    amount,
    SUM(amount) OVER (
      PARTITION BY customer_id
      ORDER BY payment_date, payment_id
    ) cumulative_amount
  FROM payment
  WHERE customer_id IN (1, 2, 3)
  AND payment_date <= DATE '2005-05-29' -- Pushed down
) inlined
WHERE payment_date >= DATE '2005-05-25'
ORDER BY customer_id, payment_date;
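To see why only the upper bound may move, here is a minimal sketch, again using Python’s built-in sqlite3 module (SQLite 3.25 or later assumed) and a hypothetical four-row payment table for a single customer, with amounts in cents. The payment on 2005-05-20 is invented for illustration: it lies before the date range, so it must still count towards the cumulative amounts inside the range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payment ("
    " payment_id INTEGER PRIMARY KEY, customer_id INTEGER,"
    " payment_date TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO payment VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2005-05-20", 100),  # before the range (hypothetical row)
        (2, 1, "2005-05-25", 299),
        (3, 1, "2005-05-28", 99),
        (4, 1, "2005-06-15", 599),  # after the range
    ],
)


def run(inner_where, outer_where):
    """Apply inner_where before and outer_where after the window function."""
    sql = f"""
        SELECT payment_date, amount, cumulative_amount
        FROM (
          SELECT payment_date, amount,
                 SUM(amount) OVER (
                   PARTITION BY customer_id
                   ORDER BY payment_date, payment_id
                 ) AS cumulative_amount
          FROM payment
          WHERE {inner_where}
        ) inlined
        WHERE {outer_where}
        ORDER BY payment_date
    """
    return conn.execute(sql).fetchall()


# Reference: the whole range predicate stays outside the derived table.
reference = run("1 = 1", "payment_date BETWEEN '2005-05-25' AND '2005-05-29'")

# Only the upper bound is pushed down: same result.
upper_pushed = run("payment_date <= '2005-05-29'", "payment_date >= '2005-05-25'")

# Both bounds pushed down: the payment of 2005-05-20 is lost from the sums.
both_pushed = run("payment_date BETWEEN '2005-05-25' AND '2005-05-29'", "1 = 1")

print(reference == upper_pushed)  # True
print(reference == both_pushed)   # False
```

Pushing the lower bound as well changes the cumulative amounts from 399 and 498 to 299 and 398, which is exactly the semantic change that prevents a complete push down.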

And now, we can profit from the index very easily! But is this transformation being done by any database? Unfortunately not. Some databases manage to push down the “more obvious” CUSTOMER_ID predicate past the window function, but none can do the same with the “less obvious” range predicate on PAYMENT_DATE:

DB2 LUW 10.5

The CUSTOMER_ID predicate is pushed down into the view, which generates an index scan (operation 8) on the pre-existing foreign key index (which doesn’t contain the PAYMENT_DATE column), but the PAYMENT_DATE predicate itself is only applied much later using an in-memory filter (operation 2):

Explain Plan                                                       
-------------------------------------------------------------------
ID | Operation                       |                  Rows | Cost
 1 | RETURN                          |                       |   40
 2 |  FILTER                         |     4 of 80 (  5.00%) |   40
 3 |   TBSCAN                        |    80 of 80 (100.00%) |   40
 4 |    SORT                         |    80 of 80 (100.00%) |   40
 5 |     NLJOIN                      |               80 of 3 |   40
 6 |      TBSCAN GENROW              |      3 of 3 (100.00%) |    0
 7 |      FETCH PAYMENT              |    27 of 27 (100.00%) |   13
 8 |       IXSCAN IDX_FK_CUSTOMER_ID | 27 of 16049 (   .17%) |    6
                                                                   
Predicate Information                                              
 2 - RESID (Q5.PAYMENT_DATE <= '2005-05-29')                       
     RESID ('2005-05-25' <= Q5.PAYMENT_DATE)                       
 5 - JOIN (Q3.CUSTOMER_ID = Q2.$C0)                                
 8 - START (Q3.CUSTOMER_ID = Q2.$C0)                               
      STOP (Q3.CUSTOMER_ID = Q2.$C0)                               

Conversely, see the plan of the manually optimised query:

Explain Plan                                                  
--------------------------------------------------------------
ID | Operation                   |                 Rows | Cost
 1 | RETURN                      |                      |   40
 2 |  FILTER                     |     4 of 4 (100.00%) |   40
 3 |   TBSCAN                    |     4 of 4 (100.00%) |   40
 4 |    SORT                     |     4 of 4 (100.00%) |   40
 5 |     NLJOIN                  |               4 of 1 |   40
 6 |      TBSCAN GENROW          |     3 of 3 (100.00%) |    0
 7 |      FETCH PAYMENT          |     1 of 1 (100.00%) |   13
 8 |       IXSCAN IDX_PAYMENT_I1 | 1 of 16049 (   .01%) |    6
                                                              
Predicate Information                                         
 2 - RESID ('2005-05-25' <= Q5.PAYMENT_DATE)                  
 5 - JOIN (Q3.CUSTOMER_ID = Q2.$C0)                           
 8 - START (Q3.CUSTOMER_ID = Q2.$C0)                          
      STOP (Q3.CUSTOMER_ID = Q2.$C0)                          
      STOP (Q3.PAYMENT_DATE <= '2005-05-29')                  

This is certainly a better plan.

MySQL 8.0.2

MySQL, very regrettably, doesn’t seem to show any effort at all in optimising this. We’re accessing the entire payment table to get this result.

id   table        type  rows    filtered    Extra
-----------------------------------------------------------------------
1    <derived2>   ALL   16086    3.33       Using where
2    payment      ALL   16086  100.00       Using filesort

Here’s the manually optimised plan:

id   table        type  key             rows  filtered    Extra
-------------------------------------------------------------------------------
1    <derived2>   ALL                   4     3.33        Using where
2    payment      range idx_payment_i1  4      100.00     Using index condition

Oracle 12.2.0.1

Oracle also cannot do this beyond pushing the more obvious CUSTOMER_ID predicate into the view:

-------------------------------------------------------------------------------
| Id  | Operation                              | Name                 | Rows  |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                      |       |
|*  1 |  VIEW                                  | PAYMENT_WITH_REVENUE |    80 |
|   2 |   WINDOW SORT                          |                      |    80 |
|   3 |    INLIST ITERATOR                     |                      |       |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| PAYMENT              |    80 |
|*  5 |      INDEX RANGE SCAN                  | IDX_FK_CUSTOMER_ID   |    80 |
-------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(("PAYMENT_DATE">=TO_DATE('2005-05-25 00:00:00') AND 
              "PAYMENT_DATE"<=TO_DATE('2005-05-29 00:00:00')))
   5 - access(("CUSTOMER_ID"=1 OR "CUSTOMER_ID"=2 OR "CUSTOMER_ID"=3))

The manually optimised plan looks better:

-------------------------------------------------------------------------
| Id  | Operation                              | Name           | Rows  |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                |       |
|*  1 |  VIEW                                  |                |     1 |
|   2 |   WINDOW SORT                          |                |     1 |
|   3 |    INLIST ITERATOR                     |                |       |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| PAYMENT        |     1 |
|*  5 |      INDEX RANGE SCAN                  | IDX_PAYMENT_I1 |     1 |
-------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("PAYMENT_DATE">=TO_DATE('2005-05-25 00:00:00'))
   5 - access(("CUSTOMER_ID" IN (1, 2, 3)) AND 
              "PAYMENT_DATE"<=TO_DATE('2005-05-29 00:00:00'))

Much better cardinality estimates!

PostgreSQL 10

PostgreSQL’s version of the Sakila database uses a partitioned payment table, but that should be irrelevant for this analysis. The CUSTOMER_ID predicate could be pushed down…

QUERY PLAN                                                                                          
---------------------------------------------------------------------------------------------------
Subquery Scan on payment_with_revenue  (cost=117.06..124.45 rows=8 width=52)                       
  Filter: ((payment_date >= '2005-05-25') AND (payment_date <= '2005-05-29'))
-> WindowAgg  (cost=117.06..121.49 rows=197 width=56)                                               
   -> Sort  (cost=117.06..117.55 rows=197 width=24)                                              
      Sort Key: payment.customer_id, payment.payment_date, payment.payment_id                  
      -> Result  (cost=0.29..109.55 rows=197 width=24)                                        
         -> Append  (cost=0.29..107.58 rows=197 width=24)                                  
            -> Index Scan using idx_fk.. on payment  (cost=0.29..18.21 rows=77 width=20)
               Index Cond: (customer_id = ANY ('{1,2,3}'::integer[]))
               -> Bitmap Heap Scan on payment_p2007_01  (cost=4.62..14.90 rows=20 width=26)
                  Recheck Cond: (customer_id = ANY ('{1,2,3}'::integer[]))               
                  -> Bitmap Index Scan on idx_fk.. (cost=0.00..4.61 rows=20 width=0)
                     Index Cond: (customer_id = ANY ('{1,2,3}'::integer[]))           
                  -> Bitmap Heap Scan on payment_p2007_02  (cost=4.62..14.90 rows=20 width=26)
                     Recheck Cond: (customer_id = ANY ('{1,2,3}'::integer[]))               
                  -> Bitmap Index Scan on idx_fk.. (cost=0.00..4.61 rows=20 width=0)
                     Index Cond: (customer_id = ANY ('{1,2,3}'::integer[]))           
              ...

But manual optimisation is required to get better behaviour for the date range:

QUERY PLAN                                                                                           
-----------------------------------------------------------------------------------------------------
Subquery Scan on inlined  (cost=18.46..18.56 rows=3 width=48)                                        
  Filter: (inlined.payment_date >= '2005-05-25'::date)                    
-> WindowAgg  (cost=18.46..18.52 rows=3 width=52)                                                 
   -> Sort  (cost=18.46..18.46 rows=3 width=20)                                                
      Sort Key: payment.customer_id, payment.payment_date, payment.payment_id                
      -> Result  (cost=0.29..18.43 rows=3 width=20)                                         
         -> Append  (cost=0.29..18.40 rows=3 width=20)                                   
            -> Index Scan using idx_fk.. on payment  (cost=0.29..18.40 rows=3 width=20)
                Index Cond: (customer_id = ANY ('{1,2,3}'::integer[]))
                Filter: (payment_date <= '2005-05-29'::date)

Interestingly, the index still isn’t used optimally on both columns, which has nothing to do with the current discussion on window functions. PostgreSQL seems to be unable to think of the IN predicate as an equality predicate. See also this article about other optimisations (such as predicate merging) that are not possible (yet) in PostgreSQL.

But still, this is much better, as it brings down the estimated cardinalities (useful in case this query is a subquery in a more sophisticated context) and, more importantly, it filters out many rows prior to calculating the window function.

SQL Server 2014

Another database that cannot push down this predicate past the window function optimally. Only the “obvious” part is pushed down:

|--Sort(ORDER BY:([payment_date] ASC))
   |--Filter(WHERE:([payment_date]>='2005-05-25' AND [payment_date]<='2005-05-29'))
      |--Compute Scalar(DEFINE:([Expr1003]=CASE WHEN [Expr1004]=(0) THEN NULL ELSE [Expr1005] END))
         |--Stream Aggregate(GROUP BY:([WindowCount1009]) DEFINE:(..))
            |--Window Spool(RANGE BETWEEN:(UNBOUNDED, [[payment_date], [payment_id]]))
               |--Segment
                  |--Segment
                     |--Sort(ORDER BY:([customer_id] ASC, [payment_date] ASC, [payment_id] ASC))
                        |--Table Scan(OBJECT:([payment]), WHERE:([customer_id] IN (1, 2, 3)))

Interestingly, this doesn’t even use the index at all, but at least the data is filtered out prior to the calculation that relies on sorting. With the manual optimisation, again the same, much better effect:

|--Filter(WHERE:([payment_date]>='2005-05-25'))
   |--Compute Scalar(DEFINE:([Expr1003]=CASE WHEN [Expr1004]=(0) THEN NULL ELSE [Expr1005] END))
      |--Stream Aggregate(GROUP BY:([WindowCount1011]) DEFINE:(..))
         |--Window Spool(RANGE BETWEEN:(UNBOUNDED, [[payment_date], [payment_id]]))
            |--Segment
               |--Segment
                  |--Sort(ORDER BY:([payment_date] ASC, [payment_id] ASC))
                     |--Nested Loops(Inner Join, OUTER REFERENCES:([Bmk1000]))
                        |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1007], [Expr1008], [Expr1006]))
                        |  |--Compute Scalar(DEFINE:(([Expr1007],[Expr1008],[Expr1006])=GetRangeWithMismatchedTypes(NULL,'2005-05-29',(42))))
                        |  |  |--Constant Scan
                        |  |--Index Seek(OBJECT:([idx_payment_i1]), SEEK:([customer_id] IN (1, 2, 3) AND [payment_date] > [Expr1007] AND [payment_date] < [Expr1008]))
                        |--RID Lookup(OBJECT:([payment]))

Certainly, this is a bit cryptic to read but it really means the same thing as always: The manual optimisation worked and we got a better plan.

Meh, does it matter?

I hope so! Let’s benchmark these things against each other! Our benchmarking technique is described in our previous post and on this page here. Specifically, we don’t publish actual execution times, only relative times within each benchmark, as we do not want to compare databases against each other but only against themselves.
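As an illustration of what “relative times” means, the ratios can be thought of as each measurement divided by the fastest one in the same benchmark, so that the best run shows up as 1.0000. This is an assumption about the normalisation, not the actual benchmark code; the function name and the timings below are hypothetical.

```python
def relative_ratios(timings):
    """Divide every timing by the fastest one, so the best measurement is 1.0."""
    fastest = min(timings.values())
    return {key: elapsed / fastest for key, elapsed in timings.items()}


# Hypothetical wall-clock seconds, keyed by (run, statement).
timings = {
    (1, 1): 0.75,
    (1, 2): 0.25,
    (2, 1): 0.50,
    (2, 2): 0.25,
}

ratios = relative_ratios(timings)
for (run, stmt), ratio in sorted(ratios.items()):
    print(f"Run {run}, Statement {stmt}: {ratio:.5f}")
```

Publishing only such ratios hides the absolute speed of each product while still showing how much slower one statement is than the other on the same database.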

DB2 LUW 10.5

RUN |STMT |RATIO  |
----|-----|-------|
1   |1    |3.0890 |
1   |2    |1.2272 |
2   |1    |3.0624 |
2   |2    |1.0100 |
3   |1    |3.0389 |
3   |2    |1.0000 |
4   |1    |3.1566 |
4   |2    |1.0948 |
5   |1    |3.1817 |
5   |2    |1.0905 |

The manually optimised statement is 3x faster in our benchmark. Do bear in mind that we’re operating on a rather small data set of only about 16,000 rows! The gap gets worse on a larger data set.

MySQL 8.0.2

The difference is devastating in MySQL 8.0.2, which just recently introduced window functions. Surely, the MySQL team will be able to apply some further optimisations prior to GA – I’ve filed an issue for review:

RUN |STMT |RATIO    |
----|-----|---------|
0   |1    |431.1905 |
0   |2    |1.0000   |
1   |1    |372.4286 |
1   |2    |1.0000   |
2   |1    |413.4762 |
2   |2    |1.0000   |
3   |1    |381.2857 |
3   |2    |1.0000   |
4   |1    |400.1429 |
4   |2    |1.2857   |

Oracle 12.2.0.1

Another factor 4x can be observed in Oracle:

Run 1, Statement 1 : 4.58751
Run 1, Statement 2 : 1.37639
Run 2, Statement 1 : 4.71833
Run 2, Statement 2 : 1.03693
Run 3, Statement 1 : 4.05729
Run 3, Statement 2 : 1.04719
Run 4, Statement 1 : 3.86653
Run 4, Statement 2 : 1
Run 5, Statement 1 : 3.99603
Run 5, Statement 2 : 1.0212

PostgreSQL 10

PostgreSQL is quite bad too, here. A factor 7x can be observed:

RUN 1, Statement 1: 7.23373
RUN 1, Statement 2: 1.01438
RUN 2, Statement 1: 6.62028
RUN 2, Statement 2: 1.26183
RUN 3, Statement 1: 8.40322
RUN 3, Statement 2: 1.04074
RUN 4, Statement 1: 6.33401
RUN 4, Statement 2: 1.06750
RUN 5, Statement 1: 6.41649
RUN 5, Statement 2: 1.00000

SQL Server 2014

Another very significant penalty in SQL Server for the unoptimised version:

Run 1, Statement 1: 29.50000
Run 1, Statement 2: 1.07500
Run 2, Statement 1: 28.15000
Run 2, Statement 2: 1.00000
Run 3, Statement 1: 28.00000
Run 3, Statement 2: 1.00000
Run 4, Statement 1: 28.00000
Run 4, Statement 2: 1.00000
Run 5, Statement 1: 31.07500
Run 5, Statement 2: 1.00000

Bad news for views. Is there a better solution?

This is rather bad news for window functions inside of reusable views. None of the databases, not even DB2 or Oracle, can push down range predicates past a derived table’s window function if the column in the range predicate isn’t part of the window function’s PARTITION BY clause.

The problem described above can be easily fixed when the query is written manually, expanding all possible views into their calling SQL, but that kind of sucks: we’d love to make our code reusable. There’s one solution in databases that support inline table-valued functions. Among the tested databases, these include:

  • DB2
  • PostgreSQL
  • SQL Server

MySQL doesn’t have table valued functions, and Oracle’s (very regrettably) are not inlineable because they have to be written in PL/SQL.

Here’s how to write these functions:

DB2

Function definition:

CREATE OR REPLACE FUNCTION f_payment_with_revenue (
  p_customer_id BIGINT,
  p_from_date DATE,
  p_to_date DATE
)
RETURNS TABLE (
  customer_id BIGINT,
  payment_date DATE,
  amount DECIMAL(10, 2),
  cumulative_amount DECIMAL(10, 2)
)
LANGUAGE SQL
RETURN
SELECT *
FROM (
  SELECT
    customer_id,
    payment_date,
    amount,
    SUM(amount) OVER (
      PARTITION BY customer_id
      ORDER BY payment_date, payment_id
    ) cumulative_amount
  FROM payment
  WHERE customer_id = p_customer_id
  AND payment_date <= p_to_date
) t
WHERE payment_date >= p_from_date;

Function call:

SELECT 
  payment_date,
  amount,
  cumulative_amount
FROM (
  SELECT customer_id FROM customer WHERE customer_id IN (1, 2, 3)
) c(customer_id),
TABLE(sakila.f_payment_with_revenue(
  c.customer_id,
  CAST('2005-05-25' AS DATE),
  CAST('2005-05-29' AS DATE)
))
ORDER BY payment_date;

Execution plan:

Explain Plan                                                    
----------------------------------------------------------------
ID | Operation                     |                 Rows | Cost
 1 | RETURN                        |                      |   33
 2 |  TBSCAN                       |     4 of 4 (100.00%) |   33
 3 |   SORT                        |     4 of 4 (100.00%) |   33
 4 |    NLJOIN                     |               4 of 1 |   33
 5 |     NLJOIN                    |               3 of 1 |   20
 6 |      TBSCAN GENROW            |     3 of 3 (100.00%) |    0
 7 |      IXSCAN PK_CUSTOMER       |   1 of 599 (   .17%) |    6
 8 |     FILTER                    |     1 of 1 (100.00%) |   13
 9 |      TBSCAN                   |     1 of 1 (100.00%) |   13
10 |       SORT                    |     1 of 1 (100.00%) |   13
11 |        FETCH PAYMENT          |     1 of 1 (100.00%) |   13
12 |         IXSCAN IDX_PAYMENT_I1 | 1 of 16049 (   .01%) |    6
                                                                
Predicate Information                                           
  5 - JOIN (Q3.CUSTOMER_ID = Q2.$C0)                            
  7 - START (Q3.CUSTOMER_ID = Q2.$C0)                           
       STOP (Q3.CUSTOMER_ID = Q2.$C0)                           
  8 - RESID ('2005-05-25' <= Q6.PAYMENT_DATE)                   
 12 - START (Q4.CUSTOMER_ID = Q3.CUSTOMER_ID)                   
       STOP (Q4.CUSTOMER_ID = Q3.CUSTOMER_ID)                   
       STOP (Q4.PAYMENT_DATE <= '2005-05-29')                   

Much better!

Benchmark result (Statement 1 = function call, Statement 2 = manually optimised):

RUN |STMT |RATIO  |
----|-----|-------|
1   |1    |1.5945 |
1   |2    |1.0080 |
2   |1    |1.6310 |
2   |2    |1.0768 |
3   |1    |1.5827 |
3   |2    |1.0090 |
4   |1    |1.5486 |
4   |2    |1.0084 |
5   |1    |1.5569 |
5   |2    |1.0000 |

Definitely a huge improvement over the view. The comparison with the manually optimised query might not be entirely fair because:

  • CROSS APPLY / LATERAL unnesting tends to generate nested loops that could be written more optimally with a classic join
  • We have an additional auxiliary customer table access (which could probably be tuned away with another rewrite)

PostgreSQL

Function definition:

CREATE OR REPLACE FUNCTION f_payment_with_revenue (
  p_customer_id BIGINT,
  p_from_date DATE,
  p_to_date DATE
)
RETURNS TABLE (
  customer_id SMALLINT,
  payment_date TIMESTAMP,
  amount DECIMAL(10, 2),
  cumulative_amount DECIMAL(10, 2)
)
AS $$
SELECT *
FROM (
  SELECT
    customer_id,
    payment_date,
    amount,
    SUM(amount) OVER (
      PARTITION BY customer_id
      ORDER BY payment_date, payment_id
    ) cumulative_amount
  FROM payment
  WHERE customer_id = p_customer_id
  AND payment_date <= p_to_date
) t
WHERE payment_date >= p_from_date
$$ LANGUAGE SQL;

Function call:

SELECT 
  payment_date,
  amount,
  cumulative_amount
FROM (
  SELECT customer_id FROM customer WHERE customer_id IN (1, 2, 3)
) c(customer_id)
CROSS JOIN LATERAL f_payment_with_revenue(
  c.customer_id,
  CAST('2005-05-25' AS DATE),
  CAST('2005-05-29' AS DATE)
)
ORDER BY payment_date;

Execution plan:

QUERY PLAN                                                                                    
----------------------------------------------------------------------------------------------
Sort  (cost=250.39..257.89 rows=3000 width=72)                                                
  Sort Key: f_payment_with_revenue.payment_date                                               
  ->  Nested Loop  (cost=0.53..77.13 rows=3000 width=72)                                      
        ->  Index Only Scan using customer_pkey on customer  (cost=0.28..16.88 rows=3 width=4)
              Index Cond: (customer_id = ANY ('{1,2,3}'::integer[]))                          
        ->  Function Scan on f_payment_with_revenue  (cost=0.25..10.25 rows=1000 width=72)    

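Oops, no unnesting of the function is happening, and the cardinality estimate defaults to 1000 rows per call. That’s bad news! One likely reason: the function is declared without a volatility category and thus defaults to VOLATILE, whereas PostgreSQL only considers inlining set-returning SQL functions that are declared STABLE or IMMUTABLE. That variant was not benchmarked here.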

Benchmark result (Statement 1 = function call, Statement 2 = manually optimised):

RUN 1, Statement 1: 25.77538
RUN 1, Statement 2: 1.00000
RUN 2, Statement 1: 27.55197
RUN 2, Statement 2: 1.11581
RUN 3, Statement 1: 27.99331
RUN 3, Statement 2: 1.16463
RUN 4, Statement 1: 29.11022
RUN 4, Statement 2: 1.01159
RUN 5, Statement 1: 26.65781
RUN 5, Statement 2: 1.01654

Rats. This has gotten much worse than with the view. Not surprising, though. Table-valued functions are not that good of an idea when they are not inlined! Oracle would have had a similar result if I weren’t too lazy to translate my function to an ordinary PL/SQL table-valued function, or a pipelined function.

SQL Server

Function definition:

CREATE FUNCTION f_payment_with_revenue (
  @customer_id BIGINT,
  @from_date DATE,
  @to_date DATE
)
RETURNS TABLE
AS RETURN
SELECT *
FROM (
  SELECT
    customer_id,
    payment_date,
    amount,
    SUM(amount) OVER (
      PARTITION BY customer_id
      ORDER BY payment_date, payment_id
    ) cumulative_amount
  FROM payment
  WHERE customer_id = @customer_id
  AND payment_date <= @to_date
) t
WHERE payment_date >= @from_date;

Function call:

SELECT 
  payment_date,
  amount,
  cumulative_amount
FROM (
  SELECT customer_id FROM customer WHERE customer_id IN (1, 2, 3)
) AS c(customer_id)
CROSS APPLY f_payment_with_revenue(
  c.customer_id,
  CAST('2005-05-25' AS DATE),
  CAST('2005-05-29' AS DATE)
)
ORDER BY payment_date;

Execution plan:

|--Sort(ORDER BY:([payment_date] ASC))
   |--Nested Loops(Inner Join, OUTER REFERENCES:([customer_id]))
      |--Index Seek(OBJECT:([PK__customer__CD65CB84E826462D]), SEEK:([customer_id] IN (1, 2, 3))
      |--Filter(WHERE:([payment_date]>='2005-05-25'))
         |--Compute Scalar(DEFINE:([Expr1006]=CASE WHEN [Expr1007]=(0) THEN NULL ELSE [Expr1008] END))
            |--Stream Aggregate(GROUP BY:([WindowCount1014]) DEFINE:(..)))
               |--Window Spool(RANGE BETWEEN:(UNBOUNDED, [[payment_date], [payment_id]]))
                  |--Segment
                     |--Segment
                        |--Sort(ORDER BY:([payment_date] ASC, [payment_id] ASC))
                           |--Nested Loops(Inner Join, OUTER REFERENCES:([Bmk1003]))
                              |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1010], [Expr1011], [Expr1009]))
                              |  |--Compute Scalar(DEFINE:(([Expr1010],[Expr1011],[Expr1009])=GetRangeWithMismatchedTypes(NULL,'2005-05-29',(42))))
                              |  |  |--Constant Scan
                              |  |--Index Seek(OBJECT:([idx_payment_i1]), SEEK:([customer_id]=CONVERT_IMPLICIT(bigint,[customer_id],0) AND [payment_date] > [Expr1010] AND [payment_date] < [Expr1011]))
                              |--RID Lookup(OBJECT:([payment]), SEEK:([Bmk1003]=[Bmk1003]))

Again, super unreadable IMO, but after looking a bit more closely, we can see that the plan is almost the same as the manually optimised one, and the predicate is applied early on, where it belongs.

Benchmark result (Statement 1 = function call, Statement 2 = manually optimised):

Run 1, Statement 1: 2.50000
Run 1, Statement 2: 1.27778
Run 2, Statement 1: 2.11111
Run 2, Statement 2: 1.27778
Run 3, Statement 1: 2.11111
Run 3, Statement 2: 1.00000
Run 4, Statement 1: 2.22222
Run 4, Statement 2: 1.11111
Run 5, Statement 1: 2.02778
Run 5, Statement 2: 1.19444

Conclusion

Window functions are super cool and powerful. But they come at a price. They sort your data. Normally, when we write complex queries and reuse parts in views, we can profit from predicate push down operations into derived tables and views, which is something that most databases support (see also our previous blog post about such optimisations).

But when it comes to using window functions, they act like a “fence”, past which only a few predicates can be pushed automatically. It’s not that it wouldn’t be possible; it simply isn’t done very well by most databases (and in the case of MySQL, not at all as of 8.0.2).

Inline table-valued functions can be a remedy to avoid manual building of complex queries, such that at least some parts of your logic can be reused among queries. Unfortunately, they rely on CROSS APPLY or LATERAL JOIN, which can also cause performance issues in more complex setups. Besides, among the databases covered in this article, only DB2 and SQL Server inlined our table-valued functions. Oracle (in the tested version) doesn’t support SQL functions at all, and PostgreSQL did not inline the SQL function as written here, which means that in these databases, in order to tune such queries, you might not be able to reuse the parts that use window functions in views or stored functions.

However, as always, do measure. Perhaps, a 4x waste of performance for a particular query is OK.