How to Work Around ORA-38104: Columns referenced in the ON Clause cannot be updated

Standard SQL is a beautiful language. Vendor specific implementations, however, have their warts. In Oracle, for example, it’s not possible to update any columns in a MERGE statement, which have been referenced by the ON clause. For example:

CREATE TABLE person (
  id NUMBER(18) NOT NULL PRIMARY KEY,
  user_name VARCHAR2(50) NOT NULL UNIQUE,
  score NUMBER(18)
);

Now, in MySQL, we can run a non-standard INSERT .. ON DUPLICATE KEY UPDATE statement like this:

INSERT INTO person (id, user_name, score)
VALUES (1, 'foo', 100)
ON DUPLICATE KEY UPDATE
  SET user_name = 'foo', score = 100

Behind the scenes, MySQL will check all unique constraints for duplicates and reject the insert, replacing it by the update statement instead. It’s debatable whether this is really useful (ideally, we want to check only a single unique constraint for duplicates), but that’s what MySQL offers.

In case we want to run the same behaviour by Oracle, we could use the MERGE statement:

MERGE INTO person t
USING (
  SELECT 1 id, 'foo' user_name, 100 score
  FROM dual
) s
ON (t.id = s.id OR t.user_name = s.user_name)
WHEN MATCHED THEN UPDATE
  SET t.user_name = s.user_name, t.score = 100
WHEN NOT MATCHED THEN INSERT (id, user_name, score)
  VALUES (s.id, s.user_name, s.score)

That looks reasonable, but it doesn’t work. We’ll get:

SQL-Fehler: ORA-38104: Columns referenced in the ON Clause cannot be updated: “T”.”USER_NAME”

Obviously, this is some protection against the situation where such an update would suddenly move a row from the matched to the not matched group. In this particular example, it might not look like something that could cause problems, but if vendor specific extensions such as the WHERE or DELETE clause would be used, things might look different.

However, the parser is not very smart, in fact, it is almost not smart at all. While it detects extremely silly attempts at circumventing this limitation, such as this:

MERGE INTO person t
USING (
  SELECT 1 id, 'foo' user_name, 100 score
  FROM dual
) s
-- Circumvention attempt here: NVL()
ON (t.id = s.id OR nvl(t.user_name, null) = s.user_name)
WHEN MATCHED THEN UPDATE
  SET t.user_name = s.user_name, t.score = 100
WHEN NOT MATCHED THEN INSERT (id, user_name, score)
  VALUES (s.id, s.user_name, s.score)

It does not detect any of these attempts:

Using row value expressions

MERGE INTO person t
USING (
  SELECT 1 id, 'foo' user_name, 100 score
  FROM dual
) s
ON (t.id = s.id OR 
-- Circumvention attempt here: row value expressions
  (t.user_name, 'dummy') = ((s.user_name, 'dummy')))
WHEN MATCHED THEN UPDATE
  SET t.user_name = s.user_name, t.score = 100
WHEN NOT MATCHED THEN INSERT (id, user_name, score)
  VALUES (s.id, s.user_name, s.score)

Seemingly without any penalty on the execution plan. Both indexes are being used:

---------------------------------------------------------------------------
| Id  | Operation                               | Name            | Rows  |
---------------------------------------------------------------------------
|   0 | MERGE STATEMENT                         |                 |     1 |
|   1 |  MERGE                                  | PERSON          |       |
|   2 |   VIEW                                  |                 |       |
|   3 |    NESTED LOOPS OUTER                   |                 |     1 |
|   4 |     FAST DUAL                           |                 |     1 |
|   5 |     VIEW                                | VW_LAT_8626BD41 |     1 |
|   6 |      TABLE ACCESS BY INDEX ROWID BATCHED| PERSON          |     1 |
|   7 |       BITMAP CONVERSION TO ROWIDS       |                 |       |
|   8 |        BITMAP OR                        |                 |       |
|   9 |         BITMAP CONVERSION FROM ROWIDS   |                 |       |
|* 10 |          INDEX RANGE SCAN               | SYS_C00106110   |       |
|  11 |         BITMAP CONVERSION FROM ROWIDS   |                 |       |
|* 12 |          INDEX RANGE SCAN               | SYS_C00106111   |       |
---------------------------------------------------------------------------

Correlated subquery

MERGE INTO person t
USING (
  SELECT 1 id, 'foo' user_name, 100 score
  FROM dual
) s
ON (t.id = s.id OR 
-- Circumvention attempt here: correlated subquery
  (SELECT t.user_name FROM dual) = s.user_name)
WHEN MATCHED THEN UPDATE
  SET t.user_name = s.user_name, t.score = 100
WHEN NOT MATCHED THEN INSERT (id, user_name, score)
  VALUES (s.id, s.user_name, s.score)

This seems to prevent any index usage, and should thus be avoided:

----------------------------------------------------------
| Id  | Operation              | Name            | Rows  |
----------------------------------------------------------
|   0 | MERGE STATEMENT        |                 |     1 |
|   1 |  MERGE                 | PERSON          |       |
|   2 |   VIEW                 |                 |       |
|   3 |    NESTED LOOPS OUTER  |                 |     1 |
|   4 |     FAST DUAL          |                 |     1 |
|   5 |     VIEW               | VW_LAT_1846A928 |     1 |
|*  6 |      FILTER            |                 |       |
|   7 |       TABLE ACCESS FULL| PERSON          |     1 |
|   8 |       FAST DUAL        |                 |     1 |
----------------------------------------------------------

Using NVL() and updating a view instead

Just plain simple usage of NVL() inside of the ON clause didn’t work before. The parser was smart enough to detect that. But it isn’t smart enough to detect NVL() inside of a view / derived table.

MERGE INTO (
  SELECT id, user_name, nvl(user_name, null) n, score
  FROM person
) t
USING (
  SELECT 1 id, 'foo' user_name, 100 score
  FROM dual
) s
-- Circumvention attempt here: renamed column
ON (t.id = s.id OR t.n = s.user_name)
WHEN MATCHED THEN UPDATE
  SET t.user_name = s.user_name, t.score = 100
WHEN NOT MATCHED THEN INSERT (id, user_name, score)
  VALUES (s.id, s.user_name, s.score)

Notice that both USER_NAME and N columns are the same thing, but the parser doesn’t notice this and thinks we’re fine.

The execution plan is still optimal, as Oracle seems to have a way to optimise NVL() expressions (but not coalesce and others!):

---------------------------------------------------------------------------
| Id  | Operation                               | Name            | Rows  |
---------------------------------------------------------------------------
|   0 | MERGE STATEMENT                         |                 |     1 |
|   1 |  MERGE                                  | PERSON          |       |
|   2 |   VIEW                                  |                 |       |
|   3 |    NESTED LOOPS OUTER                   |                 |     1 |
|   4 |     FAST DUAL                           |                 |     1 |
|   5 |     VIEW                                | VW_LAT_46651921 |     1 |
|   6 |      TABLE ACCESS BY INDEX ROWID BATCHED| PERSON          |     1 |
|   7 |       BITMAP CONVERSION TO ROWIDS       |                 |       |
|   8 |        BITMAP OR                        |                 |       |
|   9 |         BITMAP CONVERSION FROM ROWIDS   |                 |       |
|* 10 |          INDEX RANGE SCAN               | SYS_C00106110   |       |
|  11 |         BITMAP CONVERSION FROM ROWIDS   |                 |       |
|* 12 |          INDEX RANGE SCAN               | SYS_C00106111   |       |
---------------------------------------------------------------------------

Using the WHERE clause

If we hadn’t had an OR predicate in our ON clause, but a AND predicate, then we could have used the WHERE clause in Oracle. This works:

-- NOT the same query as the original one!
MERGE INTO person t
USING (
  SELECT 1 id, 'foo' user_name, 100 score
  FROM dual
) s
ON (t.id = s.id)
WHEN MATCHED THEN UPDATE
  SET t.user_name = s.user_name, t.score = 100
  WHERE t.user_name = s.user_name
WHEN NOT MATCHED THEN INSERT (id, user_name, score)
  VALUES (s.id, s.user_name, s.score);

This is not the same query as the original one. I just listed it here for completeness’ sake. Also to remind readers of the fact that this approach as well doesn’t seem to use indexes optimally. Only the primary key index (from the ON clause) seems to be used. The unique key is not being used:

----------------------------------------------------------------
| Id  | Operation                      | Name          | Rows  |
----------------------------------------------------------------
|   0 | MERGE STATEMENT                |               |     1 |
|   1 |  MERGE                         | PERSON        |       |
|   2 |   VIEW                         |               |       |
|   3 |    NESTED LOOPS OUTER          |               |     1 |
|   4 |     VIEW                       |               |     1 |
|   5 |      FAST DUAL                 |               |     1 |
|   6 |     TABLE ACCESS BY INDEX ROWID| PERSON        |     1 |
|*  7 |      INDEX UNIQUE SCAN         | SYS_C00106110 |     1 |
----------------------------------------------------------------

Careful

Be careful when applying the above workarounds. Assuming that ORA-38104 is a good thing (i.e. that Oracle still thinks it should be enforced), then the above workarounds simply expose bugs in the parser, which should detect such cases. The above behaviour has been observed in Oracle 12c and 18c.

I personally believe that ORA-38104 should be abandoned entirely, and the root cause for this restriction should be removed. But it is certainly worth exploring alternative options rather than relying on the above workarounds in production code, apart from the occasional one-shot migration query, where such loop holes are always nice tools to exploit.

How to Aggregate an Archive Log’s Deltas into a Snapshot with SQL

A customer of my popular SQL training (which you should book!) has recently challenged me to optimise a hierarchical query that merges an archive log’s deltas in order to obtain a snapshot of some record at a given point in time. In this article, I will reproduce their problem statement in a simplified version and show how this can be done with SQL Server, using a few cool SQL features:

All of these are topics covered in the training, which were immediately applicable to this problem statement.

The problem statement

This was their archive design. They designed for uncertainty, meaning that for some entities in their system, they did not know what kinds of attributes will be part of the entity in the future. Given their application design, users could even add their own custom attributes to an entity.

This kind of thing is typically solved with the EAV (Entity Attribute Value) model, a “workaround” to denormalise data sets in SQL databases in the event of such schema uncertainty.

EAV can be implemented in several ways:

Through classic SQL tables only

An example implementation is this:

CREATE TABLE eav_classic (
  entity_type     VARCHAR (100) NOT NULL,
  entity_id       BIGINT        NOT NULL,
  attribute_name  VARCHAR (100) NOT NULL,
  attribute_type  VARCHAR (100) NOT NULL,
  attribute_value VARCHAR (100)     NULL,

  CONSTRAINT eav_classic_pk 
    PRIMARY KEY (entity_type, entity_id, attribute_name)
);

The drawbacks of this non-normalised design are immediately obvious. Most specifically, there is no simple way to establish referential integrity. But this may be totally OK, especially for archive logs, and for smaller databases (datomic does something similar)

Through tables containing JSON or XML data

Whenever you have schema-on-read data, JSON or XML data types may be appropriate, so this is a perfectly valid alternative:

CREATE TABLE eav_json (
  entity_type     VARCHAR (100)   NOT NULL,
  entity_id       BIGINT          NOT NULL,
  attributes      VARCHAR (10000) NOT NULL 
    CHECK (ISJSON(attributes) = 1),

  CONSTRAINT eav_json_pk 
    PRIMARY KEY (entity_type, entity_id)
);

If your database supports a JSON data type, obviously, you will prefer that over the above emulation

For the rest of this article, I will use the JSON

Versioning the EAV table

Versioning data in an EAV model is quite easier than in a normalised schema. We can just add a version number and/or timestamp to the record. In their case, something like this may make sense:

CREATE TABLE history (
  id          BIGINT IDENTITY (1, 1) NOT NULL PRIMARY KEY,
  ts          DATETIME               NOT NULL,
  entity_type VARCHAR(100)           NOT NULL,
  entity_id   BIGINT                 NOT NULL,
  delta       VARCHAR(8000)          NOT NULL 
    CHECK (ISJSON(delta) = 1)
);

INSERT INTO history (entity_type, entity_id, ts, delta)
VALUES ('Person', 1, '2000-01-01 00:00:00', '{"first_name": "John", "last_name": "Doe"}'),
       ('Person', 1, '2000-01-01 01:00:00', '{"age": 37}'),
       ('Person', 1, '2000-01-01 02:00:00', '{"age": 38}'),
       ('Person', 1, '2000-01-01 03:00:00', '{"city": "New York"}'),
       ('Person', 1, '2000-01-01 04:00:00', '{"city": "Zurich", "age": null}')
;

This table now contains a set of deltas applied to the Person entity with ID = 1. It corresponds to the following sequence of SQL statements on an ordinary entity:

INSERT INTO person (id, first_name, last_name) 
  VALUES ('John', 'Doe');
UPDATE person SET age = 37 WHERE id = 1;
UPDATE person SET age = 38 WHERE id = 1;
UPDATE person SET city = 'New York' WHERE id = 1;
UPDATE person SET city = 'Zurich', age = null WHERE id = 1;

You could even see their hand-written log like a transaction log of the database system, kinda like what you can extract using products like Golden Gate or Debezium. If you think of the transaction log as an event stream, the RDBMS’s current data representation is like a snapshot that you can get when applying any number of deltas to your tables.

Sometimes, you don’t want to completely change your architecture and go full “event sourcing”, but just need this kind of log for a specific set of auditable entities. And e.g. for reasons like still supporting very old SQL Server versions, as well as supporting other databases, you may choose also not to use the SQL:2011 temporal table feature, which has also been implemented in SQL Server 2016 and more recent versions.

With that out of our way…

How to access any arbitrary snapshot version?

When we visually process our HISTORY table, we can see that Person ID = 1 had the following values at any given time:

TIME        FIRST_NAME    LAST_NAME    AGE    CITY
------------------------------------------------------
00:00:00    John          Doe
01:00:00    John          Doe          37
02:00:00    John          Doe          38
03:00:00    John          Doe          38     New York
04:00:00    John          Doe                 Zurich

Remember, this is always the same record of Person ID = 1, its snapshots represented at different times in the time axis. The goal here is to be able to find the record of John Doe at any given time.

Again, if we had been using the SQL:2011 temporal table feature, we could write

-- SQL Server
SELECT * 
FROM Person
FOR SYSTEM_TIME AS OF '2000-01-01 02:00:00.0000000'; 

-- Oracle (flashback query)
SELECT *
FROM Person
AS OF TIMESTAMP TIMESTAMP '2000-01-01 02:00:00'

Side note: Do note that Oracle’s flashback query needs to be properly configured:

  • Not all data is “flashbackable”
  • DDL tends to destroy the archive
  • Proper grants are needed to access the flashback archive

Similar limitations may apply in SQL Server.

What if the RDBMS can’t help us?

If again for some reason, we cannot use the RDBMS’s temporal table features, we’ll roll our own as we’ve seen. So, our query in SQL Server to access the snapshot at any given time may be this:

SELECT 
  '{' 
+ string_agg(
    CASE type WHEN 0 THEN NULL ELSE 
      '"' + [key] + '": ' + 
      CASE type WHEN 1 THEN '"' + value + '"' ELSE value END
    END, ', ') 
+ '}'
FROM (
  SELECT *, row_number() OVER (
    PARTITION BY [key] ORDER BY ts DESC) rn
  FROM history
  OUTER APPLY openjson(delta)
  
  -- Apply all deltas prior to any given snapshot
  WHERE ts <= '2000-01-01 02:00:00'
) t
WHERE rn = 1;

What does this query do? Consider again our deltas at 04:00:00:

TIME        FIRST_NAME    LAST_NAME    AGE    CITY
------------------------------------------------------
00:00:00    John          Doe
01:00:00    John          Doe          37
02:00:00    John          Doe          38
03:00:00    John          Doe          38     New York
04:00:00    John          Doe          -      Zurich

Observe how each value has some color encoding:

  • Strong, red: The current snapshot’s attribute value, when the last delta was applied to any given attribute
  • Strong, black: A previous snapshot’s attribute value, when a previous, superseded delta was applied to any given attribute
  • Light grey: A previous snapshot’s attribute value that was inherited from another previous delta

For any given snapshot, we want to find the Strong, red values. E.g. at a previous snapshot time, the color encoding would have been:

At 03:00:00

TIME        FIRST_NAME    LAST_NAME    AGE    CITY
------------------------------------------------------
00:00:00    John          Doe
01:00:00    John          Doe          37
02:00:00    John          Doe          38
03:00:00    John          Doe          38     New York

04:00:00    John          Doe          -      Zurich

At 02:00:00

TIME        FIRST_NAME    LAST_NAME    AGE    CITY
------------------------------------------------------
00:00:00    John          Doe
01:00:00    John          Doe          37
02:00:00    John          Doe          38

03:00:00    John          Doe          38     New York
04:00:00    John          Doe          -      Zurich

So, our query needs to find the delta that was applied last for any given attribute.

With SQL, we can find that easily. We can assign a row number to each delta per attribute in reverse order, something like this:

TIME        FIRST_NAME    LAST_NAME    AGE    CITY
------------------------------------------------------
00:00:00    John (1)      Doe (1)
01:00:00    John          Doe          37 (3)
02:00:00    John          Doe          38 (2)
03:00:00    John          Doe          38     New York (2)
04:00:00    John          Doe          - (1)  Zurich (1)

Once we have that row number, we just filter out only those deltas whose row number is 1. Something like:

SELECT [key], value, row_number() OVER (
  PARTITION BY [key] ORDER BY ts DESC) rn
FROM history OUTER APPLY openjson(delta)
ORDER BY [key], ts;

Notice the OUTER APPLY openjson(delta) syntax. This just expands the JSON structure into key/value/type columns, which we can use more easily in a SQL query. Other database systems may have similar syntax for similar purposes. The result of the above query is:

key        |value    |rn 
-----------|---------|---
age        |37       |3  
age        |38       |2  
age        |         |1  
city       |New York |2  
city       |Zurich   |1  
first_name |John     |1  
last_name  |Doe      |1  

Filtering the ones whose row number is 1:

SELECT [key], value
FROM (
  SELECT ts, [key], value, row_number() OVER (
    PARTITION BY [key] ORDER BY ts DESC) rn
  FROM history OUTER APPLY openjson(delta)
) t
WHERE rn = 1
ORDER BY ts, [key]

This yields:

key        |value  
-----------|-------
first_name |John   
last_name  |Doe    
age        |       
city       |Zurich 

Exactly the data we wanted, in key/value form. Notice that this filtering step could have been done with DISTINCT ON in PostgreSQL, or with KEEP (DENSE_RANK FIRST ORDER BY ..) in Oracle – an exercise which I shall leave to the reader (feel free to leave the solution in the comments!)

And now, finally, just re-assemble the JSON using SQL Server 2017 STRING_AGG. PostgreSQL would offer us JSON_AGG here, Oracle has JSON_OBJECTAGG. With STRING_AGG, you have to take care of manually escaping all values according to JSON syntax rules, which is bad. In my example, I just replaced ” by \”. Other characters need escaping too, so if there is a built-in feature, use that instead of string processing.

The STRING_AGG function aggregates a CASE expression which translates different JSON data types into different formats, where:

  • 0 is NULL (and nulls are not aggregated)
  • 1 is string
  • everything else can be taken at its value for simplicity, e.g. numbers or booleans

Every value (except nulls) are prefixed by the JSON object’s attribute name (“key”).

SELECT 
  '{' 
+ string_agg(
    CASE type WHEN 0 THEN NULL ELSE 
      '"' + replace([key], '"', '\"') + '": ' + 
      CASE type WHEN 1 THEN '"' + replace(value, '"', '\"') + '"' ELSE value END
    END, ', ') 
+ '}'
FROM (
  SELECT *, row_number() OVER (
    PARTITION BY [key] ORDER BY ts DESC) rn
  FROM history
  OUTER APPLY openjson(delta)
  
  -- Apply all deltas prior to any given snapshot
  WHERE ts <= '2000-01-01 04:00:00'
) t
WHERE rn = 1;

This produces

{"city": "Zurich", "first_name": "John", "last_name": "Doe"}

A final query, that gets us the entire history of snapshots (watch the performance on this one, could definitely be optimised):

SELECT ts, (
  SELECT 
    '{' 
  + string_agg(
      CASE type WHEN 0 THEN NULL ELSE 
        '"' + replace([key], '"', '\"') + '": ' + 
        CASE type WHEN 1 THEN '"' + replace(value, '"', '\"') + '"' ELSE value END
      END, ', ') 
  + '}'
  FROM (
    SELECT *, row_number() OVER (
      PARTITION BY [key] ORDER BY ts DESC) rn
    FROM history
    OUTER APPLY openjson(delta)
    
    -- Apply all deltas prior to any given snapshot
    WHERE ts <= x.ts
  ) t
  WHERE rn = 1
)
FROM history x
GROUP BY ts;

It yields:

ts       |                                                                          
---------|--------------------------------------------------------------------------
00:00:00 |{"first_name": "John", "last_name": "Doe"}                                
01:00:00 |{"age": 37, "first_name": "John", "last_name": "Doe"}                     
02:00:00 |{"age": 38, "first_name": "John", "last_name": "Doe"}                     
03:00:00 |{"age": 38, "city": "New York", "first_name": "John", "last_name": "Doe"} 
04:00:00 |{"city": "Zurich", "first_name": "John", "last_name": "Doe"}              

So, the complete history of all the snapshot versions of the Person with ID = 1.

Very cool, and definitely good enough for their archive / audit query requirements.

How to Write a Multiplication Aggregate Function in SQL

Everyone knows the SQL SUM() aggregate function (and many people also know its window function variant).

When querying the Sakila database, we can get the daily revenue (using PostgreSQL syntax):

WITH p AS (
  SELECT
    CAST (payment_date AS DATE) AS date,
    amount
  FROM payment
)
SELECT
  date,
  SUM (amount) AS daily_revenue,
  SUM (SUM (amount)) OVER (ORDER BY date) AS cumulative_revenue
FROM p
GROUP BY date
ORDER BY date

The result will look something like this:

date       |daily_revenue |cumulative_revenue 
-----------|--------------|-------------------
2005-05-24 |29.92         |29.92              
2005-05-25 |573.63        |603.55             
2005-05-26 |754.26        |1357.81            
2005-05-27 |685.33        |2043.14            
2005-05-28 |804.04        |2847.18            
2005-05-29 |648.46        |3495.64            
2005-05-30 |628.42        |4124.06            
2005-05-31 |700.37        |4824.43            
2005-06-14 |57.84         |4882.27            
...

Doing the same with multiplication

This is already quite useful. Very occasionally, however, we do not need to aggregate multiple values in a sum (through addition), but in a product (through multiplication). I’ve just stumbled upon such a case on Stack Overflow, recently.

The question wanted to achieve the following result:

date        factor          accumulated
---------------------------------------
1986-01-10  null            1000
1986-01-13  -0.026595745    973.4042548
1986-01-14  0.005464481     978.7234036
1986-01-15  -0.016304348    962.7659569
1986-01-16  0               962.7659569
1986-01-17  0               962.7659569
1986-01-20  0               962.7659569
1986-01-21  0.005524862     968.0851061
1986-01-22  -0.005494506    962.765957
1986-01-23  0               962.765957
1986-01-24  -0.005524862    957.4468078
1986-01-27  0.005555556     962.7659569
1986-01-28  0               962.7659569
1986-01-29  0               962.7659569
1986-01-30  0               962.7659569
1986-01-31  0.027624309     989.3617013
1986-02-03  0.016129032     1005.319148
1986-02-04  0.042328041     1047.872338
1986-02-05  0.04568528      1095.744679

If this were a Microsoft Excel spreadsheet, the ACCUMULATED column would simply start with 1000 and have the following formula in all other rows:

accumulated(i) = accumulated(i - 1) * (1 + factor)

In other words (values truncated for simplicity):

1000.0 = start
 973.4 = 1000.0 * (1 - 0.026)
 978.7 =  973.4 * (1 + 0.005)
 962.7 =  978.7 * (1 - 0.016)
 962.7 =  962.7 * (1 - 0.000)
 962.7 =  962.7 * (1 - 0.000)
 962.7 =  962.7 * (1 - 0.000)
 968.0 =  962.7 * (1 + 0.005)
 ...

This is exciting because we’re not only requiring multiplicative aggregation, but even cumulative multiplicative aggregation. So, another window function.

But regrettably, SQL doesn’t offer a MUL() aggregate function, even if it were relatively simple to implement. We have two options:

  • Implementing a custom aggregate function (stay tuned for a future blog post)
  • Using a trick by summing logarithms, rather than multiplying operands directly

We’re implementing the latter for now. Check out this cool Wikipedia website about logarithmic identities, which we are going to blindly trust. In the middle of it, we have:

bx * by = bx + y

Which leads to:

logb(x * y) = logb(x) + logb(y)

How cool is that? And thus:

x * y = blogb(x) + logb(y)

So, we can define any multiplication in terms of a bunch of exponentiation to some base (say e) and logarithms to some base (say e). Or, in SQL:

x * y = EXP(LN(x) + LN(y))

Or, as an aggregate function:

MUL(x) = EXP(SUM(LN(x)))

Heh!

Our original problem can thus be solved very easily using this, as shown in my stack overflow answer:

SELECT
  date,
  factor,
  EXP(SUM(LN(1000 * (1 + COALESCE(factor, 1)))) 
       OVER (ORDER BY date)) AS accumulated
FROM t

And we get the nice result as previously shown. You may have to replace LN() by LOG() depending on your database.

Caveat: Negative numbers

Try running this:

SELECT LN(-1)

You’ll get:

SQL Error [2201E]: ERROR: cannot take logarithm of a negative number

Logarithms are defined only for strictly positive numbers, unless your database is capable of handling complex numbers as well. In case of which a single zero value would still break the aggregation.

But if your data set is defined to contain only strictly positive numbers, you’ll be fine – give or take some floating point rounding errors. Or, you’ll do some sign handling, which looks like this:

WITH v(i) AS (VALUES (-2), (-3), (-4))
SELECT 
  CASE 
    WHEN SUM (CASE WHEN i < 0 THEN -1 END) % 2 < 0 
    THEN -1 
    ELSE 1 
  END * EXP(SUM(LN(ABS(i)))) multiplication1
FROM v;

WITH v(i) AS (VALUES (-2), (-3), (-4), (-5))
SELECT 
  CASE 
    WHEN SUM (CASE WHEN i < 0 THEN -1 END) % 2 < 0 
    THEN -1 
    ELSE 1 
  END * EXP(SUM(LN(ABS(i)))) multiplication2
FROM v;

The above yielding

multiplication1      
--------------------
-23.999999999999993 


multiplication2     
-------------------
119.99999999999997 

Close enough.

Caveat: Zero

Try running this:

SELECT LN(0)

You’ll get:

SQL Error [2201E]: ERROR: cannot take logarithm of zero

Zero is different from negative numbers. A product that has a zero operand is always zero, so we should be able to handle this. We’ll do it in two steps:

  • Exclude zero values from the actual aggregation that uses EXP() and LN()
  • Add an additional CASE expression that checks if any of the operands is zero

The first step might not be necessary depending on how your database optimiser executes the second step.

WITH v(i) AS (VALUES (2), (3), (0))
SELECT 
  CASE 
    WHEN SUM (CASE WHEN i = 0 THEN 1 END) > 0
    THEN 0
    WHEN SUM (CASE WHEN i < 0 THEN -1 END) % 2 < 0 
    THEN -1 
    ELSE 1 
  END * EXP(SUM(LN(ABS(NULLIF(i, 0))))) multiplication
FROM v;

Extension: DISTINCT

Calculating the product of all DISTINCT values requires to repeat the DISTINCT keyword in 2 out of the above 3 sums:

WITH v(i) AS (VALUES (2), (3), (3))
SELECT 
  CASE 
    WHEN SUM (CASE WHEN i = 0 THEN 1 END) > 0
    THEN 0
    WHEN SUM (DISTINCT CASE WHEN i < 0 THEN -1 END) % 2 < 0 
    THEN -1 
    ELSE 1 
  END * EXP(SUM(DISTINCT LN(ABS(NULLIF(i, 0))))) multiplication
FROM v;

The result is now:

multiplication |
---------------|
6              |

Notice that the first SUM() that checks for the presence of NULL values doesn’t require a DISTINCT keyword, so we omit it to improve performance.

Extension: Window functions

Of course, if we are able to emulate a PRODUCT() aggregate function, we’d love to turn it into a window function as well. This can be done simply by transforming each individual SUM() into a window function:

WITH v(i, j) AS (
  VALUES (1, 2), (2, -3), (3, 4), 
         (4, -5), (5, 0), (6, 0)
)
SELECT i, j, 
  CASE 
    WHEN SUM (CASE WHEN j = 0 THEN 1 END) 
      OVER (ORDER BY i) > 0
    THEN 0
    WHEN SUM (CASE WHEN j < 0 THEN -1 END) 
      OVER (ORDER BY i) % 2 < 0 
    THEN -1 
    ELSE 1 
  END * EXP(SUM(LN(ABS(NULLIF(j, 0)))) 
    OVER (ORDER BY i)) multiplication
FROM v;

The result is now:

i |j  |multiplication      |
--|---|--------------------|
1 | 2 |2                   |
2 |-3 |-6                  |
3 | 4 |-23.999999999999993 |
4 |-5 |119.99999999999997  |
5 | 0 |0                   |
6 | 1 |0                   |

So cool! The cumulative product gets bigger and bigger until it hits he first zero, from then on it stays zero.

jOOQ support

jOOQ 3.12 will support this as well and emulate it correctly on all databases:
https://github.com/jOOQ/jOOQ/issues/5939

Beware of Hidden PL/SQL to SQL Context Switches

I recently stumbled upon a curious query on a customer’s productive Oracle database:

SELECT USER FROM SYS.DUAL

Two things caught my attention:

  • The query was executed many billions of times per month, accounting for about 0.3% of that system’s load. That’s 0.3% for something extremely silly!
  • I don’t think that customer would ever qualify the DUAL table as SYS.DUAL, which hints at some system functionality

I found it in Oracle Enterprise Manager, but you could also find it using a query like this one:

SELECT 
  sql_id, 
  executions, 
  elapsed_time, 
  ratio_to_report(elapsed_time) over() p, 
  sql_text
FROM v$sql
ORDER BY p DESC;

Why was this query being run so often? In Enterprise Manager, the query’s statistics overview displayed that the query originated from a function called STANDARD.USER (I don’t know yet where I could find this information in the dictionary views, manually).

Naively, I had always thought that the USER pseudo column or pseudo constant is some value from the context, but like many other functions, it’s really just a function in that package.

What does STANDARD.USER() do?

Now, I’m not 100% sure if that source code is something that I am allowed to reproduce from a legal perspective, this being Oracle and all. But if you run this query here, which I am freely allowing you to:

WITH s AS (
  SELECT s.*,
    MIN(CASE 
      WHEN upper(text) LIKE '%FUNCTION USER%' 
      THEN line END
    ) OVER () s
  FROM all_source s
  WHERE owner = 'SYS' 
  AND name = 'STANDARD'
  AND type = 'PACKAGE BODY'
)
SELECT text
FROM s
WHERE line >= s AND line < s + 6;

Then you might be able to see something like this:

  function USER return varchar2 is
  c varchar2(255);
  begin
        select user into c from sys.dual;
        return c;
  end;

This is just the result of some SQL query I’ve shown you. Any correspondence with actual source code is merely coincidental.

Let’s assume this were the actual source code of the STANDARD.USER() function. We can now clearly see that this very SQL query that I’ve observed before is being executed! Want to verify this?

Let’s benchmark

As always, I’m using the benchmark technique described here. The full benchmark logic is at the end of the article.

In essence, I’m comparing the performances of 500000 executions of this loop:

FOR i IN 1 .. v_repeat LOOP
  v := USER;
END LOOP;

With this one:

FOR i IN 1 .. v_repeat LOOP
  SELECT USER INTO v FROM dual;
END LOOP;

And this one:

FOR i IN 1 .. v_repeat LOOP
  -- Note: According to the doc, use 'SESSION_USER' instead
  v := sys_context('USERENV', 'CURRENT_USER');
END LOOP;

The result of this benchmark is:

Run 1, Statement 1 : 2.40509 (avg : 2.43158)
Run 1, Statement 2 : 2.13208 (avg : 2.11816)
Run 1, Statement 3 : 1.01452 (avg : 1.02081)

Run 2, Statement 1 : 2.41889 (avg : 2.43158)
Run 2, Statement 2 : 2.09753 (avg : 2.11816)
Run 2, Statement 3 : 1.00203 (avg : 1.02081)

Run 3, Statement 1 : 2.45384 (avg : 2.43158)
Run 3, Statement 2 : 2.09060 (avg : 2.11816)
Run 3, Statement 3 : 1.02239 (avg : 1.02081)

Run 4, Statement 1 : 2.39516 (avg : 2.43158)
Run 4, Statement 2 : 2.14140 (avg : 2.11816)
Run 4, Statement 3 : 1.06512 (avg : 1.02081)

Run 5, Statement 1 : 2.48493 (avg : 2.43158)
Run 5, Statement 2 : 2.12922 (avg : 2.11816)
Run 5, Statement 3 : 1.00000 (avg : 1.02081)

How to read this benchmark result? These aren’t actual times, which are not interesting, but relative times compared to the fastest run (run 5, statement 3 = 1). The explicit SELECT USER FROM DUAL is about half as fast as the SYS_CONTEXT call, and the USER call is a bit slower, even.

When re-running this query:

SELECT 
  sql_id, 
  executions, 
  ratio_to_report(elapsed_time) over() p, 
  sql_text
FROM v$sql
ORDER BY p DESC;

We can see:

SQL_ID          EXECUTIONS  P     SQL_TEXT
6r9s58qfu339c   1           0.26  DECLARE ...
1v717nvrhgbn9   2500000     0.14  SELECT USER FROM SYS.DUAL
...

So, this query has definitely been run way too many times, including the PL/SQL to SQL context switch that is involved.

I’m running this benchmark in Oracle 18.0.0.0.0 in Docker on a Windows machine. More close-to-the-metal and less virtualised setups might achieve more drastic results. See, e.g. Connor McDonald got a much better improvement from using SYS_CONTEXT:

In this particular case, The STANDARD.USER() reference was used very often in triggers to fill in audit columns of many tables. Very easy to fix. Just use sys_context('USERENV', 'CURRENT_USER') instead.

Full benchmark logic

SET SERVEROUTPUT ON

ALTER SYSTEM FLUSH SHARED_POOL;
ALTER SYSTEM FLUSH BUFFER_CACHE;

CREATE TABLE results (
  run     NUMBER(2),
  stmt    NUMBER(2),
  elapsed NUMBER
);

DECLARE
  v_ts TIMESTAMP WITH TIME ZONE;
  v_repeat CONSTANT NUMBER := 500000;
  v NUMBER;
BEGIN

  -- Repeat the whole benchmark several times to 
  -- avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := SYSTIMESTAMP;
      
    FOR i IN 1 .. v_repeat LOOP
      v := v + length(USER);
    END LOOP;
  
    INSERT INTO results VALUES (r, 1, 
      SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    v_ts := SYSTIMESTAMP;
      
    FOR i IN 1 .. v_repeat LOOP
      SELECT v + length(USER) INTO v FROM dual;
    END LOOP;
      
    INSERT INTO results VALUES (r, 2, 
      SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    v_ts := SYSTIMESTAMP;
      
    FOR i IN 1 .. v_repeat LOOP
      -- Note: According to the doc, use 'SESSION_USER' instead
      v := v + length(sys_context('USERENV', 'CURRENT_USER'));
    END LOOP;
      
    INSERT INTO results VALUES (r, 3, 
      SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
  END LOOP;
  
  FOR rec IN (
    SELECT 
      run, stmt, 
      CAST(elapsed / MIN(elapsed) OVER() AS NUMBER(10, 5)) ratio,
      CAST(AVG(elapsed) OVER (PARTITION BY stmt) / 
           MIN(elapsed) OVER() AS NUMBER(10, 5)) avg_ratio
    FROM results
    ORDER BY run, stmt
  )
  LOOP
    dbms_output.put_line('Run ' || rec.run || 
      ', Statement ' || rec.stmt || 
      ' : ' || rec.ratio || ' (avg : ' || rec.avg_ratio || ')');
  END LOOP;
  
  dbms_output.put_line('');
  dbms_output.put_line('Copyright Data Geekery GmbH');
  dbms_output.put_line('https://www.jooq.org/benchmark');
END;
/

DROP TABLE results;

Find the Next Non-NULL Row in a Series With SQL

I’ve stumbled across this fun SQL question on reddit, recently. The question was looking at a time series of data points where some events happened. For each event, we have the start time and the end time

timestamp             start    end
-----------------------------------
2018-09-03 07:00:00   1        null
2018-09-03 08:00:00   null     null
2018-09-03 09:00:00   null     null
2018-09-03 10:00:00   null     1
2018-09-03 12:00:00   null     null
2018-09-03 13:00:00   null     null
2018-09-03 14:00:00   1        null
2018-09-03 15:00:00   null     1

The desired output of the query should be this additional count column:

timestamp             start    end    count
-------------------------------------------
2018-09-03 07:00:00   1        null   4
2018-09-03 08:00:00   null     null   null
2018-09-03 09:00:00   null     null   null
2018-09-03 10:00:00   null     1      null
2018-09-03 12:00:00   null     null   null
2018-09-03 13:00:00   null     null   null
2018-09-03 14:00:00   1        null   2
2018-09-03 15:00:00   null     1      null

So, the rule is simple. Whenever an event starts, we would like to know how many consecutive entries it takes until the event stops again. We can visually see how that makes sense:

timestamp             start    end    count
-------------------------------------------
2018-09-03 07:00:00   1        null   4     -- 4 Rows in this event
2018-09-03 08:00:00   null     null   null
2018-09-03 09:00:00   null     null   null
2018-09-03 10:00:00   null     1      null

2018-09-03 12:00:00   null     null   null  -- No event here
2018-09-03 13:00:00   null     null   null

2018-09-03 14:00:00   1        null   2     -- 2 Rows in this event
2018-09-03 15:00:00   null     1      null

Some observations and assumptions about the problem at hand:

  • No two events will ever overlap
  • The time series does not progress monotonously, i.e. even if most data points are 1h apart, there can be larger or smaller gaps between data points
  • There are, however, no two identical timestamps in the series

How can we solve this problem?

Create the data set, first

We’re going to be using PostgreSQL for this example, but it will work with any database that supports window functions, which are most databases these days.

In PostgreSQL, we can use the VALUES() clause to generate data in memory easily. For the sake of simplicity, we’re not going to use timestamps, but integer representations of the timestamps. I’ve included the same out-of-the-ordinary gap between 4 and 6:

values (1, 1, null),
       (2, null, null),
       (3, null, null),
       (4, null, 1),
       (6, null, null),
       (7, null, null),
       (8, 1, null),
       (9, null, 1)

If we run this statement (yes, this is a standalone statement in PostgreSQL!), then the database will simply echo back the values we’ve sent it:

column1 |column2 |column3 |
--------|--------|--------|
1       |1       |        |
2       |        |        |
3       |        |        |
4       |        |1       |
6       |        |        |
7       |        |        |
8       |1       |        |
9       |        |1       |

How to deal with non-monotonously growing series

The fact that column1 is not growing monotonously means that we cannot use it / trust it as a means to calculate the length of an event. We need to calculate an additional column that has a guaranteed monotonously growing set of integers in it. The ROW_NUMBER() window function is perfect for that.

Consider this SQL statement:

with 
  d(a, b, c) as (
	values (1, 1, null),
	       (2, null, null),
	       (3, null, null),
	       (4, null, 1),
	       (6, null, null),
	       (7, null, null),
	       (8, 1, null),
	       (9, null, 1)
  ),
  t as (
    select 
      row_number() over (order by a) as rn, a, b, c
    from d
  )
select * from t;

The new rn column is a row number calculated for each row based on the ordering of a. For simplicity, I’ve aliased:

  • a = timestamp
  • b = start
  • c = end

The result of this query is:

rn |a |b |c |
---|--|--|--|
1  |1 |1 |  |
2  |2 |  |  |
3  |3 |  |  |
4  |4 |  |1 |
5  |6 |  |  |
6  |7 |  |  |
7  |8 |1 |  |
8  |9 |  |1 |

Nothing fancy yet.

Now, how to use this rn column to find the length of an event?

Visually, we can get the idea quickly, seeing that an event’s length can be calculated using the formula RN2 - RN1 + 1:

rn |a |b |c |
---|--|--|--|
1  |1 |1 |  | RN1 = 1
2  |2 |  |  |
3  |3 |  |  |
4  |4 |  |1 | RN2 = 4

5  |6 |  |  |
6  |7 |  |  |

7  |8 |1 |  | RN1 = 7
8  |9 |  |1 | RN2 = 8

We have two events:

  • 4 – 1 + 1 = 4
  • 8 – 7 + 1 = 2

So, all we have to do is for each starting point of an event at RN1, find the corresponding RN2, and run the arithmetic. This is quite a bit of syntax, but it isn’t so hard, so bear with me while I explain:

with 
  d(a, b, c) as (
	values (1, 1, null),
	       (2, null, null),
	       (3, null, null),
	       (4, null, 1),
	       (6, null, null),
	       (7, null, null),
	       (8, 1, null),
	       (9, null, 1)
  ),
  t as (
    select 
      row_number() over (order by a) as rn, a, b, c
    from d
  )

-- Interesting bit here:
select
  a, b, c,
  case 
    when b is not null then 
      min(case when c is not null then rn end) 
        over (order by rn 
          rows between 1 following and unbounded following) 
      - rn + 1 
  end cnt
from t;

Let’s look at this new cnt column, step by step. First, the easy part:

The CASE expression

There’s a case expression that goes like this:

case 
  when b is not null then 
    ...
end cnt

All this does is check if b is not null and if this is true, then calculate something. Remember, b = start, so we’re putting a calculated value in the row where an event started. That was the requirement.

The new window function

So, what do we calculate there?

min(...) over (...) ...

A window function that finds the minimum value over a window of data. That minimum value is RN2, the next row number value where the event ends. So, what do we put in the min() function to get that value?

min(case when c is not null then rn end) 
over (...) 
...

Another case expression. When c is not null, we know the event has ended (remember, c = end). And if the event has ended, we want to find that row’s rn value. So that would be the minimum value of that case expression for all the rows after the row that started the event. Visually:

rn |a |b |c | case expr | minimum "next" value
---|--|--|--|-----------|---------------------
1  |1 |1 |  | null      | 4
2  |2 |  |  | null      | null
3  |3 |  |  | null      | null
4  |4 |  |1 | 4         | null

5  |6 |  |  | null      | null
6  |7 |  |  | null      | null

7  |8 |1 |  | null      | 8
8  |9 |  |1 | 8         | null

Now, we only need to specify that OVER() clause to form a window of all rows that follow the current row.

min(case when c is not null then rn end) 
  over (order by rn 
    rows between 1 following and unbounded following) 
...

The window is ordered by rn and it starts 1 row after the current row (1 following) and ends in infinity (unbounded following).

The only thing left to do now is do the arithmetic:

min(case when c is not null then rn end) 
  over (order by rn 
    rows between 1 following and unbounded following) 
- rn + 1

This is a verbose way of calculating RN2 - RN1 + 1, and we’re doing that only in those columns that start an event. The result of the complete query above is now:

a |b |c |cnt |
--|--|--|----|
1 |1 |  |4   |
2 |  |  |    |
3 |  |  |    |
4 |  |1 |    |
6 |  |  |    |
7 |  |  |    |
8 |1 |  |2   |
9 |  |1 |    |

Read more about window functions on this blog.

How to Write Multiset Conditions With Oracle VARRAY Types

Oracle is one of the few databases that implements the SQL standard ORDBMS extensions, which essentially allow for nested collections. Other databases that have these features to some extent are CUBRID, Informix, PostgreSQL.

Oracle has two types of nested collections:

-- Nested tables
CREATE TYPE t1 AS TABLE OF VARCHAR2(10);
/

-- Varrays
CREATE TYPE t2 AS VARRAY(10) OF VARCHAR2(10);
/

The main difference at first is that a nested table can be of arbitrary size, whereas a varray has a fixed maximum size. Other than that, they behave in similar ways.

When storing a nested collection in a table, there is another difference. Varrays can be inlined into the table just like any other data type, whereas nested tables have to be accompanied by an additional storage clause:

CREATE TABLE t (
  id NUMBER(10),
  t1 t1,
  t2 t2
)
NESTED TABLE t1 STORE AS t1_nt;

This is a minor hassle in terms of DDL. The runtime implications are more significant.

Multiset Conditions

The most important difference is the fact that all the useful multiset conditions are not available with varrays. For instance, consider running these statements:

INSERT INTO t VALUES (1, NULL, NULL);
INSERT INTO t VALUES (2, t1(), t2());
INSERT INTO t VALUES (
  3, 
  t1('abc', 'xyz', 'zzz'), 
  t2('abc', 'xyz', 'zzz')
);
INSERT INTO t VALUES (
  4, 
  t1('dup', 'dup', 'dup'), 
  t2('dup', 'dup', 'dup')
);

SELECT * FROM t WHERE 'abc' MEMBER OF t1;
SELECT * FROM t WHERE 'abc' MEMBER OF t2;

The result of these queries is:

ID  T1                        T2
-----------------------------------------------------
3   T1('abc', 'xyz', 'zzz')   T2('abc', 'xyz', 'zzz')

ORA-00932: inconsistent datatypes: expected UDT got TEST.T2

Bummer. The documentation is a bit unclear about this. It reads (emphasis mine):

he return value is TRUE if expr is equal to a member of the specified nested table or varray. The return value is NULL if expr is null or if the nested table is empty.

There is some explicit mention of varrays supporting these operations, but in most of the documentation, varrays are not mentioned. So, how can we write such operations with varrays? Here’s an list of translations of the nested table operator to the equivalent SQL expression for use with varrays.

These are the multiset conditions:

IS A SET condition

In SQL, everything is a (partially ordered) multiset by default. Sometimes, however, we want to work with sets, i.e. a special type of multiset that has no duplicate values. We can easily check whether nested tables are sets (or whether they aren’t):

-- Nested table version
SELECT * FROM t WHERE t1 IS A SET;

-- Varray version
SELECT * 
FROM t 
WHERE t2 IS NOT NULL
AND (SELECT count(*) FROM TABLE(t2)) 
  = (SELECT count(DISTINCT column_value) FROM TABLE(t2));

The IS A SET operation yields UNKNOWN if the nested table is NULL, so we have to take that into account as well. If it isn’t NULL, we can count the total values in the varray and compare that with the total distinct values in the varray.

The result is:

ID  T1                        T2
-----------------------------------------------------
2   T1()                      T2()
3   T1('abc', 'xyz', 'zzz')   T2('abc', 'xyz', 'zzz')

IS EMPTY condition

This predicate needs no explanation. It can be written as such:

-- Nested table version
SELECT * FROM t WHERE t1 IS EMPTY;

-- Varray version
SELECT * 
FROM t 
WHERE t2 IS NOT NULL
AND NOT EXISTS (
  SELECT * FROM TABLE (t2)
);

The result being:

ID  T1                 T2
---------------------------------------
2   T1()               T2()

MEMBER condition

This handy predicate can help check if a specific value is contained in a nested collection. It can be written as such:

-- Nested table version
SELECT * FROM t WHERE 'abc' MEMBER OF t1;

-- Varray version
SELECT *
FROM t
WHERE t2 IS NOT NULL
AND EXISTS (
  SELECT 1 FROM TABLE(t2) WHERE column_value = 'abc'
);

Yielding:

ID  T1                        T2
-----------------------------------------------------
3   T1('abc', 'xyz', 'zzz')   T2('abc', 'xyz', 'zzz')

SUBMULTISET condition

Just like the previous MEMBER condition, this predicate can help check if specific values (more than one) are contained in a nested collection. This is a bit more tricky than the previous emulations. The MEMBER condition works the same way for sets and multisets, as we’re checking if exactly one element is contained in the (multi)set.

When working with multisets, duplicates are allowed, and in the case of the SUBMULTISET operation, the following can be observed:

-- Equal multisets
t1() SUBMULTISET OF t1();
t1('a', 'a') SUBMULTISET OF t1('a', 'a');

-- Subsets
t1('a') SUBMULTISET OF t1('a', 'a');

-- But this is not true
t1('a', 'a') SUBMULTISET OF t1('a');

When we omit the fact that nested collections can be multisets and pretend we’re working with sets only, then the emulation of the SUBMULTISET operator is relatively easy:

-- Nested table version
SELECT * FROM t WHERE t1('abc', 'xyz') SUBMULTISET OF t1;

-- Varray version
SELECT *
FROM t
WHERE t2 IS NOT NULL
AND EXISTS (
  SELECT 1 FROM TABLE(t2) 
  WHERE column_value = 'abc'
  INTERSECT
  SELECT 1 FROM TABLE(t2) 
  WHERE column_value = 'xyz'
);

Yielding, once more:

ID  T1                        T2
-----------------------------------------------------
3   T1('abc', 'xyz', 'zzz')   T2('abc', 'xyz', 'zzz')

If we’re really working with multisets, things are a bit more tricky:

-- Nested table version
SELECT * FROM t WHERE t1('dup', 'dup') SUBMULTISET OF t1;

-- Varray version
SELECT *
FROM t
WHERE t2 IS NOT NULL
AND NOT EXISTS (
  SELECT column_value, count(*)
  FROM TABLE (t2('dup', 'dup')) x
  GROUP BY column_value
  HAVING count(*) > (
    SELECT count(*)
    FROM TABLE (t2) y
    WHERE y.column_value = x.column_value
  )
);

Yielding:

ID  T1                        T2
-----------------------------------------------------
4   T1('dup', 'dup', 'dup')   T2('dup', 'dup', 'dup')

How does it work? In the NOT EXISTS correlated subquery, we’re counting the number of duplicate values in the potential SUBMULTISET, effectively turning that SUBMULTISET into a SET using the GROUP BY operation.

We’re then comparing that count value from the left operand with the corresponding count value from the right operand. If there is no value in the left operand whose number of occurrences is bigger than the number of occurrences of that value in the right operand, then the whole left operand is a SUBMULTISET of the right operand.

Cool, eh? We’ll talk about performance another time :-)

MULTISET operators

Also very interesting, the multiset operators:

  • MULTISET EXCEPT [ ALL | DISTINCT ]
  • MULTISET INTERSECT [ ALL | DISTINCT ]
  • MULTISET UNION [ ALL | DISTINCT ]

Notice how there are some differences to the ordinary set operators that can be used in SELECT statements. In particular:

  • EXCEPT is used as defined in the standard, not MINUS
  • ALL is supported on all three operators, not just on UNION
  • ALL is the default, not DISTINCT

How can we work with these operators? Consider these queries:

SELECT id, t1 MULTISET EXCEPT t1('aaa', 'abc', 'dup', 'dup') r 
FROM t;

SELECT id, t1 MULTISET EXCEPT ALL t1('aaa', 'abc', 'dup', 'dup') r 
FROM t;

Both yielding:

ID   R
---------------------
1    (null)
2    T1()
3    T1('xyz', 'zzz')
4    T1('dup')

With this operator, we’re removing each element of the right operand once from the left operand:

  • 'aaa' does not appear in the left operand, so nothing happens
  • 'abc' appears on row with ID = 3 and we remove it
  • 'dup' appears on row with ID = 4, 3 times, and we remove it twice, leaving one value

Conversely, when adding DISTINCT, we’ll get:

SELECT t1 MULTISET EXCEPT DISTINCT t1('aaa', 'abc', 'dup') FROM t;

Yielding:

ID   R
---------------------
1    (null)
2    T1()
3    T1('xyz', 'zzz')
4    T1('')

The only difference is on row with ID = 4, where all 'dup' values were removed, regardless how many there were on either side of the MULTISET EXCEPT DISTINCT operator.

How to emulate this for varrays?

DISTINCT version

This is a bit easier, because we can now use MINUS:

-- Nested table version
SELECT t1 MULTISET EXCEPT DISTINCT t1('aaa', 'abc', 'dup', 'dup') 
FROM t;

-- Varray version
SELECT 
  id,
  CASE 
    WHEN t2 IS NULL THEN NULL 
    ELSE 
      CAST(MULTISET(
        SELECT column_value
        FROM TABLE (t2)
        MINUS
        SELECT column_value
        FROM TABLE (t2('aaa', 'abc', 'dup', 'dup'))
      ) AS t2)
  END r
FROM t;

Luckily, we can still cast a structural MULTISET type that we can obtain using the MULTISET() operator to a varray type. This greatly simplifies the task.

ALL version

If we want the MULTISET EXCEPT or MULTISET EXCEPT ALL semantics, things are trickier. Here’s a solution that resorts to using window functions, in order to turn a MULTISET back into a SET:

-- Nested table version
SELECT t1 MULTISET EXCEPT ALL t1('aaa', 'abc', 'dup', 'dup') 
FROM t;

-- Varray version
SELECT 
  id,
  CASE 
    WHEN t2 IS NULL THEN NULL 
    ELSE 
      CAST(MULTISET(
        SELECT column_value
        FROM (
          SELECT 
            column_value,
            row_number() OVER (
              PARTITION BY column_value 
              ORDER BY column_value) rn
          FROM TABLE (t2)
          MINUS
          SELECT 
            column_value, 
            row_number() OVER (
              PARTITION BY column_value 
              ORDER BY column_value) rn
          FROM TABLE (t2('aaa', 'abc', 'dup', 'dup'))
        )
      ) AS t2)
  END r
FROM t;

How does this work? Ideally, we’ll look at what this ROW_NUMBER() evaluates to on each row. For this, we use OUTER APPLY:

SELECT id, t2, column_value, rn
FROM t
OUTER APPLY (
  SELECT 
    column_value,
    row_number() OVER (
      PARTITION BY column_value
      ORDER BY column_value) rn
  FROM TABLE (t2)
);

The result is:

ID      T2                       COLUMN_VALUE  RN
-----------------------------------------------------
1       (null)                   (null)        (null)
2       T2()                     (null)        (null)
3       T2('abc', 'xyz', 'zzz')  abc           1
3       T2('abc', 'xyz', 'zzz')  xyz           1
3       T2('abc', 'xyz', 'zzz')  zzz           1
4       T2('dup', 'dup', 'dup')  dup           1
4       T2('dup', 'dup', 'dup')  dup           2
4       T2('dup', 'dup', 'dup')  dup           3

As can be seen, each duplicate value gets assigned a unique row number due to the nature of how ROW_NUMBER() works (this property can be very useful for solving the gaps-and-islands-problem. See trick #4).

Now that we turned our (COLUMN_VALUE) multiset into a (COLUMN_VALUE, RN) set (without duplicates), we can use MINUS again.

MULTISET INTERSECT and MULTISET UNION

MULTISET INTERSECT works exactly the same way as MULTISET EXCEPT, with the same window function based emulation in the MULTISET INTERSECT ALL case. MULTISET UNION is simpler, because Oracle knows UNION ALL, so we do not need to resort to such trickery.

Conclusion

Nested collections are a very powerful tool in Oracle SQL. Oracle knows two types of nested collections:

  • Nested tables
  • Varrays

Nested tables are trickier to maintain as you have to think of their storage more explicitly. Varrays can just be embedded into ordinary tables like any other column. But there’s a price to pay for using varrays. Oracle regrettably doesn’t support all of the above very useful multiset conditions and multiset operators.

Luckily, when you encounter a situation where you have varrays and cannot change that, you can still emulate each of the operators using more traditional SQL.

How SQL DISTINCT and ORDER BY are Related

One of the things that confuse SQL users all the time is how DISTINCT and ORDER BY are related in a SQL query.

The Basics

Running some queries against the Sakila database, most people quickly understand:

SELECT DISTINCT length FROM film

This returns results in an arbitrary order, because the database can (and might apply hashing rather than ordering to remove duplicates):

length |
-------|
129    |
106    |
120    |
171    |
138    |
80     |
...

Most people also understand:

SELECT length FROM film ORDER BY length

This will give us duplicates, but in order:

length |
-------|
46     |
46     |
46     |
46     |
46     |
47     |
47     |
47     |
47     |
47     |
47     |
47     |
48     |
...

And, of course, we can combine the two:

SELECT DISTINCT length FROM film ORDER BY length

Resulting in…

length |
-------|
46     |
47     |
48     |
49     |
50     |
51     |
52     |
53     |
54     |
55     |
56     |
...

Then why doesn’t this work?

Maybe somewhat intuitively, we may want to order the lengths differently, e.g. by title:

SELECT DISTINCT length FROM film ORDER BY title

Most databases fail this query with an exception like Oracle’s:

ORA-01791: not a SELECTed expression

At first sight, this seems funny, because this works after all:

SELECT length FROM film ORDER BY title

Yielding:

length |
-------|
86     |
48     |
50     |
117    |
130    |
...

We could add the title to illustrate the ordering

length |title                       |
-------|----------------------------|
86     |ACADEMY DINOSAUR            |
48     |ACE GOLDFINGER              |
50     |ADAPTATION HOLES            |
117    |AFFAIR PREJUDICE            |
130    |AFRICAN EGG                 |

So, how are these different?

We have to rewind and check out the logical order of SQL operations (as opposed to the syntactic order). And always remember, this is the logical order, not the actual order executed by the optimiser.

When we write something like this:

SELECT DISTINCT length FROM film ORDER BY length

The logical order of operations is:

  • FROM clause, loading the FILM table
  • SELECT clause, projecting the LENGTH column
  • DISTINCT clause, removing distinct tuples (with projected LENGTH columns)
  • ORDER BY clause, ordering by the LENGTH column

If we look at this step by step, we have:

Step 1: SELECT * FROM film

The intermediary data set is something like:

film_id |title                       |length | ...
--------|----------------------------|-------| ...
1       |ACADEMY DINOSAUR            |86     | ...
2       |ACE GOLDFINGER              |48     | ...
3       |ADAPTATION HOLES            |50     | ...
4       |AFFAIR PREJUDICE            |117    | ...
5       |AFRICAN EGG                 |130    | ...
...     |...                         |...    | ...

Step 2: SELECT length …

The intermediary data set is something like:

length |
-------|
86     |
48     |
50     |
117    |
130    |
...
86     | <-- duplicate

Step 3: SELECT DISTINCT length …

Now we’re getting a new random order (due to hashing) and no duplicates anymore:

length |
-------|
129    |
106    |
120    |
171    |
138    |
...

Step 4: … ORDER BY length

And we’re getting:

length |
-------|
46     |
47     |
48     |
49     |
50     |
...

It seems obvious.

So why did this work?

Remember, this query worked:

SELECT length FROM film ORDER BY title

Even if after projecting the LENGTH column, it seems as though it is no longer available for sorting, it really is, according to the SQL standard and to common sense. There is a concept called extended sort key columns in the SQL standard, which means the above query has a slightly different order of operations (apart from the fact that there is no DISTINCT operation):

  • FROM clause, loading the FILM table
  • SELECT clause, projecting the LENGTH column from the select list and the TITLE from the extended sort key columns
  • ORDER BY clause, ordering by the TITLE column
  • SELECT clause (implicit), projecting only the LENGTH column, discarding the TITLE column

Again, this is what happens logically. Database optimisers may choose other ways to implement this. By example:

Step 1: SELECT * FROM film

Same as before

film_id |title                       |length | ...
--------|----------------------------|-------| ...
1       |ACADEMY DINOSAUR            |86     | ...
2       |ACE GOLDFINGER              |48     | ...
3       |ADAPTATION HOLES            |50     | ...
4       |AFFAIR PREJUDICE            |117    | ...
5       |AFRICAN EGG                 |130    | ...
...     |...                         |...    | ...

Step 2: SELECT length, title…

We get that synthetic extended sort key column TITLE along with the LENGTH column that we requested

length |title                       |
-------|----------------------------|
86     |ACADEMY DINOSAUR            |
114    |ALABAMA DEVIL               |
50     |ADAPTATION HOLES            |
117    |AFFAIR PREJUDICE            |
168    |ANTITRUST TOMATOES          |
...

Step 3: … ORDER BY title

… we can now order by that column

length |title                       |
-------|----------------------------|
86     |ACADEMY DINOSAUR            |
48     |ACE GOLDFINGER              |
50     |ADAPTATION HOLES            |
117    |AFFAIR PREJUDICE            |
130    |AFRICAN EGG                 |
...

Step 4: SELECT length

… and finally discard it, because we never wanted it

length |
-------|
86     |
48     |
50     |
117    |
130    |

So why can’t we use DISTINCT?

If we try to run this:

SELECT DISTINCT length FROM film ORDER BY title

We would get an additional DISTINCT operation in our logical set of operations:

  • FROM clause, loading the FILM table
  • SELECT clause, projecting the LENGTH column from the select list and the TITLE from the extended sort key columns
  • DISTINCT clause, removing duplicate (LENGTH, TITLE) values… Ooops
  • ORDER BY clause, ordering by the TITLE column
  • SELECT clause (implicit), projecting only the LENGTH column, discarding the TITLE column

The problem is, since we have synthetically added the extended sort key column TITLE to the projection in order to be able to ORDER BY it, DISTINCT wouldn’t have the same semantics anymore as can be seen here:

SELECT count(*)
FROM (
  SELECT DISTINCT length FROM film
) t;

SELECT count(*)
FROM (
  SELECT DISTINCT length, title FROM film
) t;

Yielding

140
1000

All titles are distinct. There is no way this query can be executed reasonably. Either DISTINCT doesn’t work (because the added extended sort key column changes its semantics), or ORDER BY doesn’t work (because after DISTINCT we can no longer access the extended sort key column).

A more constructed example. T contains this data:

CREATE TABLE t (a INT, b INT);
INSERT INTO t VALUES (1, 1);
INSERT INTO t VALUES (1, 2);
INSERT INTO t VALUES (2, 3);
INSERT INTO t VALUES (1, 4);
INSERT INTO t VALUES (2, 5);
A   B
-----
1   1
1   2
2   3
1   4
2   5

What would this query produce?

SELECT DISTINCT a FROM t ORDER BY b;

Clearly, we should only get 2 rows with values 1, 2, because of DISTINCT a:

A 
--
1
2

Now, how do we order these by B? There are 3 values of B associated A = 1 and 2 values of B associated with A = 2:

A   B
------------------
1   Any of 1, 2, 4
2   Any of 3, 5

Should we get 1, 2 or 2, 1 as a result? Impossible to tell.

But there are some exceptions

The way I read the SQL standard, the following exception should be possible. The SQL standard ISO/IEC 9075-2:2016(E), 7.17 <query expression>, Syntax Rules 28) d) i) 6) references the “Left normal form derivation”. But I may be reading this wrong, see also a discussion on the PostgreSQL mailing list:
https://www.postgresql.org/message-id/20030819103859.L69440-100000%40megazone.bigpanda.com

In any case, it still makes sense to me. For instance, we can form expressions on the columns in the select list. This is totally fine in MySQL (strict mode) and Oracle:

SELECT DISTINCT length 
FROM film 
ORDER BY mod(length, 10), length;

It will produce

length |
-------|
50     |
60     |
70     |
80     |
90     |
100    |
110    |
120    |
130    |
140    |
150    |
160    |
170    |
180    |
51     |
61     |
71     |

PostgreSQL doesn’t allow this because the expression MOD(LENGTH, 10) is not in the select list. How to interpret this? We’re looking again at the order of SQL operations:

  • FROM clause, loading the FILM table
  • SELECT clause, projecting the LENGTH column from the select list. MOD(LENGTH, 10) does not have to be put in the extended sort key columns, because it can be fully derived from the select list.
  • DISTINCT clause, removing duplicate LENGTH values … all fine, because we don’t have the verboten extended sort key columns
  • ORDER BY clause, ordering by the mod(LENGTH, 10), LENGTH columns. Totally fine, because we can derive all of these order by expressions from expressions in the select list

Makes sense, right?

Back to our constructed table T:

A   B
-----
1   1
1   2
2   3
1   4
2   5

We are allowed to write:

SELECT DISTINCT a, b FROM t ORDER BY a - b;

We would get:

A   B
-----
1   4
2   5
2   3
1   2
1   1

Again, the order by expressions can be derived completely from the select list. This also works in Oracle:

SELECT DISTINCT a - b FROM t ORDER BY abs(a - b);

The select list contains a column A - B, so we can derive any ORDER BY expression from it. But these don’t work:

SELECT DISTINCT a - b FROM t ORDER BY a;
SELECT DISTINCT a - b FROM t ORDER BY b;
SELECT DISTINCT a - b FROM t ORDER BY b - a;

It is easy to build an intuition for why these don’t work. Clearly, the data set we want is:

A - B  A             B             B - A
------------------------------------------
-3     Any of 1, 2   Any of 4, 5   3
-1     Any of 2, 1   Any of 3, 2   1
 0     Any of 1      Any of 1      0

Now, how are we supposed to order these by A, B or B - A? It looks as though we should be able to sort by B - A in this case. We could derive a complicated transformation of expressions that can be reasonably transformed into each other, such as A - B = -(B - A), but this simply isn’t practical. The expression in the projection is A - B, and that’s the only expression we can re-use in the ORDER BY. For example, we could even do this in Oracle:

SELECT DISTINCT a - b FROM t ORDER BY abs((a - b) + (a - b));

Or start using aliases:

SELECT DISTINCT a - b AS x FROM t ORDER BY abs(x + x);

PostgreSQL DISTINCT ON

PostgreSQL has a nice feature for when you want to order by something from within a group of non-distinct values. Remember how this wasn’t possible?

SELECT DISTINCT length FROM film ORDER BY title

Well, this is:

SELECT DISTINCT ON (title) length FROM film ORDER BY title

And we’re getting now:

length |
-------|
86     |
48     |
50     |
117    |
130    |
169    |
62     |
...

What we’re essentially doing is, we take all distinct lengths, and for each group of identical lengths, we’re taking the top title as a criteria to order by. In a way, this is syntax sugar for this:

SELECT length
FROM (
  SELECT length, MIN(title) title
  FROM film
  GROUP BY length
) t
ORDER BY title

Which is what most people really want, when they ORDER BY something they cannot really order by.

Conclusion

The SQL language is quirky. This is mostly because the syntactical order of operations doesn’t match the logical order of operations. The syntax is meant to be human readable (remember Structured English Query Language?) but when reasoning about a SQL statement, we would often like to directly write down the logical order of operations.

In this article, we haven’t even touched the implications of adding

  • GROUP BY
  • TOP / LIMIT / FETCH
  • UNION

Which add more fun rules to what’s possible and what isn’t. Our previous article on the true logical order of SQL operations explains this completely.

Need more explanation? Check this out.