Imagine you have a configuration table like the following:
CREATE TABLE rule (
name VARCHAR2(50) NOT NULL PRIMARY KEY,
enabled NUMBER(1) DEFAULT 1 NOT NULL CHECK (enabled IN (0,1)),
priority NUMBER(10) DEFAULT 0 NOT NULL,
flag1 NUMBER(3) DEFAULT 0 NOT NULL,
flag2 NUMBER(3) DEFAULT 0 NOT NULL,
flag3 NUMBER(3) DEFAULT 0 NOT NULL,
flag4 NUMBER(3) DEFAULT 0 NOT NULL,
flag5 NUMBER(3) DEFAULT 0 NOT NULL
);
It specifies a set of rules that
- Can be enabled / disabled
- Can be given a priority among themselves
- Include a set of flags which correspond to the thing you want to configure (e.g. some check to execute)
- Those flags can be ordered as well
So, given the following data:
INSERT INTO rule (name, priority, flag1, flag5)
VALUES ('RULE 1', 1, 1, 2);
INSERT INTO rule (name, priority, flag2, flag5)
VALUES ('RULE 2', 2, 2, 1);
INSERT INTO rule (name, priority, flag3, flag4, flag5)
VALUES ('RULE 3', 3, 3, 1, 2);
INSERT INTO rule (name, priority, flag3)
VALUES ('RULE 4', 4, 1);
SELECT * FROM rule;
We’ll get our configuration “spreadsheet”:
NAME ENABLED PRIORITY FLAG1 FLAG2 FLAG3 FLAG4 FLAG5
------------------------------------------------------------
RULE 1 1 1 1 0 0 0 2
RULE 2 1 2 0 2 0 0 1
RULE 3 1 3 0 0 3 1 2
RULE 4 1 4 0 0 1 0 0
This form is really useful to
edit the configuration. If we want to activate
FLAG2
in
RULE 1
, we just go to that cell in some SQL tool like Oracle SQL Developer, and change the value.
But reading the configuration is a bit different.
FLAG1
through
FLAG5
are not nicely normalised. How to read the data as though it were normalised?
Using UNPIVOT
In Oracle and SQL Server, we can use
UNPIVOT
for this use case. I’m using Oracle syntax in this blog post. SQL Server’s is just slightly different. Consider the following query:
SELECT name, flag, value
FROM rule
UNPIVOT (
value FOR flag IN (
flag1,
flag2,
flag3,
flag4,
flag5
)
)
WHERE enabled = 1
AND value > 0
ORDER BY priority, value;
This will result in the following result set:
NAME FLAG VALUE
---------------------
RULE 1 FLAG1 1
RULE 1 FLAG5 2
RULE 2 FLAG5 1
RULE 2 FLAG2 2
RULE 3 FLAG4 1
RULE 3 FLAG5 2
RULE 3 FLAG3 3
RULE 4 FLAG3 1
In this representation, the rules are ordered by priority, and the flags are ordered by their respective value within a rule. The flags that are not turned on (value
0
) are simply omitted. This form is much easier to traverse procedurally, when “consuming” the configuration.
How does it work?
In principle,
UNPIVOT
is just syntax sugar for a bunch of
UNION ALL
subqueries. We could have written our query like this, instead:
SELECT name, flag, value
FROM (
SELECT rule.*, 'FLAG1' AS flag, FLAG1 AS value FROM rule
UNION ALL
SELECT rule.*, 'FLAG2' AS flag, FLAG2 AS value FROM rule
UNION ALL
SELECT rule.*, 'FLAG3' AS flag, FLAG3 AS value FROM rule
UNION ALL
SELECT rule.*, 'FLAG4' AS flag, FLAG4 AS value FROM rule
UNION ALL
SELECT rule.*, 'FLAG5' AS flag, FLAG5 AS value FROM rule
) rule
WHERE enabled = 1
AND value > 0
ORDER BY priority, value;
Which is decidedly more code. It’s also more work for the database. The execution plans are different (I’m using Oracle 12.2.0.1.0):
UNPIVOT version – single table access
---------------------------------------------
| Id | Operation | Name | Rows |
---------------------------------------------
| 0 | SELECT STATEMENT | | |
| 1 | SORT ORDER BY | | 5 |
|* 2 | VIEW | | 5 |
| 3 | UNPIVOT | | |
|* 4 | TABLE ACCESS FULL| RULE | 1 |
---------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(("unpivot_view_005"."VALUE">0 AND
"unpivot_view_005"."ENABLED"=1))
4 - filter("RULE"."ENABLED"=1)
UNION ALL version – multi table access
---------------------------------------------
| Id | Operation | Name | Rows |
---------------------------------------------
| 0 | SELECT STATEMENT | | |
| 1 | SORT ORDER BY | | 8 |
| 2 | VIEW | | 8 |
| 3 | UNION-ALL | | |
|* 4 | TABLE ACCESS FULL| RULE | 1 |
|* 5 | TABLE ACCESS FULL| RULE | 1 |
|* 6 | TABLE ACCESS FULL| RULE | 2 |
|* 7 | TABLE ACCESS FULL| RULE | 1 |
|* 8 | TABLE ACCESS FULL| RULE | 3 |
---------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - filter(("RULE"."ENABLED"=1 AND "FLAG1">0))
5 - filter(("RULE"."ENABLED"=1 AND "FLAG2">0))
6 - filter(("RULE"."ENABLED"=1 AND "FLAG3">0))
7 - filter(("RULE"."ENABLED"=1 AND "FLAG4">0))
8 - filter(("RULE"."ENABLED"=1 AND "FLAG5">0))
We can also measure the time it takes to execute these queries thousands of times.
The following shows resulting times relative to the fastest execution (1):
Run 1, Statement 1 : 1.155
Run 1, Statement 2 : 1.88056
Run 2, Statement 1 : 1.04333
Run 2, Statement 2 : 1.95148
Run 3, Statement 1 : 1.02185
Run 3, Statement 2 : 1.86074
Run 4, Statement 1 : 1
Run 4, Statement 2 : 1.85241
Run 5, Statement 1 : 1.0263
Run 5, Statement 2 : 1.82944
The
UNION ALL
version is consistently about 2x slower on this very small data set. This is significant in the use case presented here, as a configuration table is probably read many times per day.
Knowing when a rule starts and when it ends
The real world use case that is behind this blog post also needed to know when a rule started and when it ended. I.e., which flag entry was the first and which was the last of the rule. This was easy in the non-normalised representation where each rule was a single row.
In the normalised version, we can use
LEAD()
and LAG()
.
Using this query:
SELECT
CASE WHEN lag(name, 1, 'NULL')
OVER (ORDER BY priority, value) != name
THEN 1 ELSE 0 END rule_begin,
CASE WHEN lead(name, 1, 'NULL')
OVER (ORDER BY priority, value) != name
THEN 1 ELSE 0 END rule_end,
name, flag, value
FROM rule
UNPIVOT (
value FOR flag IN (
flag1,
flag2,
flag3,
flag4,
flag5
)
)
WHERE enabled = 1
AND value > 0
ORDER BY priority, value;
We’re now getting (with some visual emphasis):
RULE_BEGIN RULE_END NAME FLAG VALUE
-------------------------------------------
1 0 RULE 1 FLAG1 1
0 1 RULE 1 FLAG5 2
1 0 RULE 2 FLAG5 1
0 1 RULE 2 FLAG2 2
1 0 RULE 3 FLAG4 1
0 0 RULE 3 FLAG5 2
0 1 RULE 3 FLAG3 3
1 1 RULE 4 FLAG3 1
LEAD()
looks ahead one row to see if the rule name there is different from the rule name on the current row.
LAG()
looks behind one row to see if the rule name there is different from the rule name on the current row.
That’s it – very simple. The
window functions part of this example is part of my
10 SQL Tricks talk, which I highly recommend you watch.
Like this:
Like Loading...
Another approach that will have single TABLE ACCESS FULL is `INSERT ALL`. It could be handy when you need to materialize unpivot result:
CREATE TABLE unpiv(name VARCHAR2(100), FLAG VARCHAR2(100), VALUE INT);
INSERT ALL
WHEN enabled=1 AND flag1>0 THEN
INTO unpiv(name, flag, value) VALUES (name,’Flag1′, flag1)
WHEN enabled=1 AND flag2>0 THEN
INTO unpiv(name, flag, value) VALUES (name,’Flag2′, flag2)
WHEN enabled=1 AND flag3>0 THEN
INTO unpiv(name, flag, value) VALUES (name,’Flag3′, flag3)
WHEN enabled=1 AND flag4>0 THEN
INTO unpiv(name, flag, value) VALUES (name,’Flag4′, flag4)
WHEN enabled=1 AND flag5>0 THEN
INTO unpiv(name, flag, value) VALUES (name,’Flag5′, flag5)
SELECT *
FROM rule;
SELECT * FROM unpiv;
But that creates a new table that is out of sync with the old one. Doesn’t seem desireable in the context of the problem described in this blog post.
Yes, I agree that for ad-hoc unpivot is correct solution.
I often see `CREATE TABLE … AS SELECT … UNPIVOT` that is why I wanted to share my thoughts and extend a scope a bit :)
Oh interesting, in what context do you see that? I’m curious to read a blog post about it. Indeed, in those situations, INSERT ALL could be a very interesting alternative!
It could be simplified further:
INSERT ALL
WHEN flag1>0 THEN
INTO unpiv(name, flag, value) VALUES (name,’Flag1′, flag1)
WHEN flag2>0 THEN
INTO unpiv(name, flag, value) VALUES (name,’Flag2′, flag2)
WHEN flag3>0 THEN
INTO unpiv(name, flag, value) VALUES (name,’Flag3′, flag3)
WHEN flag4>0 THEN
INTO unpiv(name, flag, value) VALUES (name,’Flag4′, flag4)
WHEN flag5>0 THEN
INTO unpiv(name, flag, value) VALUES (name,’Flag5′, flag5)
SELECT *
FROM rule
WHERE enabled=1;