A large-ish customer in banking (largest tables on that particular system: ~1 billion rows) once decided to separate the OLTP database from the “log database” in order to better use resources and prevent contention on some tables, as the append-only log database is used heavily for analytic querying of all sorts. That seems to make perfect sense. Except that sometimes, joins need to be done between “main database” and “log database” tables. This is when things get really hard to tune in Oracle – and probably in other databases too.
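Just to illustrate the kind of query that becomes hard to tune in such a setup (the table names, columns and the LOGDB link below are hypothetical, not the customer’s actual schema), think of a distributed join like this:
-- Hypothetical cross-database join over a database link:
-- ACCOUNTS lives on the OLTP database, AUDIT_LOG on the "log database"
SELECT a.account_no, l.logged_at, l.message
FROM accounts a
JOIN audit_log@logdb l ON l.account_id = a.id
WHERE a.account_no = '12345'
AND l.logged_at >= DATE '2017-01-01';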
In this article, however, I’d like to focus on a much simpler example. One that seems to cause no trouble to the optimiser because all joined tables are from the “log database” only. Let’s use the following setup:
-- This is the database link
CREATE PUBLIC DATABASE LINK LOOPBACK
CONNECT TO TEST IDENTIFIED BY TEST
USING 'ORCLCDB';
-- Just making sure we get all statistics in execution plans
ALTER SESSION SET statistics_level = ALL;
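The plans shown below include the Starts, E-Rows and A-Rows columns. One way to display such plans for a previously executed statement (a sketch; the SQL_ID placeholder has to be replaced with the actual value from V$SQL, and you need access to the cursor cache views):
-- Display actual execution statistics of a cached cursor
SELECT *
FROM TABLE(dbms_xplan.display_cursor(
  sql_id => '...', -- placeholder: the statement's SQL_ID
  format => 'ALLSTATS LAST'
));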
And then, create this schema:
CREATE TABLE t (
a INT NOT NULL,
b INT NOT NULL,
CONSTRAINT pk_t PRIMARY KEY (a)
);
CREATE TABLE u (
a INT NOT NULL,
b INT NOT NULL,
CONSTRAINT pk_u PRIMARY KEY (a)
);
INSERT INTO t
SELECT
level,
level
FROM dual
CONNECT BY level <= 500000;
INSERT INTO u
SELECT
level,
level
FROM dual
CONNECT BY level <= 500000;
CREATE INDEX i_t ON t(b);
ALTER TABLE u ADD CONSTRAINT fk_u FOREIGN KEY (a) REFERENCES t;
EXEC dbms_stats.gather_table_stats('TEST', 'T');
EXEC dbms_stats.gather_table_stats('TEST', 'U');
It’s a really boring emulation of the real schema, and it doesn’t have nearly as many columns / rows. But the essence is:
- There are (at least) two tables
- Both have quite a few rows (that’s important here. I’ll show why, later)
- We’ll use an index for searching rows
- We’ll join by a one-to-many relationship
There may be other setups to reproduce the same issue, of course.
Now, let’s consider the following query (not using the database link yet).
SELECT CASE WHEN EXISTS (
SELECT *
FROM t
JOIN u USING (a)
WHERE t.b BETWEEN 0 AND 1000
) THEN 1 ELSE 0 END
FROM dual
Unfortunately, Oracle doesn’t support boolean types and always requires a FROM clause. Otherwise, we could be writing this more concise version, as in PostgreSQL:
SELECT EXISTS (
SELECT *
FROM t
JOIN u USING (a)
WHERE t.b BETWEEN 0 AND 1000
)
We’re checking for the existence of rows in both tables, given a predicate that runs on the previously created index.
As shown in a previous article, it’s much better to use EXISTS rather than COUNT(*), in pretty much all databases. The algorithm is optimal, because the usage of the EXISTS predicate hints to the optimiser that a SEMI JOIN can be used instead of an INNER JOIN:
--------------------------------------------------------------------------
| Operation | Name | Starts | E-Rows | A-Rows |
--------------------------------------------------------------------------
| SELECT STATEMENT | | 1 | | 1 |
| NESTED LOOPS SEMI | | 1 | 4 | 1 |
| TABLE ACCESS BY INDEX ROWID BATCHED| T | 1 | 1000 | 1 |
| INDEX RANGE SCAN | I_T | 1 | 1000 | 1 |
| INDEX UNIQUE SCAN | PK_U | 1 | 333K| 1 |
| FAST DUAL | | 1 | 1 | 1 |
--------------------------------------------------------------------------
Some observations:
- We get the desired NESTED LOOPS SEMI join.
- Although the optimiser estimates 1000 rows for the index range scan on I_T (and 333K for the unique scan on PK_U), only a single row is actually fetched from each operation (A-Rows = 1), because the semi join can stop as soon as the first match is found.
So, this is optimal (until I learn a new trick, of course).
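For contrast, here’s what the COUNT(*) based existence check would look like (a sketch of the anti-pattern, not part of the original benchmark). The database typically has to find every matching row before the comparison with zero can be evaluated, instead of stopping at the first match:
-- Anti-pattern: counting all matches just to check for existence
SELECT CASE WHEN (
  SELECT COUNT(*)
  FROM t
  JOIN u USING (a)
  WHERE t.b BETWEEN 0 AND 1000
) > 0 THEN 1 ELSE 0 END
FROM dual;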
Let’s bring in database links
Assuming that these two tables are on a remote database, we might naively proceed with writing this query:
SELECT CASE WHEN EXISTS (
SELECT *
FROM t@loopback
JOIN u@loopback USING (a)
WHERE t.b BETWEEN 0 AND 1000
) THEN 1 ELSE 0 END
FROM dual;
So now, we’re selecting from T@LOOPBACK and U@LOOPBACK, but the rest is exactly the same. For the sake of simplicity, I’m running this reproduction on the same instance, hence “LOOPBACK”. The logical impact is the same, though.
------------------------------------------------------
| Operation | Name | Starts | E-Rows | A-Rows |
------------------------------------------------------
| SELECT STATEMENT | | 1 | | 1 |
| REMOTE | | 1 | | 1 |
| FAST DUAL | | 1 | 1 | 1 |
------------------------------------------------------
Interesting. Or rather: not too interesting. Sure, our own database knows the correct estimate: 1 row comes out of the EXISTS() predicate. But the interesting thing happens at the remote database. Let’s look at the plan there. The query being executed on the remote database is this:
SQL_ID 80fczs4r1c9yd, child number 0
-------------------------------------
SELECT 0 FROM "T" "A2","U" "A1" WHERE "A2"."B">=0 AND "A2"."B"<=1000 AND "A2"."A"="A1"."A"
So, the EXISTS() predicate is not propagated to the remote database. Thus, the plan:
Plan hash value: 165433672
--------------------------------------------------------------------------
| Operation | Name | Starts | E-Rows | A-Rows |
--------------------------------------------------------------------------
| SELECT STATEMENT | | 1 | | 1 |
| HASH JOIN | | 1 | 1000 | 1 |
| TABLE ACCESS BY INDEX ROWID BATCHED| T | 1 | 1000 | 1000 |
| INDEX RANGE SCAN | I_T | 1 | 1000 | 1000 |
| INDEX FAST FULL SCAN | PK_U | 1 | 500K| 1 |
--------------------------------------------------------------------------
Oops. Observations:
- We’re now running a hash join (as expected, given the query that the remote database knows of)
- We’re materialising the expected 1000 rows from the predicate on T.B
- But we’re still not fetching all the expected 500,000 rows from the U table, because the database that calls this query aborts as soon as it finds a single row
Huh. Bummer. So while we’re not running into a major catastrophe (materialising all the rows from U), this is still far from optimal. The remote database has no knowledge at all of the fact that we’re going to select only 0 or 1 rows, and that it should thus always run a SEMI JOIN.
You can try adding a /*+FIRST_ROWS(1)*/ hint, but that doesn’t work. It won’t make it to the remote database.
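For illustration, this is what the hinted query might look like (the hint placement is just one possibility; as said, the hint doesn’t reach the remote database, so the remote plan stays the same):
SELECT /*+FIRST_ROWS(1)*/ CASE WHEN EXISTS (
  SELECT *
  FROM t@loopback
  JOIN u@loopback USING (a)
  WHERE t.b BETWEEN 0 AND 1000
) THEN 1 ELSE 0 END
FROM dual;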
Arcane DUAL@LINK to the rescue
This is when I had an idea. The problem might just be the fact that Oracle always needs a FROM clause, even if it doesn’t make any sense here. So what if we use DUAL@LOOPBACK instead of DUAL? The local DUAL table is, technically, a table on our own database, so even though it looks as though the entire query could be run on the remote database, that isn’t actually the case here. With DUAL@LOOPBACK, every table in the statement is remote, so the whole query, including the EXISTS() predicate, can be shipped. So let’s try this:
SELECT CASE WHEN EXISTS (
SELECT *
FROM t@loopback
JOIN u@loopback USING (a)
WHERE t.b BETWEEN 0 AND 1000
) THEN 1 ELSE 0 END
FROM dual@loopback; -- Subtle difference here!
As I had hoped, this subtle change leads to the EXISTS() predicate being sent to the remote database. The query executed on the remote database is now:
SQL_ID 9bz87xw0zc23c, child number 0
-------------------------------------
SELECT CASE WHEN EXISTS (SELECT 0 FROM "T" "A3","U" "A2" WHERE
"A3"."B">=0 AND "A3"."B"<=1000 AND "A3"."A"="A2"."A") THEN 1 ELSE 0 END
FROM "DUAL" "A1"
And the plan, now again including the desired SEMI JOIN:
Plan hash value: 1561559448
--------------------------------------------------------------------------
| Operation | Name | Starts | E-Rows | A-Rows |
--------------------------------------------------------------------------
| SELECT STATEMENT | | 1 | | 1 |
| NESTED LOOPS SEMI | | 1 | 4 | 1 |
| TABLE ACCESS BY INDEX ROWID BATCHED| T | 1 | 1000 | 1 |
| INDEX RANGE SCAN | I_T | 1 | 1000 | 1 |
| INDEX UNIQUE SCAN | PK_U | 1 | 333K| 1 |
| FAST DUAL | | 1 | 1 | 1 |
--------------------------------------------------------------------------
Excellent!
Benchmark time
Plans and estimates are one thing. What ultimately counts for the business is wall clock time. So, let’s try this again using a benchmark:
SET SERVEROUTPUT ON
DECLARE
  v_ts TIMESTAMP WITH TIME ZONE;
  v_repeat CONSTANT NUMBER := 100;
BEGIN
  -- Repeat benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := SYSTIMESTAMP;

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT CASE WHEN EXISTS (
          SELECT *
          FROM t
          JOIN u USING (a)
          WHERE t.b BETWEEN 0 AND 1000
        ) THEN 1 ELSE 0 END
        FROM dual
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    dbms_output.put_line('Run ' || r ||', Statement 1 : '
      || (SYSTIMESTAMP - v_ts));
    v_ts := SYSTIMESTAMP;

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT CASE WHEN EXISTS (
          SELECT *
          FROM t@loopback
          JOIN u@loopback USING (a)
          WHERE t.b BETWEEN 0 AND 1000
        ) THEN 1 ELSE 0 END
        FROM dual
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    dbms_output.put_line('Run ' || r ||', Statement 2 : '
      || (SYSTIMESTAMP - v_ts));
    v_ts := SYSTIMESTAMP;

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT CASE WHEN EXISTS (
          SELECT *
          FROM t@loopback
          JOIN u@loopback USING (a)
          WHERE t.b BETWEEN 0 AND 1000
        ) THEN 1 ELSE 0 END
        FROM dual@loopback
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    dbms_output.put_line('Run ' || r ||', Statement 3 : '
      || (SYSTIMESTAMP - v_ts));
    dbms_output.put_line('');
  END LOOP;
END;
/
And here are the resulting times:
Run 1, Statement 1 : +000000000 00:00:00.008110000
Run 1, Statement 2 : +000000000 00:00:00.213404000
Run 1, Statement 3 : +000000000 00:00:00.043044000
Run 2, Statement 1 : +000000000 00:00:00.003466000
Run 2, Statement 2 : +000000000 00:00:00.198487000
Run 2, Statement 3 : +000000000 00:00:00.042717000
Run 3, Statement 1 : +000000000 00:00:00.003077000
Run 3, Statement 2 : +000000000 00:00:00.191802000
Run 3, Statement 3 : +000000000 00:00:00.048740000
Run 4, Statement 1 : +000000000 00:00:00.005008000
Run 4, Statement 2 : +000000000 00:00:00.192828000
Run 4, Statement 3 : +000000000 00:00:00.043461000
Run 5, Statement 1 : +000000000 00:00:00.002970000
Run 5, Statement 2 : +000000000 00:00:00.190786000
Run 5, Statement 3 : +000000000 00:00:00.043910000
Clearly, not using the database link is always the fastest, roughly by a factor of 10 compared to the DUAL@LOOPBACK solution. But due to the system design, we don’t have this choice. Nonetheless, you can still see that DUAL@LOOPBACK consistently outperforms plain DUAL by another factor of around 5, as it still prevents the HASH JOIN!
Caveat: Small data != Big data
There, I said it. “Big Data”. Before, we had a predicate that matched 1,000 rows in a table of 500,000 rows. Our customer had millions of rows. But what happens if you query small data sets? Let’s reduce the predicate to this:
WHERE t.b BETWEEN 0 AND 10
The benchmark result is now completely different:
Run 1, Statement 1 : +000000000 00:00:00.007093000
Run 1, Statement 2 : +000000000 00:00:00.047539000
Run 1, Statement 3 : +000000000 00:00:00.071546000
Run 2, Statement 1 : +000000000 00:00:00.003023000
Run 2, Statement 2 : +000000000 00:00:00.041259000
Run 2, Statement 3 : +000000000 00:00:00.052132000
Run 3, Statement 1 : +000000000 00:00:00.002767000
Run 3, Statement 2 : +000000000 00:00:00.034190000
Run 3, Statement 3 : +000000000 00:00:00.054023000
Run 4, Statement 1 : +000000000 00:00:00.003468000
Run 4, Statement 2 : +000000000 00:00:00.026141000
Run 4, Statement 3 : +000000000 00:00:00.047415000
Run 5, Statement 1 : +000000000 00:00:00.002818000
Run 5, Statement 2 : +000000000 00:00:00.026100000
Run 5, Statement 3 : +000000000 00:00:00.046875000
And as you can see, the DUAL@LOOPBACK solution actually worsens performance for these queries. The reason is that, for the plain DUAL version, the remote database now runs a NESTED LOOPS join (though still not a SEMI join) rather than a HASH JOIN, so that version no longer pays the hash join penalty:
Query on remote database:
SQL_ID 7349t2363uc9m, child number 0
-------------------------------------
SELECT 0 FROM "T" "A2","U" "A1" WHERE "A2"."B">=0 AND "A2"."B"<=10 AND "A2"."A"="A1"."A"
Plan on remote database:
Plan hash value: 2558931407
--------------------------------------------------------------------------
| Operation | Name | Starts | E-Rows | A-Rows |
--------------------------------------------------------------------------
| SELECT STATEMENT | | 1 | | 1 |
| NESTED LOOPS | | 1 | 10 | 1 |
| TABLE ACCESS BY INDEX ROWID BATCHED| T | 1 | 10 | 1 |
| INDEX RANGE SCAN | I_T | 1 | 10 | 1 |
| INDEX UNIQUE SCAN | PK_U | 1 | 1 | 1 |
--------------------------------------------------------------------------
I haven’t analysed what the reason for this difference is, as the difference is not significant enough, compared to the improvement for large data sets.
Conclusion
Tuning queries over database links is hard. Much, much harder than tuning “ordinary” queries. Ideally, you’ll simply avoid database links and run all queries on a single instance. But sometimes that’s not possible.
In that case, the best solution is to move the logic entirely to the remote database and collect only the result. Ideally, this is done using a stored procedure on the remote database, calculating the 1/0 result completely remotely. I think hipsters these days call this a Microservice, or better, a Lambda:
CREATE FUNCTION t_u RETURN NUMBER IS
v_result NUMBER;
BEGIN
SELECT CASE WHEN EXISTS (
SELECT *
FROM t
JOIN u USING (a)
WHERE t.b BETWEEN 0 AND 1000
) THEN 1 ELSE 0 END
INTO v_result
FROM dual;
RETURN v_result;
END t_u;
/
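The call over the database link then looks something like this (a sketch; only the 1/0 result travels across the link, all the work stays remote):
-- Call the remote function over the database link
SELECT t_u@loopback FROM dual;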
Comparing the benchmark call with the other options:
T.B BETWEEN 0 AND 10
Statement 1 : +000000000 00:00:00.003022000 -- Local query
Statement 2 : +000000000 00:00:00.027416000 -- Local DUAL
Statement 3 : +000000000 00:00:00.043823000 -- Remote DUAL
Statement 4 : +000000000 00:00:00.022181000 -- Remote stored procedure
T.B BETWEEN 0 AND 1000
Statement 1 : +000000000 00:00:00.002877000 -- Local query
Statement 2 : +000000000 00:00:00.188588000 -- Local DUAL
Statement 3 : +000000000 00:00:00.050163000 -- Remote DUAL
Statement 4 : +000000000 00:00:00.018736000 -- Remote stored procedure
But that, too, might not be possible, as you may not have the required rights to create stored procedures on that database. You could call DBMS_SQL on the remote database and run a PL/SQL block there dynamically (I didn’t try that in my benchmark).
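A sketch of what that could look like, untested, and assuming you are allowed to invoke the remote DBMS_SQL package over the link:
-- Untested sketch: run an anonymous PL/SQL block on the remote database
-- via the remote DBMS_SQL package, fetching back only the 1/0 result
DECLARE
  c      INTEGER;
  dummy  INTEGER;
  result NUMBER;
BEGIN
  c := dbms_sql.open_cursor@loopback;
  dbms_sql.parse@loopback(c,
    'BEGIN
       SELECT CASE WHEN EXISTS (
         SELECT * FROM t JOIN u USING (a)
         WHERE t.b BETWEEN 0 AND 1000
       ) THEN 1 ELSE 0 END
       INTO :result FROM dual;
     END;',
    dbms_sql.native);
  dbms_sql.bind_variable@loopback(c, ':result', result);
  dummy := dbms_sql.execute@loopback(c);
  dbms_sql.variable_value@loopback(c, ':result', result);
  dbms_sql.close_cursor@loopback(c);
  dbms_output.put_line(result);
END;
/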
Or, you simply use an occasional DUAL@LINK, which might already do the trick with a minimal change to the original query.
Hi Lukas,
When I ran your test on 12cR2 (Oracle Developer Day VM), the remote query with local DUAL used nested loops and was very fast. Only one row was actually fetched. I don’t know if this is because of a change in the optimizer logic, or simply that 12c gathers statistics automatically on table creation (and I think index creation), so your statistics and mine may differ.
Best regards, Stew
Oh, it would be great if this was fixed already. I was running this on 12.1.0.2.0 (and inside of docker, although docker shouldn’t matter too much).
But did you not experience a hash join in the remote query with local dual, when the range was getting bigger?
Or perhaps the code written like this (untested):
That’s an interesting alternative. I hope it doesn’t make a huge difference :) In the past, I’ve compared count(*) and exists for primary key existence checks, which didn’t matter really, in Oracle: https://blog.jooq.org/2016/10/28/dont-even-use-count-for-primary-key-existence-checks
In this case, I’ve just tried it and it performs almost exactly the same as Statement #2 for both “small” and “large” range predicates.
Most Oracle developers do not use ANSI SQL. A SQLPlus syntax would make sense to a wider audience.
If by “ANSI SQL” you mean the JOIN syntax… 1) It’s been around for 30 years, why not start using it? :) 2) there are some types of join that can only be expressed with that syntax, not the “classic one”, 3) I’ve killed many a rogue query in production because someone forgot their join predicates after adding yet another table to the table list in the FROM clause… 4) Oracle themselves have been recommending to use “ANSI joins”
So, sorry. But nope. This blog will not go back in time and continue recommending the “old” syntax.