FULL [ OUTER ] JOIN, but maybe, you have run into this beast in the past. But when was the last time you’ve seen a
NATURAL JOIN? Right. A quick reminder from our article about JOINs: FULL JOIN A
FULL JOINis a type of
OUTER JOINthat retains data from both sides of the
JOINoperation, regardless if the
JOINpredicate matches or not. For example, querying the Sakila database:
The result will look something like this:
SELECT first_name, last_name, title FROM actor FULL JOIN film_actor USING (actor_id) FULL JOIN film USING (film_id)
first_name last_name title --[ LEFT JOIN ] ------------------------- Jon Doe <-- Actors that didn't play in any film Claire Miller --[ INNER JOIN ] ------------------------ Samuel Jackson Django Unchained <-- Actors played in many films Samuel Jackson Pulp Fiction John Travolta Pulp Fiction <-- Films featured several actors --[ RIGHT JOIN ]------------------------- Meh <-- Films that don't have actorsNote the special “markers” that delimit the parts that would have also been contributed by
RIGHT JOIN. In a way, a
FULL JOINcombines them all. Handy, occasionally NATURAL JOIN This one is less obviously useful. It looks useful at first, because if we carefully design our schemas as in the Sakila database used above, then we can simplify our joins as such:
Which is just syntax sugar for this:
SELECT first_name, last_name, title FROM actor NATURAL JOIN film_actor NATURAL JOIN film
SELECT first_name, last_name, title FROM actor JOIN film_actor USING (<all common columns>) JOIN film USING (<all common columns>)
NATURAL JOINwill join two tables using all the columns they have in common. You’d think that this would be the useful outcome:
Unfortunately, the outcome looks more like this:
SELECT first_name, last_name, title FROM actor JOIN film_actor USING (actor_id) JOIN film USING (film_id)
In other words: It makes no sense at all. In real world situations (and the Sakila database is already close enough to the real world), we’ll always have accidentally conflicting column names that will be considered for
SELECT first_name, last_name, title FROM actor JOIN film_actor USING (actor_id, last_update) JOIN film USING (film_id, last_update)
NATURAL JOIN, when they hsould not be.
OK, so why NATURAL FULL JOIN?… you’re asking. One is very rarely needed, and the other is pretty useless in the real world. Not so fast! I recently helped a customer who had two very similar tables that both contained payments. Let’s call them:
STANDING_ORDER, the latter being a
PAYMENTtemplate which generates actual
PAYMENTsperiodically. Now, there was some PL/SQL logic that operated on payments, which after some change request, needed to operate both on payments and/or on standing orders. If this was Java, we’d just (ab)use subtype polymorphism for the quick win, or factor out common logic into common component types for the thorough solution. For instance, these kinds of columns certainly appear in both cases: “System” columns
ID– The surrogate key
CREATED_AT– Audit info
MODIFIED_AT– Audit info
DELETED– Soft deletion
STANDING_ORDER_ID– Only payments reference the standing order that may have created them.
PERIODICITY– This is a standing order’s core feature.
NATURAL FULL JOIN to the rescueI created a view like this:
This view itself doesn’t do anything. It just sits there and produces the row type I need. Why? Because the
CREATE VIEW like_a_payment SELECT * FROM payment NATURAL FULL JOIN standing_order WHERE 1 = 0
NATURAL JOINwill join by ALL the common columns, namely
ID, CREATED_AT, MODIFIED_AT, DELETED, AMOUNT, CURRENCYand then adds the columns from the left table, namely
STANDING_ORDER_ID, and then from the right table, namely
PERIODICITY. I can now use this type in PL/SQL:
DECLARE paym like_a_payment%ROWTYPE; BEGIN -- Do things... END;
But why FULL?True, I didn’t need
FULLyet, because I’m not producing any rows anyway. Correct. But observe how I will populate the
PAYMlocal variable in the PL/SQL block:
And I’m done! This could not have been done any more easily!
DECLARE l_paym_id like_a_payment.id%TYPE; l_paym like_a_payment%ROWTYPE; BEGIN l_paym_id := 1; SELECT * INTO l_paym FROM ( SELECT * FROM payment WHERE id = l_paym_id ) NATURAL FULL JOIN ( SELECT * FROM standing_order WHERE id = l_paym_id ); END;
- I’m using
NATURALbecause now I have to use it all the time to populate the record (or select individual columns, but meh)
- I’m using
FULLbecause only one of the two queries will produce a result, and the remaining columns will be
Some observationsOf course, this is kind of a hack. Please don’t abuse this too often. Here’s some hints to follow:
- The refactoring would have been substantial. I needed a quick win.
- The logic actually needs almost all the columns from the payments. Otherwise, you might want to review that
- The derived tables are necessary in Oracle for performance reasons.
- The actual code contained two distinct queries, each one joining “one side” with an empty substitute query, as we know that the ID is either a payment or a standing order.
NATURAL FULL JOIN. Put this in your code base and impress your coworkers!