Some SQL operators are as esoteric as they’re powerful. One of the oldest operator that you’ve likely hardly ever used in real world applications is NATURAL JOIN which is the default in relational algebra. We’ve covered a funky use-case for NATURAL JOIN earlier on this blog.
The main reason why it’s not very useful is because it joins two tables using all column names from both joined tables. If you look at the Sakila database, this includes audit columns like LAST_UPDATE, which is present on all tables. For example, you wouldn’t want:
SELECT *FROM payment NATURAL JOIN customer;
Because that would be equivalent to:
SELECT *FROM payment AS pJOIN customer AS c ON p.customer_id = c.customer_id AND p.last_update = c.last_update
This is an obvious reason, but even if you carefully design your schema to never contain two columns that “accidentally” share their name, someone might add one by accident, nonetheless, which renders existing queries useless.
JOIN USING to the rescue?
So, you’d think that the syntax that should really be used instead is JOIN .. USING? This looks nice, doesn’t it?
SELECT *FROM payment AS pJOIN customer AS c USING (customer_id)
And yes, this is really cool for ad-hoc SQL. If your schema is designed this way (primary key column name = foreign key column name), then this syntax can be used to quickly join tables across the schema.
But in a real world application, you should still avoid the syntax to prevent unpleasant surprises when evolving the schema. Look at this script, for example:
create table a (i int);create table b (i int, j int);create table c (j int);select * from a join b using (i) join c using (j);
The query works perfectly fine as intended. But what happens if we add:
alter table a add j int;
Now, there’s an ambiguity in the last JOIN. If we repeat the SELECT statement from earlier, PostgreSQL complains:
ERROR: common column name “j” appears more than once in left table
Oracle confirms:
SQL Error [918] [42000]: ORA-00918: J: column ambiguously specified – appears in A and B
Why? Because we’re really joining:
select * from ( a join b using (i) ) join c using (j);
And since j is not a join column between a and b, it is now ambiguous. Note how the ALTER TABLE statement broke a query that was perfectly fine before. It would not have broken if we used the JOIN .. ON syntax:
select * from a join b on a.i = b.i join c on b.j = c.j; -- No ambiguity introduced by the new column a.j
Too bad! Again, when writing ad-hoc SQL for quick querying the Sakila database, JOIN .. USING is a useful tool – just like SELECT *. But in production queries, this ad-hoc SQL tool should be avoided.
