The Useful BigQuery * EXCEPT Syntax

One of the coolest things about using and making jOOQ is that we get to discover the best extensions to the standard SQL language by vendors, and add support for those clauses in jOOQ via emulations.

One of these syntaxes is BigQuery’s * EXCEPT syntax. Everyone who ever wrote ad-hoc SQL queries would have liked this syntax to be available, occasionally. Why is it needed? Look at this query:

SELECT * FROM actor

The result being:

|actor_id|first_name|last_name   |last_update            |
|--------|----------|------------|-----------------------|
|1       |PENELOPE  |GUINESS     |2006-02-15 04:34:33.000|
|2       |NICK      |WAHLBERG    |2006-02-15 04:34:33.000|
|3       |ED        |CHASE       |2006-02-15 04:34:33.000|
|4       |JENNIFER  |DAVIS       |2006-02-15 04:34:33.000|
|5       |JOHNNY    |LOLLOBRIGIDA|2006-02-15 04:34:33.000|
|6       |BETTE     |NICHOLSON   |2006-02-15 04:34:33.000|
|...

But that LAST_UPDATE column is mighty annoying, especially if we want to NATURAL JOIN things. E.g. this doesn’t work:

SELECT actor_id, a.first_name, a.last_name, count(fa.film_id)
FROM actor AS a
NATURAL LEFT JOIN film_actor AS fa
GROUP BY actor_id

The result is just actors without films, because accidentally, the LAST_UPDATE column was included in the NATURAL JOIN:

|actor_id|first_name|last_name|count|
|--------|----------|---------|-----|
|58      |CHRISTIAN |AKROYD   |0    |
|8       |MATTHEW   |JOHANSSON|0    |
|116     |DAN       |STREEP   |0    |
|184     |HUMPHREY  |GARLAND  |0    |
|87      |SPENCER   |PECK     |0    |

This is the biggest flaw of NATURAL JOIN, making it almost useless for schemas that aren’t perfectly designed for NATURAL JOIN usage, but this is ad-hoc SQL, and it would have been nice to do that.

We could, if we had * EXCEPT like this:

SELECT 
  a.actor_id, 
  a.first_name, 
  a.last_name, 
  count(fa.film_id)
FROM (
  SELECT * EXCEPT (last_update) FROM actor
) AS a
NATURAL LEFT JOIN (
  SELECT * EXCEPT (last_update) FROM film_actor
) AS fa
GROUP BY 
  a.actor_id, 
  a.first_name, 
  a.last_name

Unfortunately, this doesn’t work in PostgreSQL and other dialects, but jOOQ can emulate it. If you provide the online SQL translator with the sakila database table definitions:

CREATE TABLE actor (
    actor_id integer NOT NULL,
    first_name character varying(45) NOT NULL,
    last_name character varying(45) NOT NULL,
    last_update timestamp without time zone DEFAULT now() NOT NULL
);
CREATE TABLE film_actor (
    actor_id smallint NOT NULL,
    film_id smallint NOT NULL,
    last_update timestamp without time zone DEFAULT now() NOT NULL
);

Then, it can resolve the EXCEPT syntax of the query and produce this:

SELECT
  a.actor_id,
  a.first_name,
  a.last_name,
  count(fa.film_id)
FROM (
  SELECT actor.actor_id, actor.first_name, actor.last_name
  FROM actor
) a
  NATURAL LEFT OUTER JOIN (
    SELECT film_actor.actor_id, film_actor.film_id
    FROM film_actor
  ) fa
GROUP BY
  a.actor_id,
  a.first_name,
  a.last_name

Obviously, we could have just omitted NATURAL JOIN to achieve the same result, but sometimes, it’s just nice to have yet another tool in the tool chain to write a nice query. With jOOQ, the query would read:

Actor a = ACTOR.as("a");
FilmActor fa = FILM_ACTOR.as("fa");

ctx.select(
        a.ACTOR_ID,
        a.FIRST_NAME,
        a.LAST_NAME,
        count(fa.FILM_ID))
   .from(
        select(asterisk().except(a.LAST_UPDATE)).from(a).asTable(a))
   .naturalLeftOuterJoin(
        select(asterisk().except(fa.LAST_UPDATE)).from(fa).asTable(fa))
   .groupBy(a.ACTOR_ID, a.FIRST_NAME, a.LAST_NAME)
   .fetch();