Selecting all Columns Except One in PostgreSQL

Google’s BigQuery has a very interesting SQL language feature, which I’ve missed many times in other databases:
select:
    SELECT  [{ ALL | DISTINCT }]
        { [ expression. ]* [ EXCEPT ( column_name [, ...] ) ]
            [ REPLACE ( expression [ AS ] column_name [, ...] ) ]
        | expression [ [ AS ] alias ] } [, ...]
    [ FROM from_item  [, ...] ]
    [ WHERE bool_expression ]
    ...
Notice the two keywords EXCEPT and REPLACE that can be used along with an asterisked expression.

An Example

For example, when running a query like this one (which fetches the longest film(s) every actor in the Sakila database played in):

SELECT *
FROM (
  SELECT 
    a.*, 
    f.*, 
    RANK() OVER (PARTITION BY actor_id ORDER BY length DESC) rk
  FROM film f
  JOIN film_actor fa USING (film_id)
  JOIN actor a USING (actor_id)
) t
WHERE rk = 1
ORDER BY first_name, last_name

This is one way to apply TOP-N per category filtering in SQL, which works with most modern databases, including MySQL 8.0. Essentially, we’re calculating the rank of a film per actor ordered by the film’s length. The result looks like this:
actor_id |first_name  |last_name    |..|title                  |length|..|rk |
---------|------------|-------------|..|-----------------------|------|--|---|
71       |ADAM        |GRANT        |..|GLADIATOR WESTWARD     |   173|..|1  |
71       |ADAM        |GRANT        |..|BALLROOM MOCKINGBIRD   |   173|..|1  |
132      |ADAM        |HOPPER       |..|TORQUE BOUND           |   179|..|1  |
165      |AL          |GARLAND      |..|JACKET FRISCO          |   181|..|1  |
Let’s assume for a moment that we really need the entire projection of the ACTOR and FILM tables (so, SELECT * is fine), but we certainly don’t need the useless RK column, which is always 1. Sometimes, having some excess columns is not going to be a problem, but sometimes it is. How to remove it? We can’t reference the ACTOR and FILM tables anymore in the outer query:

SELECT a.*, f.* -- Would be great, but wrong syntax
FROM (
  SELECT 
    a.*, 
    f.*, 
    RANK() OVER (PARTITION BY actor_id ORDER BY length DESC) rk
  FROM film f
  JOIN film_actor fa USING (film_id)
  JOIN actor a USING (actor_id)
) t
WHERE rk = 1
ORDER BY first_name, last_name

The outer query only has one table, and that’s the (derived) table T.

How to Solve This

In BigQuery syntax, we could now simply write

SELECT * EXCEPT rk
FROM (...) t
WHERE rk = 1
ORDER BY first_name, last_name

Which is really quite convenient! We want to project everything, except this one column. But none of the more popular SQL databases support this syntax. Luckily, in PostgreSQL, we can use a workaround: Nested records:

SELECT (a).*, (f).* -- Unnesting the records again
FROM (
  SELECT 
    a, -- Nesting the actor table
    f, -- Nesting the film table
    RANK() OVER (PARTITION BY actor_id ORDER BY length DESC) rk
  FROM film f
  JOIN film_actor fa USING (film_id)
  JOIN actor a USING (actor_id)
) t
WHERE rk = 1
ORDER BY (a).first_name, (a).last_name;

Notice how we’re no longer projecting A.* and F.* inside of the derived table T, but instead, the entire table (record). In the outer query, we have to use some slightly different syntax to unnest the record again (e.g. (A).FIRST_NAME), and we’re done.

How Does This Work?

Informix, Oracle, PostgreSQL, and maybe a few lesser known ones, have implemented the SQL standard’s ORDBMS features to various degrees. ORDBMS attempted to combine relational and object oriented features in the SQL language (and in the storage model). For all practical purposes, this essentially just means that we can nest records and collections. For instance, in PostgreSQL, we can write:

-- Explicit ROW constructor
SELECT 1, ROW(2, ROW(3, 4))

-- Implicit ROW constructor
SELECT 1, (2, (3, 4))

And we’ll get:
x        |row       |
---------|----------|
1        |(2,(3,4)) |
Along with ordinary “scalar” values, we can have nested rows (or nested tuples) constructed very easily. Conveniently, we can also reference a table without its column names in the projection, such as:

SELECT a, f
FROM film f
JOIN film_actor fa USING (film_id)
JOIN actor a USING (actor_id)

Which produces the aforementioned result:
a    |f    |
-----|-----|
(...)|(...)|
(...)|(...)|
(...)|(...)|
...
Similar things are possible in Oracle as well, except that Oracle doesn’t support structural row/tuple types, only nominal ones. We’d have to create some types first, prior to being able to use them:

CREATE TABLE film_t AS OBJECT (...);

Bonus

Of course, if you’re using SQL Server or Oracle, you wouldn’t have this problem, because then you could use the totally underrated WITH TIES clause along with CROSS APPLY: SQL Server

SELECT *
FROM actor a
CROSS APPLY (
  SELECT TOP 1 WITH TIES f.*
  FROM film f
  JOIN film_actor fa 
    ON f.film_id = fa.film_id
	AND fa.actor_id = a.actor_id
  ORDER BY length DESC
) f
ORDER BY first_name, last_name;

Oracle (Do check performance on this!)

SELECT *
FROM actor a
CROSS APPLY (
  SELECT f.*
  FROM film f
  JOIN film_actor fa 
    ON f.film_id = fa.film_id
	AND fa.actor_id = a.actor_id
  ORDER BY length DESC
  FETCH FIRST ROW WITH TIES
) f
ORDER BY first_name, last_name;

PostgreSQL and DB2 support the LATERAL keyword, which could be used with FETCH FIRST ROW ONLY semantics (so, no ties are selected). For more details about TOP N per category queries, see this blog post

5 thoughts on “Selecting all Columns Except One in PostgreSQL

    1. That’s an interesting feature of course, but a bit of overkill for a trivial feature like the one exposed in this article. I doubt that running through the hoops of going back and forth between SQL and PL/SQL engines is remotely comparable to the idea of simply not projecting a column in an ordinary query… I’d still just explicitly list all the columns in Oracle rather than using these PTFs…

  1. Does this technique only help with omitting derived columns generated during the join, though?
    In other words, we can’t omit columns from A or F by name.

  2. Snowflake supports this with SELECT * EXCLUDE ....

    It’s a shame Postgres still doesn’t include this useful feature.

Leave a Reply