What Exactly are SQL Views?

You probably know about “ordinary views” already, but I’m sure you’ll find one or two things in this article that you haven’t thought about in this way yet…

What exactly are SQL views?

Views in SQL are a means of treating complex queries in the same way as “ordinary” tables. In fact, SQL is all about tables (which are bags of records), much like relational algebra is all about relations (which are sets of tuples).

There are different types of views:

“Ordinary” views

These are most commonly referred to as “views”. Most databases allow for declaring them using this syntax

CREATE VIEW my_view AS 
SELECT col1, col2
FROM my_table
WHERE ...

These stored views are then part of the catalog and can be referenced by name just like tables, which is great for re-use. And what’s even greater, you can grant a different set of privileges to views than to tables, which allows you to implement a complete security layer only using views (e.g. hiding some columns, or rows from certain users)

Some databases (including Oracle, PostgreSQL) even allow for updating them under certain circumstances – mostly when they’re unambiguous, 1-1 mappings of a single table that does not produce any calculations or denormalisations.

Materialized views

Just like the above “ordinary views”, materialized views can be used just like tables. In fact, they are tables as their data is materialized on disk, updated whenever their content is updated. These are useful for frequent, complex queries on rarely updated data.

Just add the MATERIALIZED keyword and you’re set:

CREATE MATERIALIZED VIEW my_view AS 
SELECT col1, col2
FROM my_table
WHERE ...

Among others, Oracle and PostgreSQL support materialized views. Other databases like SQL Server know of “indexed views”, which are a bit less powerful as you have to explicitly “materialise” your view data in indexes.

“Snapshot” views

These aren’t really views, they’re real tables. But in the context of this blog post, you could think of them as a permanently materialized “snapshot” view of your data. You can create such views using different syntaxes:

Most databases, e.g. Oracle

CREATE TABLE my_view AS
SELECT col1, col2
FROM my_table
WHERE ...

Some databases, e.g. SQL Server

SELECT col1, col2
INTO my_view
FROM my_table
WHERE ...

The nice thing about this approach is the fact that like materialized views, these “views” can be very useful for frequent querying – you have to pre-calculate the data only once. But once you’ve calculated that data, you generate a “snapshot” of it, and the data can continue to live independently from your view – just like a snapshot! (don’t forget to add relevant indexes, though)

Note that some databases including DB2 and Oracle support real SQL:2011 standard “snapshots”, such as flashback query in Oracle, or time travel queries in DB2. That’s a different story, though.

Parameterized views

Few people refer to these views as “views”, but if you think about it, that’s what they really are. Table-valued functions are stored procedures that return tables that can then again be used in SQL. For example (using PostgreSQL syntax):

CREATE FUNCTION my_view (arg1 INTEGER, arg2 INTEGER)
RETURNS TABLE (
    col1 INTEGER
    col2 INTEGER
)
AS $$
BEGIN
    RETURN QUERY
    SELECT col1, col2
    FROM my_table
    WHERE v1 = arg2 AND v2 = arg2;
END
$$ LANGUAGE plpgsql;

And then…

SELECT *
FROM my_view (42, 1337)
WHERE ...

That’s quite powerful, isn’t it? Among others, Firebird, HANA, HSQLDB, Oracle, PostgreSQL, SQL Server support table-valued functions.

Common Table Expressions

Like ordinary views, these views are named but they’re scoped only for a single statement – mostly a SELECT statement, although PostgreSQL or SQL Server also allow for common table expressions to be used with other DML statements. These “views” can be written as such:

WITH 
    my_view_a AS (
        SELECT ...
    ),
    my_view_b AS (
        SELECT ...
    )
-- To be consumed immediately by a statement
SELECT *
FROM my_view_a, my_view_b

While common table expressions are very useful for structuring code (they’re like “table variables”), they come with a price in Oracle or PostgreSQL, as the view is most often temporarily materialized, which prevents a lot of SQL transformations in the optimiser. On the flip side, common table expressions can be recursive / hierarchical, which is great for graph / tree traversal.

Derived tables

The most common type of views (although rarely called “views”) are derived tables, i.e. all nested select statements that are put in a FROM clause. E.g.:

SELECT *
FROM (
    SELECT ...
) my_view_a, (
    SELECT ...
) my_view_b

Unlike common table expressions, derived tables cannot be reused easily within a statement, but chances are high that they can be optimised into a different statement that has a higher performance.

Conclusion

SQL is all about tables and recomposition of tables in ad-hoc queries. The most important clause of any SQL statement is the FROM clause. It specifies the set of tuples that you want to recompose, filter, group, project in various ways. As we have seen above, you can feed any such table transformation easily into yet another transformation via one of the above ways to create views.

Curious about more? Read our popular article “10 Easy Steps to a Complete Understanding of SQL

Should I Put That Table Alias or Not?

Infrequent SQL developers often get confused about when to put parentheses and/or aliases on derived tables. There has been this recent Reddit discussion about the subject, where user Elmhurstlol was wondering why they needed to provide an alias to the derived table (the subselect with the UNION) in the following query:

SELECT AVG(price) AS AVG_PRICE
FROM (
  SELECT price from product a JOIN pc b
  ON a.model=b.model AND maker='A'
  UNION ALL
  SELECT price from product a JOIN laptop c
  ON a.model=c.model and maker='A'
) hello

The question really was about why the "hello" table alias was necessary, because often it seems not to be required.

Here’s what the SQL standard states

If in doubt, it is often useful to consider the SQL standard about the rationale behind some syntax elements. In this case, let’s consider the freely available SQL 1992 standard text (for simplicity), and see how it specifies table references:

<table reference> ::=
    <table name> [ [ AS ] <correlation name>
        [ <left paren> <derived column list> 
          <right paren> ] ]
  | <derived table> [ AS ] <correlation name>
        [ <left paren> <derived column list>
          <right paren> ]
  | <joined table>

<derived table> ::= <table subquery>

<table subquery> ::= <subquery>

<subquery> ::= <left paren> <query expression>
               <right paren>

(for more information about the awesome and completely underused derived column list feature, read this article here)

The essence of the above syntax specification is this:

  • A derived table MUST always be aliased
  • The AS keyword is optional, for improved readability
  • The parentheses MUST always be put around subqueries

Following these rules, you’ll be pretty safe in most SQL dialects. Here are some deviations to the above, though:

  • Some dialects allow for unaliased derived tables. However, this is still a bad idea, because you will not be able to unambiguously qualify a column from such a derived table

Takeaway

Always provide a meaningful alias to your derived tables. As simple as that.

How to Implement Sort Indirection in SQL

I’ve recently stumbled upon this interesting Stack Overflow question, where the user essentially wanted to ensure that resulting records are delivered in a well-defined order.

They wrote

SELECT name
FROM product
WHERE name IN ('CE367FAACDHCANPH-151556',
               'CE367FAACEX9ANPH-153877',
               'NI564FAACJSFANPH-162605',
               'GE526OTACCD3ANPH-149839')

They got

CE367FAACDHCANPH-151556
CE367FAACEX9ANPH-153877
GE526OTACCD3ANPH-149839
NI564FAACJSFANPH-162605

They wanted

CE367FAACDHCANPH-151556
CE367FAACEX9ANPH-153877
NI564FAACJSFANPH-162605
GE526OTACCD3ANPH-149839

Very often, according to your business rules, sorting orders are not “natural”, as in numeric sorting or in alpha-numeric sorting. Some business rule probably specified that GE526OTACCD3ANPH-149839 needs to appear last in a list. Or the user might have re-arranged product names in their screen with drag and drop, producing new sort order.

We could discuss, of course, if such sorting should be performed in the UI layer or not, but let’s assume that the business case or the performance requirements or the general architecture needed for this sorting to be done in the database. How to do it? Through…

Sort Indirection

In fact, you don’t want to sort by the product name, but by a pre-defined enumeration of such names. In other words, you want a function like this:

CE367FAACDHCANPH-151556 -> 1
CE367FAACEX9ANPH-153877 -> 2
NI564FAACJSFANPH-162605 -> 3
GE526OTACCD3ANPH-149839 -> 4

With plain SQL, there are many ways to do the above. Here are two of them (also seen in my Stack Overflow answer):

By using a CASE expression

You can tell the database the explicit sort indirection easily, using a CASE expression in your ORDER BY clause:

SELECT name
FROM product
WHERE name IN ('CE367FAACDHCANPH-151556',
               'CE367FAACEX9ANPH-153877',
               'NI564FAACJSFANPH-162605',
               'GE526OTACCD3ANPH-149839')
ORDER BY 
  CASE WHEN name = 'CE367FAACDHCANPH-151556' THEN 1
       WHEN name = 'CE367FAACEX9ANPH-153877' THEN 2
       WHEN name = 'NI564FAACJSFANPH-162605' THEN 3
       WHEN name = 'GE526OTACCD3ANPH-149839' THEN 4
  END

Note that I’ve used the CASE WHEN predicate THEN value END syntax, because this is implemented in all SQL dialects. Alternatively (if you’re not using Apache Derby), you could also save some characters when typing, writing:

ORDER BY 
  CASE name WHEN 'CE367FAACDHCANPH-151556' THEN 1
            WHEN 'CE367FAACEX9ANPH-153877' THEN 2
            WHEN 'NI564FAACJSFANPH-162605' THEN 3
            WHEN 'GE526OTACCD3ANPH-149839' THEN 4
  END

Of course, this requires repeating the same values in the predicate and in the sort indirection. This is why, in some cases, you might be more lucky …

By using INNER JOIN

In the following example, the predicate and the sort indirection are taken care of in a simple derived table that is INNER JOIN‘ed to the original query:

SELECT product.name
FROM product
JOIN (
  VALUES('CE367FAACDHCANPH-151556', 1),
        ('CE367FAACEX9ANPH-153877', 2),
        ('NI564FAACJSFANPH-162605', 3),
        ('GE526OTACCD3ANPH-149839', 4)
) AS sort (name, sort)
ON product.name = sort.name
ORDER BY sort.sort

The above example is using PostgreSQL syntax, but you might be able to implement the same in a different way in your database.

Using jOOQ’s sort indirection API

Sort indirection is a bit tedious to write out, which is why jOOQ has a special syntax for this kind of use-case, which is also documented in the manual. Any of the following statements perform the same as the above query:

// jOOQ generates 1, 2, 3, 4 as values in the
// generated CASE expression
DSL.using(configuration)
   .select(PRODUCT.NAME)
   .from(PRODUCT)
   .where(NAME.in(
      "CE367FAACDHCANPH-151556",
      "CE367FAACEX9ANPH-153877",
      "NI564FAACJSFANPH-162605",
      "GE526OTACCD3ANPH-149839"
   ))
   .orderBy(PRODUCT.NAME.sortAsc(
      "CE367FAACDHCANPH-151556",
      "CE367FAACEX9ANPH-153877",
      "NI564FAACJSFANPH-162605",
      "GE526OTACCD3ANPH-149839"
   ))
   .fetch();

// You can choose your own indirection values to
// be generated in the CASE expression
   .orderBy(PRODUCT.NAME.sort(
      new HashMap<String, Integer>() {{
        put("CE367FAACDHCANPH-151556", 2);
        put("CE367FAACEX9ANPH-153877", 3);
        put("NI564FAACJSFANPH-162605", 5);
        put("GE526OTACCD3ANPH-149839", 8);
      }}
   ))

Conclusion

Sort indirection is a nice trick to have up your sleeves every now and then. Never forget that you can put almost arbitrary column expressions in your SQL statement’s ORDER BY clause. Use them!