How to Run a Bulk INSERT .. RETURNING Statement With Oracle and JDBC

When inserting records into SQL databases, we often want to fetch back generated IDs and possibly other trigger, sequence, or default generated values. Let’s assume we have the following table:

-- DB2
CREATE TABLE x (
  i INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, 
  j VARCHAR(50), 
  k DATE DEFAULT CURRENT_DATE
);

-- PostgreSQL
CREATE TABLE x (
  i SERIAL4 PRIMARY KEY, 
  j VARCHAR(50), 
  k DATE DEFAULT CURRENT_DATE
);

-- Oracle
CREATE TABLE x (
  i INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, 
  j VARCHAR2(50), 
  k DATE DEFAULT SYSDATE
);

DB2

DB2 is the only database currently supported by jOOQ, which implements the SQL standard according to which we can SELECT from any INSERT statement, including:

SELECT *
FROM FINAL TABLE (
  INSERT INTO x (j)
  VALUES ('a'), ('b'), ('c')
);

The above query returns:

I |J |K          |
--|--|-----------|
1 |a |2018-05-02 |
2 |b |2018-05-02 |
3 |c |2018-05-02 |

Pretty neat! This query can simply be run like any other query in JDBC, and you don’t have to go through any hassles.

PostgreSQL and Firebird

These databases have a vendor specific extension that does the same thing, almost as powerful:

-- Simple INSERT .. RETURNING query
INSERT INTO x (j)
VALUES ('a'), ('b'), ('c')
RETURNING *;

-- If you want to do more fancy stuff
WITH t AS (
  INSERT INTO x (j)
  VALUES ('a'), ('b'), ('c')
  RETURNING *
)
SELECT * FROM t;

Both syntaxes work equally well, the latter is just as powerful as DB2’s, where the result of an insertion (or update, delete, merge) can be joined to other tables. Again, no problem with JDBC

Oracle

In Oracle, this is a bit more tricky. The Oracle SQL language doesn’t have an equivalent of DB2’s FINAL TABLE (DML statement). The Oracle PL/SQL language, however, does support the same syntax as PostgreSQL and Firebird. This is perfectly valid PL/SQL

-- Create a few auxiliary types first
CREATE TYPE t_i AS TABLE OF NUMBER(38);
/
CREATE TYPE t_j AS TABLE OF VARCHAR2(50);
/
CREATE TYPE t_k AS TABLE OF DATE;
/

DECLARE 
  -- These are the input values
  in_j t_j := t_j('a', 'b', 'c');
  
  out_i t_i;
  out_j t_j;
  out_k t_k;
  
  c1 SYS_REFCURSOR;
  c2 SYS_REFCURSOR;
  c3 SYS_REFCURSOR;
BEGIN

  -- Use PL/SQL's FORALL command to bulk insert the
  -- input array type and bulk return the results
  FORALL i IN 1 .. in_j.COUNT
    INSERT INTO x (j)
    VALUES (in_j(i))
    RETURNING i, j, k
    BULK COLLECT INTO out_i, out_j, out_k;
  
  -- Fetch the results and display them to the console
  OPEN c1 FOR SELECT * FROM TABLE(out_i);  
  OPEN c2 FOR SELECT * FROM TABLE(out_j);  
  OPEN c3 FOR SELECT * FROM TABLE(out_k); 
  
  dbms_sql.return_result(c1);
  dbms_sql.return_result(c2);
  dbms_sql.return_result(c3);
END;
/

A bit verbose, but it has the same effect. Now, from JDBC:

try (Connection con = DriverManager.getConnection(url, props);
    Statement s = con.createStatement();

    // The statement itself is much more simple as we can
    // use OUT parameters to collect results into, so no
    // auxiliary local variables and cursors are needed
    CallableStatement c = con.prepareCall(
        "DECLARE "
      + "  v_j t_j := ?; "
      + "BEGIN "
      + "  FORALL j IN 1 .. v_j.COUNT "
      + "    INSERT INTO x (j) VALUES (v_j(j)) "
      + "    RETURNING i, j, k "
      + "    BULK COLLECT INTO ?, ?, ?; "
      + "END;")) {

    try {

        // Create the table and the auxiliary types
        s.execute(
            "CREATE TABLE x ("
          + "  i INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,"
          + "  j VARCHAR2(50),"
          + "  k DATE DEFAULT SYSDATE"
          + ")");
        s.execute("CREATE TYPE t_i AS TABLE OF NUMBER(38)");
        s.execute("CREATE TYPE t_j AS TABLE OF VARCHAR2(50)");
        s.execute("CREATE TYPE t_k AS TABLE OF DATE");

        // Bind input and output arrays
        c.setArray(1, ((OracleConnection) con).createARRAY(
            "T_J", new String[] { "a", "b", "c" })
        );
        c.registerOutParameter(2, Types.ARRAY, "T_I");
        c.registerOutParameter(3, Types.ARRAY, "T_J");
        c.registerOutParameter(4, Types.ARRAY, "T_K");

        // Execute, fetch, and display output arrays
        c.execute();
        Object[] i = (Object[]) c.getArray(2).getArray();
        Object[] j = (Object[]) c.getArray(3).getArray();
        Object[] k = (Object[]) c.getArray(4).getArray();

        System.out.println(Arrays.asList(i));
        System.out.println(Arrays.asList(j));
        System.out.println(Arrays.asList(k));
    }
    finally {
        try {
            s.execute("DROP TYPE t_i");
            s.execute("DROP TYPE t_j");
            s.execute("DROP TYPE t_k");
            s.execute("DROP TABLE x");
        }
        catch (SQLException ignore) {}
    }
}

The above code will display:

[1, 2, 3]
[a, b, c]
[2018-05-02 10:40:34.0, 2018-05-02 10:40:34.0, 2018-05-02 10:40:34.0]

Exactly what we wanted.

jOOQ support

A future version of will emulate the above PL/SQL block from the jOOQ INSERT .. RETURNING statement:

DSL.using(configuration)
   .insertInto(X)
   .columns(X.J)
   .values("a")
   .values("b")
   .values("c")
   .returning(X.I, X.J, X.K)
   .fetch();

This will correctly emulate the query for all of the databases that natively support the syntax. In the case of Oracle, since jOOQ cannot create nor assume any SQL TABLE types, PL/SQL types from the DBMS_SQL package will be used

The relevant issue is here: https://github.com/jOOQ/jOOQ/issues/5863

The Performance Difference Between SQL Row-by-row Updating, Batch Updating, and Bulk Updating

Something that has been said many times, but needs constant repeating until every developer is aware of the importance of this is the performance difference between row-by-row updating and bulk updating. If you cannot guess which one will be much faster, remember that row-by-row kinda rhymes with slow-by-slow (hint hint).

Disclaimer: This article will discuss only non-concurrent updates, which are much easier to reason about. In a concurrent update situation, a lot of additional factors will add complexity to the problem, including the locking strategy, transaction isolation levels, or simply how the database vendor implements things in detail. For the sake of simplicity, I’ll assume no concurrent updates are being made.

Example query

Let’s say we have a simple table for our blog posts (using Oracle syntax, but the effect is the same on all databases):

CREATE TABLE post (
  id INT NOT NULL PRIMARY KEY,
  text VARCHAR2(1000) NOT NULL,
  archived NUMBER(1) NOT NULL CHECK (archived IN (0, 1)),
  creation_date DATE NOT NULL
);

CREATE INDEX post_creation_date_i ON post (creation_date);

Now, let’s add some 10000 rows:

INSERT INTO post
SELECT 
  level,
  lpad('a', 1000, 'a'),
  0 AS archived,
  DATE '2017-01-01' + (level / 100)
FROM dual
CONNECT BY level <= 10000;

EXEC dbms_stats.gather_table_stats('TEST', 'POST');

Now imagine, we want to update this table and set all posts to ARCHIVED = 1 if they are from last year, e.g. CREATION_DATE < DATE '2018-01-01'. There are various ways to do this, but you should have built an intuition that doing the update in one single UPDATE statement is probably better than looping over each individual row and updating each individual row explicitly. Right?

Right.

Then, why do we keep doing it?

Let me ask this differently:

Does it matter?

The best way to find out is to benchmark. I’m doing two benchmarks for this:

  1. One that is run in PL/SQL, showing the performance difference between different approaches that are available to PL/SQL (namely looping, the FORALL syntax, and a single bulk UPDATE)
  2. One that is run in Java, doing JDBC calls, showing the performance difference between different approaches available to Java (namely looping, caching PreparedStatement but still looping, batching, and a single bulk UPDATE)

Benchmarking PL/SQL

The code of the benchmark can be found in this gist. I will also include it at the bottom of this blog post. The results are:

Run 1, Statement 1 : .01457 (avg : .0098)
Run 1, Statement 2 : .0133  (avg : .01291)
Run 1, Statement 3 : .02351 (avg : .02519)
Run 2, Statement 1 : .00882 (avg : .0098)
Run 2, Statement 2 : .01159 (avg : .01291)
Run 2, Statement 3 : .02348 (avg : .02519)
Run 3, Statement 1 : .01012 (avg : .0098)
Run 3, Statement 2 : .01453 (avg : .01291)
Run 3, Statement 3 : .02544 (avg : .02519)
Run 4, Statement 1 : .00799 (avg : .0098)
Run 4, Statement 2 : .01346 (avg : .01291)
Run 4, Statement 3 : .02958 (avg : .02519)
Run 5, Statement 1 : .00749 (avg : .0098)
Run 5, Statement 2 : .01166 (avg : .01291)
Run 5, Statement 3 : .02396 (avg : .02519)

The difference between Statement 1 and 3 is a factor of 2.5x

Showing the time it takes for each statement type to complete, each time updating 3649 / 10000 rows. The winner is:

Statement 1, running a bulk update

It looks like this:

UPDATE post
SET archived = 1
WHERE archived = 0 AND creation_date < DATE '2018-01-01';

Runner-up (not too far away) is:

Statement 2, using the PL/SQL FORALL syntax

It works like this:

DECLARE
  TYPE post_ids_t IS TABLE OF post.id%TYPE;
  v_post_ids post_ids_t;
BEGIN
  SELECT id 
  BULK COLLECT INTO v_post_ids
  FROM post 
  WHERE archived = 0 AND creation_date < DATE '2018-01-01';

  FORALL i IN 1 .. v_post_ids.count
    UPDATE post
    SET archived = 1
    WHERE id = v_post_ids(i);
END;

Loser (by a factor of 2.5x on our specific data set) is:

Statement 3, using an ordinary LOOP and running row-by-row updates

FOR rec IN (
  SELECT id 
  FROM post 
  WHERE archived = 0 AND creation_date < DATE '2018-01-01'
) LOOP
  UPDATE post
  SET archived = 1
  WHERE id = rec.id;
END LOOP;

It does not really come as a surprise. We’re switching between the PL/SQL engine and the SQL engine many many times, and also, instead of running through the post table only once in O(N) time, we’re looking up individual ID values in O(log N) time, N times, so the complexity went from

O(N) -> O(N log N)

We’d get far worse results for larger tables!

What about doing this from Java?

The difference is much more drastic if each call to the SQL engine has to be done over the network from another process. Again, the benchmark code is available from a gist, and I will paste it to the end of this blog post as well.

The result is (same time unit):

Run 0, Statement 1: PT4.546S
Run 0, Statement 2: PT3.52S
Run 0, Statement 3: PT0.144S
Run 0, Statement 4: PT0.028S
Run 1, Statement 1: PT3.712S
Run 1, Statement 2: PT3.185S
Run 1, Statement 3: PT0.138S
Run 1, Statement 4: PT0.025S
Run 2, Statement 1: PT3.481S
Run 2, Statement 2: PT3.007S
Run 2, Statement 3: PT0.122S
Run 2, Statement 4: PT0.026S
Run 3, Statement 1: PT3.518S
Run 3, Statement 2: PT3.077S
Run 3, Statement 3: PT0.113S
Run 3, Statement 4: PT0.027S
Run 4, Statement 1: PT3.54S
Run 4, Statement 2: PT2.94S
Run 4, Statement 3: PT0.123S
Run 4, Statement 4: PT0.03S

The difference between Statement 1 and 4 is a factor of 100x !!

So, who’s winning? Again (by far):

Statement 4, running the bulk update

In fact, the time is not too far away from the time taken by PL/SQL. With larger data sets being updated, the two results will converge. The code is:

try (Statement s = c.createStatement()) {
    s.executeUpdate(
        "UPDATE post\n" +
        "SET archived = 1\n" +
        "WHERE archived = 0\n" +
        "AND creation_date < DATE '2018-01-01'\n");
}

Followed by the not that much worse (but still 3.5x worse):

Statement 3, running the batch update

Batching can be compared to PL/SQL’s FORALL statement. While we’re running individual row-by-row updates, we’re sending all the update statements in one batch to the SQL engine. This does save a lot of time on the network and all the layers in between.

The code looks like this:

try (Statement s = c.createStatement();
    ResultSet rs = s.executeQuery(
        "SELECT id FROM post WHERE archived = 0\n"
      + "AND creation_date < DATE '2018-01-01'"
    );
    PreparedStatement u = c.prepareStatement(
        "UPDATE post SET archived = 1 WHERE id = ?"
    )) {

    while (rs.next()) {
        u.setInt(1, rs.getInt(1));
        u.addBatch();
    }

    u.executeBatch();
}

Followed by the losers:

Statement 1 and 2, running row by row updates

The difference between statement 1 and 2 is that 2 caches the PreparedStatement, which allows for reusing some resources. This can be a good thing, but didn’t have a very significant effect in our case, compared to the batch / bulk alternatives. The code is:

// Statement 1:
try (Statement s = c.createStatement();
    ResultSet rs = s.executeQuery(
        "SELECT id FROM post\n"
      + "WHERE archived = 0\n"
      + "AND creation_date < DATE '2018-01-01'"
    )) {

    while (rs.next()) {
        try (PreparedStatement u = c.prepareStatement(
            "UPDATE post SET archived = 1 WHERE id = ?"
        )) {
            u.setInt(1, rs.getInt(1));
            u.executeUpdate();
        }
    }
}

// Statement 2:
try (Statement s = c.createStatement();
    ResultSet rs = s.executeQuery(
        "SELECT id FROM post\n"
      + "WHERE archived = 0\n"
      + "AND creation_date < DATE '2018-01-01'"
    );
    PreparedStatement u = c.prepareStatement(
        "UPDATE post SET archived = 1 WHERE id = ?"
    )) {

    while (rs.next()) {
        u.setInt(1, rs.getInt(1));
        u.executeUpdate();
    }
}

Conclusion

As shown previously on this blog, there is a significant cost of JDBC server roundtrips, which can be seen in the JDBC benchmark. This cost is much more severe if we unnecessarily create many server roundtrips for a task that could be done in a single roundtrip, namely by using a SQL bulk UPDATE statement.

This is not only true for updates, but also for all the other statements, including SELECT, DELETE, INSERT, and MERGE. If doing everything in a single statement isn’t possible due to the limitations of SQL, we can still save roundtrips by grouping statements in a block, either by using an anonymous block in databases that support them:

BEGIN
  statement1;
  statement2;
  statement3;
END;

(you can easily send these anonymous blocks over JDBC, as well!)

Or, by emulating anonymous blocks using the JDBC batch API (has its limitations), or by writing stored procedures.

The performance gain is not always worth the trouble of moving logic from the client to the server, but very often (as in the above case), the move is a no-brainer and there’s absolutely no reason against it.

So, remember: Stop doing row-by-row (slow-by-slow) operations when you could run the same operation in bulk, in a single SQL statement.

Hint: Always know what your ORM (if you’re using one) is doing, because the ORM can help you with automatic batching / bulking in many cases. But it often cannot, or it is too difficult to make it do so, so resorting to SQL is the way to go.

Code

PL/SQL benchmark

SET SERVEROUTPUT ON

DROP TABLE post;

CREATE TABLE post (
  id INT NOT NULL PRIMARY KEY,
  text VARCHAR2(1000) NOT NULL,
  archived NUMBER(1) NOT NULL CHECK (archived IN (0, 1)),
  creation_date DATE NOT NULL
);

CREATE INDEX post_creation_date_i ON post (creation_date);

ALTER SYSTEM FLUSH SHARED_POOL;
ALTER SYSTEM FLUSH BUFFER_CACHE;

CREATE TABLE results (
  run     NUMBER(2),
  stmt    NUMBER(2),
  elapsed NUMBER
);

DECLARE
  v_ts TIMESTAMP WITH TIME ZONE;
  
  PROCEDURE reset_post IS
  BEGIN
    EXECUTE IMMEDIATE 'TRUNCATE TABLE post';
    INSERT INTO post
    SELECT 
      level AS id,
      lpad('a', 1000, 'a') AS text,
      0 AS archived,
      DATE '2017-01-01' + (level / 100) AS creation_date
    FROM dual
    CONNECT BY level <= 10000;
    dbms_stats.gather_table_stats('TEST', 'POST');
  END reset_post;
BEGIN

  -- Repeat the whole benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
  
    reset_post;
    v_ts := SYSTIMESTAMP;
    
    UPDATE post
    SET archived = 1
    WHERE archived = 0 AND creation_date < DATE '2018-01-01';
  
    INSERT INTO results VALUES (r, 1, SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    
    reset_post;
    v_ts := SYSTIMESTAMP;
    
    DECLARE
      TYPE post_ids_t IS TABLE OF post.id%TYPE;
      v_post_ids post_ids_t;
    BEGIN
      SELECT id 
      BULK COLLECT INTO v_post_ids
      FROM post 
      WHERE archived = 0 AND creation_date < DATE '2018-01-01';
    
      FORALL i IN 1 .. v_post_ids.count
        UPDATE post
        SET archived = 1
        WHERE id = v_post_ids(i);
    END;
    
    INSERT INTO results VALUES (r, 2, SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    
    reset_post;
    v_ts := SYSTIMESTAMP;
      
    FOR rec IN (
      SELECT id 
      FROM post 
      WHERE archived = 0 AND creation_date < DATE '2018-01-01'
    ) LOOP
      UPDATE post
      SET archived = 1
      WHERE id = rec.id;
    END LOOP;
      
    INSERT INTO results VALUES (r, 3, SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
  END LOOP;
  
  FOR rec IN (
    SELECT 
      run, stmt, 
      CAST(elapsed AS NUMBER(10, 5)) ratio,
      CAST(AVG(elapsed) OVER (PARTITION BY stmt) AS NUMBER(10, 5)) avg_ratio
    FROM results
    ORDER BY run, stmt
  )
  LOOP
    dbms_output.put_line('Run ' || rec.run || 
      ', Statement ' || rec.stmt || 
      ' : ' || rec.ratio || ' (avg : ' || rec.avg_ratio || ')');
  END LOOP;
  
  dbms_output.put_line('');
  dbms_output.put_line('Copyright Data Geekery GmbH');
  dbms_output.put_line('https://www.jooq.org/benchmark');
END;
/

DROP TABLE results;

JDBC benchmark

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.time.Duration;
import java.time.Instant;
import java.util.Properties;

public class OracleUpdate {

    public static void main(String[] args) throws Exception {
        Class.forName("oracle.jdbc.OracleDriver");

        String url = "jdbc:oracle:thin:@192.168.99.100:1521:ORCLCDB";
        String user = "TEST";
        String password = "TEST";

        Properties properties = new Properties();
        properties.setProperty("user", user);
        properties.setProperty("password", password);

        try (Connection c = DriverManager.getConnection(url, properties)) {
            for (int i = 0; i < 5; i++) {
                Instant ts;

                resetPost(c);
                ts = Instant.now();

                try (Statement s = c.createStatement();
                    ResultSet rs = s.executeQuery(
                        "SELECT id FROM post WHERE archived = 0 AND creation_date < DATE '2018-01-01'"
                    )) {

                    while (rs.next()) {
                        try (PreparedStatement u = c.prepareStatement(
                            "UPDATE post SET archived = 1 WHERE id = ?"
                        )) {
                            u.setInt(1, rs.getInt(1));
                            u.executeUpdate();
                        }
                    }
                }

                System.out.println("Run " + i + ", Statement 1: " + Duration.between(ts, Instant.now()));

                resetPost(c);
                ts = Instant.now();

                try (Statement s = c.createStatement();
                    ResultSet rs = s.executeQuery(
                        "SELECT id FROM post WHERE archived = 0 AND creation_date < DATE '2018-01-01'"
                    );
                    PreparedStatement u = c.prepareStatement(
                        "UPDATE post SET archived = 1 WHERE id = ?"
                    )) {

                    while (rs.next()) {
                        u.setInt(1, rs.getInt(1));
                        u.executeUpdate();
                    }
                }

                System.out.println("Run " + i + ", Statement 2: " + Duration.between(ts, Instant.now()));

                resetPost(c);
                ts = Instant.now();

                try (Statement s = c.createStatement();
                    ResultSet rs = s.executeQuery(
                        "SELECT id FROM post WHERE archived = 0 AND creation_date < DATE '2018-01-01'"
                    );
                    PreparedStatement u = c.prepareStatement(
                        "UPDATE post SET archived = 1 WHERE id = ?"
                    )) {

                    while (rs.next()) {
                        u.setInt(1, rs.getInt(1));
                        u.addBatch();
                    }

                    u.executeBatch();
                }
                System.out.println("Run " + i + ", Statement 3: " + Duration.between(ts, Instant.now()));

                resetPost(c);
                ts = Instant.now();

                try (Statement s = c.createStatement()) {
                    s.executeUpdate("UPDATE post\n" +
                        "SET archived = 1\n" +
                        "WHERE archived = 0 AND creation_date < DATE '2018-01-01'\n");
                }

                System.out.println("Run " + i + ", Statement 4: " + Duration.between(ts, Instant.now()));
            }
        }
    }

    static void resetPost(Connection c) throws SQLException {
        try (Statement s = c.createStatement()) {
            s.executeUpdate("TRUNCATE TABLE post");
            s.executeUpdate("INSERT INTO post\n" +
                "    SELECT \n" +
                "      level,\n" +
                "      lpad('a', 1000, 'a'),\n" +
                "      0,\n" +
                "      DATE '2017-01-01' + (level / 10)\n" +
                "    FROM dual\n" +
                "    CONNECT BY level <= 10000");
            s.executeUpdate("BEGIN dbms_stats.gather_table_stats('TEST', 'POST'); END;");
        }
    }
}

The Cost of JDBC Server Roundtrips

Or: Move That Loop into the Server Already!

This article will illustrate the significance of something that I always thought to be common sense, but I keep seeing people getting this (very) wrong in their productive systems. Chances are, in fact, that most applications out there suffer from this performance problem – and the fix is really easy.

Transferring data in bulk vs. transferring it row by row

This blog post is inspired by a recent Stack Overflow question that was asking about how to call Oracle’s DBMS_OUTPUT.GET_LINES from JDBC. The answer to that specific question can also be seen in a previous blog post here.

This time, I don’t want to discuss that particular Oracle-specific technical problem, but the performance aspect of whether to call:

  • DBMS_OUTPUT.GET_LINES: Which allows for fetching a bulk of server output into an array
  • DBMS_OUTPUT.GET_LINE: Which fetches a single line of server output into a string

While I will continue to discuss this Oracle-specific functionality, this blog post is in no way strictly related to Oracle. It is generally applicable (and again, it should be common sense) for any database, and in fact, any client-server architecture, or even any distributed architecture. The solution to such performance problems will almost always be that transferring a bulk of data is better than transferring it row by row.

Let’s run a benchmark!

The full benchmark logic can be seen in this gist. It includes the boring parts of actually calling the GET_LINE[S] procedures. The beef of the benchmark is this:

int max = 50;
long[] getLines = new long[max];
long[] getLine = new long[max];

try (Connection c = DriverManager.getConnection(url, properties);
    Statement s = c.createStatement()) {

    for (int warmup = 0; warmup < 2; warmup++) {
        for (int i = 0; i < max; i++) {
            s.executeUpdate("begin dbms_output.enable(); end;");
            String sql =
                "begin "
              + "for i in 1 .. 100 loop "
              + "dbms_output.put_line('Message ' || i); "
              + "end loop; "
              + "end;";
            long t1 = System.nanoTime();
            logGetLines(c, 100, () -> s.executeUpdate(sql));
            long t2 = System.nanoTime();
            logGetLine(c, 100, () -> s.executeUpdate(sql));
            long t3 = System.nanoTime();
            s.executeUpdate("begin dbms_output.disable(); end;");

            if (warmup > 0) {
                getLines[i] = t2 - t1;
                getLine[i] = t3 - t2;
            }
        }
    }
}

System.out.println(LongStream.of(getLines).summaryStatistics());
System.out.println(LongStream.of(getLine).summaryStatistics());

What does it do in prose?

  • It contains a warmup loop whose first iteration doesn’t contribute to our measurements (this is always a good idea)
  • It runs the benchmarked logic 50 times
  • It generates 100 DBMS_OUTPUT.PUT_LINE messages for each run in an anonymous PL/SQL loop …
  • … and then fetches those 100 messages immediately with either 1 call to GET_LINES or 100 calls to GET_LINE
  • Finally, all the stored execution times are aggregated and printed out conveniently with Java 8 Stream’s summary statistics feature

So, in both cases, we’re generating and fetching 5000 messages.

Both methods using GET_LINES and GET_LINE respectively are functionally equivalent, i.e. the only difference is performance. Again the full benchmark can be seen here (which also benchmarks the effect of JDBC fetchSize, which is none, in this case).

The results are devastating:

{count=50, sum=  69120455, min= 1067521, average= 1382409.100000, max= 2454614}
{count=50, sum=2088201423, min=33737827, average=41764028.460000, max=64498375}

We’re gaining a factor of 30x in this benchmark run on my machine. The actual results may vary depending on your hardware and software (e.g. I’m running Oracle 12cR2 in Docker), but regardless of the setup, the results are always very significant.

Note that caching the prepared statement in the client yields better results (see the gist), but nowhere near as good as moving the loop to the server

Does this mean that GET_LINE is slow?

When looking at benchmark results, we must always be very very careful not to draw the wrong conclusions. There can be many reasons why benchmarks show differences in performance. One simple (albeit improbable) reason could be that the GET_LINE implementation simply sucks.

So, I’ve tried to re-implement this benchmark in pure PL/SQL. The full benchmark can be seen here. The beef of it is this:

FOR r IN 1..5 LOOP
  v_ts := SYSTIMESTAMP;
      
  FOR i IN 1..v_repeat LOOP
    m();
     
    v_i := v_max;
    dbms_output.get_lines(v_array, v_i);
  END LOOP;
      
  INSERT INTO results VALUES (1, (SYSTIMESTAMP - v_ts));
  v_ts := SYSTIMESTAMP;
    
  FOR i IN 1..v_repeat LOOP
    m();
    
    FOR j IN 1 .. v_max LOOP
      dbms_output.get_line(v_string, v_i);
    END LOOP;
  END LOOP;
     
  INSERT INTO results VALUES (2, (SYSTIMESTAMP - v_ts));
END LOOP;

Where m() is:

PROCEDURE m IS BEGIN
  FOR i IN 1 .. v_max LOOP 
    dbms_output.put_line('Message ' || i);
  END LOOP;
END m;

The results are now rather different:

stmt    sum     avg      min     max
1       0.0609  0.01218  0.0073  0.0303
2       0.0333  0.00666  0.0063  0.007

This time, calling GET_LINE individually seems to have been 2x faster than the GET_LINES version. Again, it is important not to draw the wrong conclusions! This could be due to:

  • GET_LINES allocating an additional array copy of the original lines, which resides in the PGA, might be costly
  • GET_LINE might have profited from some additional optimisation because we’re never actually consuming the result in the benchmark

But the one thing we can conclude with certainty is: There’s no problem in GET_LINE, so calling it is not inherently worse than calling GET_LINES.

Which brings us back to the JDBC calls

While guess work and hypotheses are usually dangerous, in this case I’m certain of the reason why the JDBC based approach shows such drastic differences. Just watch this excellent talk by Toon Koppelaars from Oracle “NoPLSql and Thick Database Approaches with Toon Koppelaars”, where he explains this with some impressive flame graphs:

The obvious difference between the JDBC benchmark and the PL/SQL one is the fact that the JDBC call has to traverse a vast amount of logic, APIs, “barriers” between the JVM and the Oracle kernel before it can actually invoke the really interesting part. This includes:

  • JVM overhead
  • JDBC logic
  • Network overhead
  • Various “outer” layers inside the Oracle database
  • Oracle’s API layers to get into the SQL and PL/SQL execution engines
  • The actual code running in the PL/SQL engine

In Toon’s talk (which again, you should definitely watch), the examples are running SQL code, not PL/SQL code, but the results are the same. The actual logic is relatively cheap inside of the database (as we’ve seen in the PL/SQL only benchmark), but the overhead is significant when calling database logic from outside the database.

Thus: It is very important to minimise that overhead

There are two ways to minimise that overhead:

  • The super hard way: Change and / or tweak the API technology, e.g. in Oracle’s case, using the C/OCI bindings can be much faster than JDBC
  • The easy way: Just move some data collection logic into the database and fetch data in bulk

Let me repeat this one more time:

Fetch (or send) data in bulk

… it’s almost always faster than processing things row-by-row (or as the Oracle folks call it: “slow-by-slow”).

And it does not matter at all, if that database logic is written in SQL or in a procedural language. The point is that accessing objects over the network (any network) is expensive, so you should minimise the access calls if ever possible.

As I’ve tweeted recently:

When calling logic over the network (any network), we should move logic to the data, not data to the logic. When working with RDBMS, we’re doing this through SQL (preferrably) or if SQL doesn’t suffice, we resort to using stored procedures.

When working with HTTP, we’re doing this with – well, it doesn’t have a name, but we should prefer making few physical HTTP calls that aggregate several logical API calls in order to transfer a lot of data in bulk.

When working with “map reduce” or “serverless” etc technology, we’re calling this “functions” or “lambdas”, which are just fancy, more modern names for stored procedures.

Conclusion

I’m not an expert in how to design complex distributed systems. This stuff is really hard. But I’ve spent many years working with RDBMS, which are also, in a very simple way, distributed systems (data on one server, client logic on another one).

A very significant amount of performance problems with RDBMS is related to the simple fact of clients making way too many calls to the database for what could be implemented in a single SQL query. Within the database, once your logic has reached the kernel, stuff gets executed really really fast. Adding more logic to a query is going to cause far less trouble than adding more queries.

Does your application take into account these things? Especially, if you’re using an ORM that generates the SQL for you, does it generate the right amount of queries, or are you suffering from “N+1 problems”? The main reason why ORM-based systems tend to be slow is because developers are not aware of the SQL their ORMs generate, e.g. due to excessive lazy loading, when a single SQL (or JPQL) query would have been a much better choice. It’s not the ORM’s fault. Most ORMs can get this right.

It’s the developer’s responsibility to think about where the logic should be executed. The rule of thumb is:

If you’re looping in the client to fetch individual things from the same server, you’re doing it wrong. Move the loop into the server.

And by doing so, you’ve written your first (ghasp) “stored procedure”. You’ll write many more, and you’ll love it, once you realise how much speed you’re gaining.

Or, in other words:

Update: Some criticism from the reddit discussion of this article

/u/cogman10 made good points in his comment warning about batching “too big” workloads, which is perfectly correct when batching write heavy tasks. Large batches may increase the contention inside of your database. If you’re working in an MVCC environment (such as Oracle), having transactions that span millions of updates will put a lot of pressure on the UNDO/REDO log, and all other sessions reading from the same data will feel that pain.

Also, when working with HTTP, beware of the fact that batches are harder to cache than individual requests. This article made the assumption that HTTP requests are:

  • Authorised – i.e. caching doesn’t even make sense as the authorisation might be revoked or the session terminated
  • Operating on similar resources, e.g. fetching a bulk of IDs in one go might be more sensible than fetching each ID individuall

… of course, as always, don’t follow advice you find on the internet blindly 🙂 This article illustrated a common mistake. The fix isn’t always as simple as illustrated here, but often it really is.

How to Fetch Oracle DBMS_OUTPUT from JDBC

When working with Oracle stored procedures, it is not uncommon to have debug log information available from DBMS_OUTPUT commands. For instance, if we have a procedure like this:

CREATE TABLE my_table (i INT);

CREATE OR REPLACE PROCEDURE my_procedure (i1 INT, i2 INT) IS
BEGIN
  INSERT INTO my_table 
  SELECT i1 FROM dual UNION ALL 
  SELECT i2 FROM dual;
  
  dbms_output.put_line(sql%rowcount || ' rows inserted');
END my_procedure;
/

The procedure works just the same, regardless if we’re reading the output from the DBMS_OUTPUT call. It is there purely for logging purposes. Now, if we call the above procedure from a tool like SQL Developer or sqlplus, we could write:

SET SERVEROUTPUT ON
BEGIN
  my_procedure(1, 2);
END;
/

To get a result like this:

PL/SQL-Prozedur erfolgreich abgeschlossen.
2 rows inserted

(pardon my german)

How to get this output from JDBC

By default, we don’t get such output from JDBC as the overhead of transferring all this output is usually not worth the trouble. If we still wanted to call the procedure AND get the server output, we cannot simply write SET SERVEROUTPUT ON, as that is a command specific to sqlplus. We have to wrap our procedure calls in two other calls:

try (Connection c = DriverManager.getConnection(url, properties);
     Statement s = c.createStatement()) {

    try {
        // First, we have to enable the DBMS_OUTPUT. Otherwise,
        // all calls to DBMS_OUTPUT made on our connection won't
        // have any effect.
        s.executeUpdate("begin dbms_output.enable(); end;");

        // Now, this is the actually interesting procedure call
        s.executeUpdate("begin my_procedure(1, 2); end;");

        // After we're done with our call(s), we can proceed to
        // fetch the SERVEROUTPUT explicitly, using
        // DBMS_OUTPUT.GET_LINES
        try (CallableStatement call = c.prepareCall(
            "declare "
          + "  num integer := 1000;"
          + "begin "
          + "  dbms_output.get_lines(?, num);"
          + "end;"
        )) {
            call.registerOutParameter(1, Types.ARRAY,
                "DBMSOUTPUT_LINESARRAY");
            call.execute();

            Array array = null;
            try {
                array = call.getArray(1);
                Stream.of((Object[]) array.getArray())
                      .forEach(System.out::println);
            }
            finally {
                if (array != null)
                    array.free();
            }
        }
    }

    // Don't forget to disable DBMS_OUTPUT for the remaining use
    // of the connection.
    finally {
        s.executeUpdate("begin dbms_output.disable(); end;");
    }
}

As can be seen above, this is rather simple:

  • Initialise a connection with DBMS_OUTPUT.ENABLE
  • Do the actually interesting work
  • Fetch the output and call DBMS_OUTPUT.DISABLE

This could also be refactored into a utility:

// Alternatively, just use https://github.com/jOOQ/jOOL
interface WhyUNoCheckedExceptionRunnable {
    void run() throws Exception;
}

static void logServerOutput(
    Connection connection, 
    WhyUNoCheckedExceptionRunnable runnable
) throws Exception {
    try (Statement s = connection.createStatement()) {
       try {
           s.executeUpdate("begin dbms_output.enable(); end;");
           runnable.run();

           try (CallableStatement call = connection.prepareCall(
               "declare "
             + "  num integer := 1000;"
             + "begin "
             + "  dbms_output.get_lines(?, num);"
             + "end;"
           )) {
               call.registerOutParameter(1, Types.ARRAY,
                   "DBMSOUTPUT_LINESARRAY");
               call.execute();

               Array array = null;
               try {
                   array = call.getArray(1);
                   Stream.of((Object[]) array.getArray())
                         .forEach(System.out::println);
               }
               finally {
                   if (array != null)
                       array.free();
               }
           }
       }
       finally {
           s.executeUpdate("begin dbms_output.disable(); end;");
       }
   }
}

This can now be called conveniently as such:

try (Connection c = DriverManager.getConnection(url, properties);
     Statement s = c.createStatement()) {

    logServerOutput(c, () -> 
        s.executeUpdate("begin my_procedure(1, 2); end;"));
}

How to do the same with jOOQ?

jOOQ 3.11 will have built in support for fetching this server output through its ExecuteListener SPI with https://github.com/jOOQ/jOOQ/issues/6580

We can either use jOOQ’s plain SQL API as such:

try (Connection c = DriverManager.getConnection(url, properties)) {

    // Specify this setting to fetch server output explicitly
    DSLContext ctx = DSL.using(c, 
        new Settings().withFetchServerOutputSize(10));
    ctx.execute("begin my_procedure(1, 2); end;");
}

Or, use the code generator for even more type safe calls to the procedures:

try (Connection c = DriverManager.getConnection(url, properties)) {

    // Specify this setting to fetch server output explicitly
    DSLContext ctx = DSL.using(c, 
        new Settings().withFetchServerOutputSize(10));
    myProcedure(ctx.configuration(), 1, 2);
}

The log output will be:

DEBUG [org.jooq.tools.LoggerListener          ] - Executing query : begin my_procedure(1, 2); end;
DEBUG [org.jooq.impl.FetchServerOutputListener] - 2 rows inserted          

JOIN Elimination: An Essential Optimiser Feature for Advanced SQL Usage

The SQL language has one great advantage over procedural, object oriented, and “ordinary” functional programming languages. The fact that it is truly declarative (i.e. a 4GL / fourth generation programming language) means that a sophisticated optimiser can easily transform one SQL expression into another, equivalent SQL expression, which might be faster to execute.

How does this work?

Expression Tree Equivalence

Here’s an introduction to equivalent expression trees from my SQL Masterclass: Let’s assume we want to evaluate the following expression:

A + (A - B)

Now in maths, it can be proven trivially that by the laws of associativity, the above expression is really the same as this one:

(A + A) - B

We didn’t really win anything yet, but we can equally trivially turn the above addition into a multiplication that can be proven to be exactly equivalent:

(2 * A) - B

Now, imagine that A is an extremely “expensive” value, e.g. a value that we have to fetch from the disk, or worse, from the network. The cost of accessing A is thus very high and if we can avoid one access to A by the above transformation, then the resulting expression is much faster to evaluate than the original one, even if mathematically speaking, it does not matter at all.

That’s what optimisers do all the time. They transform expressions into equivalent expressions which are faster to execute. And they constantly adapt, so if the DBA chooses to move A from a remote server to the local server, thus reducing the cost of access to A, perhaps, suddenly, the original plan will be better again, because of the cost of multiplication (just an example).

A SQL Example

An equally trivial SQL example from my SQL Masterclass shows that it really doesn’t matter mathematically if we run this query:

SELECT first_name, last_name
FROM customer
WHERE first_name = 'JAMIE'

Or this one:

SELECT *
FROM (
  SELECT first_name, last_name
  FROM customer
)
WHERE first_name = 'JAMIE'

With the SQL language, it may be a bit harder to see that these are exactly equivalent SQL statements, but if we translate the above queries to relational algebra, it may become more visible:

Selection before projection

… or WHERE before SELECT:

Projection then selection

… or SELECT before WHERE:

Don’t be fooled by relational algebra‘s term “selection”. It does not correspond to the SELECT clause, but to the WHERE clause!

We can prove (let’s leave the exercise to the reader), that both expressions are exactly equivalent, so optimisers can pick whichever they have a more efficient matching algorithm for:

  • Ordinary row stores will probably apply the selection first, before projecting, as the expensive operation is accessing rows and if we can filter rows early (e.g. through indexes) then the whole query is less expensive)
  • A column store might (just a hypothesis) apply the projection first, as the expensive operation is accessing the columns. We then might have to traverse the entire column anyway to filter out the rows

Let’s Talk About JOINs (and Their Elimination)

JOIN elimination is one of the most basic and yet most powerful SQL transformations, which is implemented by most modern databases in one way or another. It is so powerful, because we’re writing (potentially useless) JOINs all the time, when writing a bit more complex queries. See also our article about JOINs for a general overview.

Now, consider the following simplified schema, taken from the Sakila database:

CREATE TABLE address (
  address_id INT NOT NULL,
  address VARCHAR(50) NOT NULL,
  CONSTRAINT pk_address PRIMARY KEY (address_id)
);

CREATE TABLE customer (
  customer_id INT NOT NULL,
  first_name VARCHAR(45) NOT NULL,
  last_name VARCHAR(45) NOT NULL,
  address_id INT NOT NULL,
  CONSTRAINT pk_customer PRIMARY KEY  (customer_id),
  CONSTRAINT fk_customer_address FOREIGN KEY (address_id) 
    REFERENCES address(address_id)
);

Let’s ignore indexing and other useful features for this example.

INNER JOIN Elimination

The following query shows a common JOIN use-case, joining a parent table (ADDRESS) to a child table (CUSTOMER) in a to-one relationship:

SELECT c.*
FROM customer c
JOIN address a 
ON c.address_id = a.address_id

We intended to fetch all customers and their addresses. But observe: We project only columns from the CUSTOMER table and we don’t have any predicates at all, specifically not predicates using the ADDRESS table. So, we’re completely ignoring any contributions from the ADDRESS table. We never really needed the JOIN in the first place!

And in fact, the optimiser can prove this too, because of the FOREIGN KEY constraint on C.ADDRESS_ID, which guarantees that every CUSTOMER record has exactly one corresponding ADDRESS record. The JOIN does not duplicate, nor remove any CUSTOMER rows, so it is unneeded and thus eliminated (by some, not all databases, will list each database at the end of the article).

So, the database can rewrite the SQL statement to the following, equivalent SQL statement in the presence of said FOREIGN KEY:

SELECT *
FROM customer c

Now, quite obviously, this query will be faster than the previous one, if the entire JOIN can be avoided, and thus the entire access to the ADDRESS table. Neat, huh? Who would have thought that FOREIGN KEYs can be so useful in terms of performance.

The above works if there’s also a NOT NULL constraint on the FOREIGN KEY. If there isn’t, e.g. as in this query:

SELECT title
FROM film f
JOIN language l ON f.original_language_id = l.language_id

The JOIN can still be eliminated, but there needs to be a replacement NOT NULL predicate, as such:

SELECT title
FROM film
WHERE original_language_id IS NOT NULL

OUTER JOIN Elimination

A LEFT [ OUTER ] JOIN will JOIN the right table to the left table but keep rows from the left table if there is no match (again, an explanation of joins can be seen here). When we apply LEFT JOIN to the previous query…

SELECT c.*
FROM customer c
LEFT JOIN address a 
ON c.address_id = a.address_id

… then we’ll fetch all rows from CUSTOMER regardless if that customer has any ADDRESS. This is useful if the FOREIGN KEY is optional (nullable), or completely absent, e.g. through:

ALTER TABLE customer DROP CONSTRAINT fk_customer_address

OUTER JOIN is even easier to eliminate, as it doesn’t require any FOREIGN KEY constraint for the database to prove that it is unneeded. A UNIQUE constraint on the parent table (here: ADDRESS.ADDRESS_ID) is sufficient to show that for every CUSTOMER there can be at most one ADDRESS, so the LEFT JOIN won’t duplicate any CUSTOMER rows (Unlike INNER JOIN, OUTER JOIN never remove rows).

Hence, the above query can again be rewritten to the more optimal:

SELECT *
FROM customer c

OUTER JOIN Elimination with DISTINCT

Another interesting case of OUTER JOIN elimination is the following one, which unfortunately didn’t work on Oracle for a customer of ours, recently, in a complex query that ran rogue. Let’s look at some other tables of the Sakila database, which expose a to-many relationship:

CREATE TABLE actor (
  actor_id numeric NOT NULL ,
  first_name VARCHAR(45) NOT NULL,
  last_name VARCHAR(45) NOT NULL,
  CONSTRAINT pk_actor PRIMARY KEY (actor_id)
);

CREATE TABLE film (
  film_id int NOT NULL,
  title VARCHAR(255) NOT NULL,
  CONSTRAINT pk_film PRIMARY KEY (film_id)
);

CREATE TABLE film_actor (
  actor_id INT NOT NULL,
  film_id  INT NOT NULL,
  CONSTRAINT pk_film_actor PRIMARY KEY (actor_id, film_id),
  CONSTRAINT fk_film_actor_actor FOREIGN KEY (actor_id)
    REFERENCES actor (actor_id),
  CONSTRAINT fk_film_actor_film FOREIGN KEY (film_id)
    REFERENCES film (film_id)
);

Now, consider this query:

SELECT DISTINCT first_name, last_name
FROM actor a
LEFT JOIN film_actor fa ON a.actor_id = fa.actor_id

We’re looking for all actors and their films, but then we project only distinct actors. Again, the JOIN to FILM_ACTOR doesn’t contribute anything to the result, but because we’re joining a to-many relationship here (from parent table ACTOR to child table FILM_ACTOR), the JOIN is producing duplicate rows. Without DISTINCT, we’d get something like:

FIRST_NAME   LAST_NAME
----------------------
...
PENELOPE     GUINESS
PENELOPE     GUINESS
PENELOPE     GUINESS
NICK         WAHLBERG
NICK         WAHLBERG
...

But thanks to the DISTINCT keyword, the result is (provably) no different from the result of this much simpler query:

SELECT DISTINCT first_name, last_name
FROM actor a

(Note, DISTINCT cannot be eliminated, unless we already have a UNIQUE constraint on (FIRST_NAME, LAST_NAME)).

Why not Just Refactor the SQL Manually?

Of course, all of this shouldn’t be needed if our SQL were perfect. In the above trivial examples, the SQL can (and should) be re-written manually to improve quality. But note that:

  • Developers make mistakes, and those mistakes may be very subtle when queries get more complex. I’ll show an example below.
  • The presence of this feature actually encourages writing more complex SQL, especially when using reusable views. I’ll show another example below.
  • Finally, I’ve previously advocated avoiding needless, mandatory work, like SELECT *. Such work is mandatory because the optimiser cannot prove its needlessness. In the case of these JOINs, the optimiser can prove the needlessness, so the work is no longer mandatory. It can be eliminated.

Here are some complex examples as promised, where this optimiser feature really shines:

Subtle Mistakes

Let’s consider the following query (in PostgreSQL syntax):

SELECT c.name, count(*)
FROM actor a
JOIN film_actor fa USING (actor_id)
JOIN film f USING (film_id)
JOIN film_category fc USING (film_id)
JOIN category c USING (category_id)
WHERE actor_id = 1
GROUP BY c.name
ORDER BY count(*) DESC

What does it do? For ACTOR_ID = 1 (Penelope Guiness), we’re looking for all the different film categories she played in, and the number of films per category. This is easier to understand when we look at the result:

NAME         COUNT
Horror       3
Family       2
New          2
Classics     2
Games        2
Music        1
Sci-Fi       1
Animation    1
Sports       1
Children     1
Comedy       1
Documentary  1
Foreign      1

Now, can you spot the unneeded JOINs? In fact, we never needed ACTOR, nor did we need FILM

SELECT c.name, count(*)
FROM film_actor fa
JOIN film_category fc USING (film_id)
JOIN category c USING (category_id)
WHERE actor_id = 1
GROUP BY c.name
ORDER BY count(*) DESC

Cool, eh? The JOINs can be eliminated (again, in some databases, see below) and our “mistake” is no longer relevant to the query. The mistake could have also snuck (or sneaked?) in from a previous query version, which may have looked like this, projecting also the actor information and the list of films per category, in case of which the additional JOIN are needed:

SELECT 
  c.name, count(*), 
  a.first_name, a.last_name, 
  array_agg(f.title ORDER BY f.title)
FROM actor a
JOIN film_actor fa USING (actor_id)
JOIN film f USING (film_id)
JOIN film_category fc USING (film_id)
JOIN category c USING (category_id)
WHERE actor_id = 1
GROUP BY c.name, a.first_name, a.last_name
ORDER BY count(*) DESC

The result being:

NAME         COUNT  FIRST_NAME  LAST_NAME  FILMS
Horror       3      PENELOPE    GUINESS    {"ELEPHANT TROJAN","LADY STAGE","RULES HUMAN"}
Family       2      PENELOPE    GUINESS    {"KING EVOLUTION","SPLASH GUMP"}
New          2      PENELOPE    GUINESS    {"ANGELS LIFE","OKLAHOMA JUMANJI"}
Classics     2      PENELOPE    GUINESS    {"COLOR PHILADELPHIA","WESTWARD SEABISCUIT"}
Games        2      PENELOPE    GUINESS    {"BULWORTH COMMANDMENTS","HUMAN GRAFFITI"}
Music        1      PENELOPE    GUINESS    {"WIZARD COLDBLOODED"}
Sci-Fi       1      PENELOPE    GUINESS    {"CHEAPER CLYDE"}
Animation    1      PENELOPE    GUINESS    {"ANACONDA CONFESSIONS"}
Sports       1      PENELOPE    GUINESS    {"GLEAMING JAWBREAKER"}
Children     1      PENELOPE    GUINESS    {"LANGUAGE COWBOY"}
Comedy       1      PENELOPE    GUINESS    {"VERTIGO NORTHWEST"}
Documentary  1      PENELOPE    GUINESS    {"ACADEMY DINOSAUR"}
Foreign      1      PENELOPE    GUINESS    {"MULHOLLAND BEAST"}

As you can see, this optimisation can be very useful on your legacy SQL, because if we maintain a complex query, we might not always be able to see all the JOINs that are really needed.

Reusable Views

Sometimes, we simply add additional JOINs for convenience, when building complex queries from simpler ones, e.g. by using views (which is a completely underrated RDBMS feature! You should all write more views).

Consider this view:

CREATE VIEW v_customer AS
SELECT 
  c.first_name, c.last_name, 
  a.address, ci.city, co.country
FROM customer c
JOIN address a USING (address_id)
JOIN city ci USING (city_id)
JOIN country co USING (country_id)

It’s not unlikely that we will write a view like this, simply because we’re incredibly bored to constantly join all these tables all the time. Every time we do something with customers and addresses, we need the CITY and COUNTRY table as well.

From now on (with this view), we can simply select from the view and “forget” about how it came to be. Now, let’s consider we completely forget about the underlying table, because the view was so useful all the time. We could think about doing this:

SELECT first_name, last_name
FROM v_customer

What do you think will happen? Exactly. JOIN elimination. A view isn’t really anything special, just a “macro” of some stored SQL (beware of some databases, where this isn’t always the case, e.g. MySQL, which Bill Karwin was kind enough to hint me at). So the above statement will be transformed into:

SELECT first_name, last_name
FROM (
  SELECT 
    c.first_name, c.last_name, 
    a.address, ci.city, co.country
  FROM customer c
  JOIN address a USING (address_id)
  JOIN city ci USING (city_id)
  JOIN country co USING (country_id)
) v_customer

… which can be transformed into this (we don’t need all columns in the nested select):

SELECT first_name, last_name
FROM (
  SELECT 
    c.first_name, c.last_name
  FROM customer c
  JOIN address a USING (address_id)
  JOIN city ci USING (city_id)
  JOIN country co USING (country_id)
) v_customer

… which can be transformed into this (JOINs can be eliminated):

SELECT first_name, last_name
FROM (
  SELECT 
    c.first_name, c.last_name
  FROM customer c
) v_customer

… and finally (the subquery is not useful):

SELECT first_name, last_name
FROM customer

The view is even very useful for this particular query, thanks to JOIN elimination!

Note, the SQL transformations exposed above are simply educational. Actual optimisers may perform transformations in an entirely differently, or in a different order. This is just to show what’s possible, and what kind of stuff is being done.

Cool, So Can My Database Do It?

Perhaps! Let’s look at the three different types of JOIN elimination in the context of these databases:

  • DB2 LUW 10.5
  • MySQL 8.0.2
  • Oracle 12.2.0.1
  • PostgreSQL 9.6
  • SQL Server 2014

INNER JOIN Elimination

Remember, this depends on the presence (and usefulness) of a FOREIGN KEY constraint. The SQL statement we’re using here is:

SELECT first_name, last_name
FROM customer c
JOIN address a ON c.address_id = a.address_id

We’re hoping to get:

SELECT first_name, last_name
FROM customer c

DB2 LUW

The following execution plan (created with Markus Winand’s cool utility) shows that this works in DB2, there’s no access to the ADDRESS table:

Explain Plan                                                           |
-----------------------------------------------------------------------|
ID | Operation                           |                 Rows | Cost |
 1 | RETURN                              |                      |   61 |
 2 |  FETCH CUSTOMER                     | 599 of 599 (100.00%) |   61 |
 3 |   IXSCAN IDX_CUSTOMER_FK_ADDRESS_ID | 599 of 599 (100.00%) |   20 |

MySQL

MySQL 8, apart from finally introducing CTE and window functions (yay), has a lot of new optimiser features, read Morgan Tocker’s useful optimiser guide for details. Unfortunately, INNER JOIN elimination is not implemented:

ID  TABLE  TYPE    REF                  ROWS   EXTRA
1   c      ALL                          599    
1   a      eq_ref  sakila.c.address_id  1      Using index

Not only is the JOIN executed, but it is executed using a nested loop with 599 index lookups, as MySQL still only supports NESTED LOOP JOINs, not HASH JOINs.

Bummer.

Oracle

No problem at all for Oracle:

------------------------------------------------------------------------------
| Id  | Operation         | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |          |   599 | 28752 |     5   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| CUSTOMER |   599 | 28752 |     5   (0)| 00:00:01 |
------------------------------------------------------------------------------

… the JOIN is eliminated as expected.

PostgreSQL

Unfortunately, PostgreSQL cannot eliminate INNER JOIN:

Hash Join  (cost=19.57..42.79 rows=599 width=13)
  Hash Cond: (c.address_id = a.address_id)
  ->  Seq Scan on customer c  (cost=0.00..14.99 rows=599 width=15)
  ->  Hash  (cost=12.03..12.03 rows=603 width=4)
        ->  Seq Scan on address a  (cost=0.00..12.03 rows=603 width=4)

Not as bad as in MySQL, though, as PostgreSQL chose to use a HASH JOIN to combine the two tables.

SQL Server

No problemo for SQL Server, the ADDRESS table access is gone!

  |--Table Scan(OBJECT:([sakila].[dbo].[customer] AS [c]))

Notice, however, that SQL Server can only eliminate INNER JOIN on NOT NULL FOREIGN KEYs!

Excellent. What about…

OUTER JOIN Elimination

This one is a bit easier to prove for the database, remember? We don’t rely on any FOREIGN KEY anymore. A UNIQUE key in the parent table is sufficient to eliminate an OUTER JOIN. We can safely expect that if the INNER JOIN could be eliminated (DB2, Oracle, SQL Server), then an OUTER JOIN can be eliminated, too.

Here’s the query:

SELECT first_name, last_name
FROM customer c
LEFT JOIN address a ON c.address_id = a.address_id

And the outcome:

DB2 LUW

Good

Explain Plan                                                           |
-----------------------------------------------------------------------|
ID | Operation                           |                 Rows | Cost |
 1 | RETURN                              |                      |   61 |
 2 |  FETCH CUSTOMER                     | 599 of 599 (100.00%) |   61 |
 3 |   IXSCAN IDX_CUSTOMER_FK_ADDRESS_ID | 599 of 599 (100.00%) |   20 |

MySQL

Still nope:

ID  TABLE  TYPE    REF                  ROWS   EXTRA
1   c      ALL                          599    
1   a      eq_ref  sakila.c.address_id  1      Using index

Oracle

Perfect:

------------------------------------------------------------------------------
| Id  | Operation         | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |          |   599 | 28752 |     5   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| CUSTOMER |   599 | 28752 |     5   (0)| 00:00:01 |
------------------------------------------------------------------------------

PostgreSQL

Unlike INNER JOIN elimination, this works. Great!

Seq Scan on customer c  (cost=0.00..14.99 rows=599 width=13)

SQL Server

As expected, good:

  |--Table Scan(OBJECT:([sakila].[dbo].[customer] AS [c]))

Finally…

OUTER JOIN Elimination with DISTINCT

Remember, the query here navigates a to-many relationship, producing duplicate records of the parent table, but then removes all those duplicates again by

  • Ignoring contributions from the child table
  • Removing duplicates with DISTINCT

It’s easy to prove that this:

SELECT DISTINCT first_name, last_name
FROM actor a
LEFT JOIN film_actor fa ON a.actor_id = fa.actor_id

Is equivalent to this:

SELECT DISTINCT first_name, last_name
FROM actor a

Remember also, this only works with OUTER JOIN, not with INNER JOIN, as the latter might remove rows, so we have to execute it to see if it does.

DB2 LUW

Cool, this actually works!

Explain Plan                                                          |
----------------------------------------------------------------------|
ID | Operation        |                 Rows | Cost                   |
 1 | RETURN           |                      |   20                   |
 2 |  TBSCAN          | 200 of 200 (100.00%) |   20                   |
 3 |   SORT (UNIQUE)  | 200 of 200 (100.00%) |   20                   |
 4 |    TBSCAN ACTOR  | 200 of 200 (100.00%) |   20                   |

There’s no access to the FILM_ACTOR table, nor to its indexes. Very nice.

MySQL

As this is a more sophisticated transformation than the previous ones, we don’t have high hopes here.

ID  TABLE  TYPE    REF                  ROWS   EXTRA
1   a      ALL                          200    Using temporary
1   fa     ref     sakila.a.actor_id    27     Using index; Distinct

This has become a rather expensive query, again because of the lack of HASH JOIN support in MySQL!

Oracle

I’m very surprised to see that Oracle doesn’t support this optimisation, we’re executing the full query:

---------------------------------------------------------------
| Id  | Operation           | Name                    | Rows  |
---------------------------------------------------------------
|   0 | SELECT STATEMENT    |                         |  5462 |
|   1 |  HASH UNIQUE        |                         |  5462 |
|   2 |   NESTED LOOPS OUTER|                         |  5462 |
|   3 |    TABLE ACCESS FULL| ACTOR                   |   200 |
|*  4 |    INDEX RANGE SCAN | IDX_FK_FILM_ACTOR_ACTOR |    27 |
---------------------------------------------------------------

Curiously, Oracle also chose a NESTED LOOP JOIN in this case, even if we could have loaded the entire index on the FILM_ACTOR table into memory and then HASH JOINed it to ACTOR. Note that the cardinality estimate of the resulting query is quite off, too, despite the DISTINCT operation. This can lead to significant effects in an upstream query, which selects from this query (e.g. if stored as a view) – which is what happened to our customer.

PostgreSQL

PostgreSQL also doesn’t support this elimination, but at least gets cardinality estimates much more accurately and chooses a HASH JOIN operation:

HashAggregate  (cost=193.53..194.81 rows=128 width=13)
  Group Key: a.first_name, a.last_name
  ->  Hash Right Join  (cost=6.50..166.22 rows=5462 width=13)
        Hash Cond: (fa.actor_id = a.actor_id)
        ->  Seq Scan on film_actor fa  (cost=0.00..84.62 rows=5462 width=2)
        ->  Hash  (cost=4.00..4.00 rows=200 width=17)
              ->  Seq Scan on actor a  (cost=0.00..4.00 rows=200 width=17)

SQL Server

The pleasant surprise came from SQL Server, which does support this optimisation too:

  |--Sort(DISTINCT ORDER BY:([a].[first_name] ASC, [a].[last_name] ASC))
       |--Table Scan(OBJECT:([sakila].[dbo].[actor] AS [a]))

As you can see, no access to any FILM_ACTOR related objects!

Summary

Here’s a summary of what databases can eliminate:

Database INNER JOIN:
to-one
INNER JOIN nullable:
to-one
OUTER JOIN:
to-one
OUTER JOIN DISTINCT:
to-many
DB2 LUW 10.5 Yep Yep Yep Yep
MySQL 8.0.2 Nope Nope Nope Nope
Oracle 12.2.0.1 Yep Yep Yep Nope
PostgreSQL 9.6 Nope Nope Yep Nope
SQL Server 2014 Yep Nope Yep Yep

Conclusion

JOIN elimination is a very simple to understand, yet incredibly powerful feature that modern databases support to help developers build and maintain complex SQL queries without worrying too much about performance side effects.

It is possible because SQL is a 4GL (fourth-generation programming language), i.e. a declarative language whose parsed expression trees can “easily” be transformed into equivalent, simpler, and faster to execute expression trees. This work is constantly done by the optimiser behind the scenes, often without you even noticing (see example where unnecessary ACTOR and FILM tables were removed).

Currenly, DB2 and SQL Server are the leaders here with Oracle being a close runner-up, at least in my investigation. There’s hope for PostgreSQL, and a bit less hope for MySQL right now. There are tons of other interesting SQL transformations, which I’ll blog about in future blog posts, which may make the difference in other kinds of complex queries.

If you were intrigued by this sort of functionality, do have a look at my most recent SQL talk, which helps developers understand the real value of SQL in the context of the optimiser:

How to Use SQL INTERSECT to Work Around SQL’s NULL Logic

ANOTHER SQL Post this week? I got nerd-sniped:

Oooooh, challenge accepted!

So, let’s assume we have a table T with columns (A, B, C) like this:

WITH t(a, b, c) AS (
  SELECT 'a', 'b', null FROM dual UNION ALL
  SELECT 'a', null, 'c' FROM dual UNION ALL
  SELECT 'a', 'b', 'c'  FROM dual
)
SELECT * FROM t

As expected, this yields:

A       B       C
-----------------
a       b
a               c
a       b       c

Truly exciting.

Now we want to find all those rows that “match” either ('a', 'b', NULL) or ('a', NULL, 'b'). Clearly, this should produce the first two rows, right?

A       B       C
-----------------
a       b
a               c

Yes. Now the canonical solution would be to tediously write out the entire predicate as such:

WITH t(a, b, c) AS (...)
SELECT * FROM t
WHERE (a = 'a' AND b = 'b' AND c IS NULL)
OR (a = 'a' AND b IS NULL AND c = 'c')

That’s really boring. Sure, we could have factored out the first, common predicate:

WITH t(a, b, c) AS (...)
SELECT * FROM t
WHERE a = 'a' AND (
     (b = 'b' AND c IS NULL)
  OR (b IS NULL AND c = 'c')
)

That’s certainly better from a performance perspective, but Rafael had a nifty idea. Let’s use row value expressions (tuples) in our predicates:

WITH t(a, b, c) AS (...)
SELECT * FROM t
WHERE (a, b, c) IN (('a', 'b', NULL), ('a', NULL, 'c'))

Unfortunately this doesn’t yield any results, because nothing is equal to NULL in SQL (not even NULL itself). The above query is the same as this one:

WITH t(a, b, c) AS (...)
SELECT * FROM t
WHERE (a = 'a' AND b = 'b' AND c = NULL /* oops */)
OR (a = 'a' AND b = NULL /* oops */ AND c = 'c')

D’oh.

Solutions

The lame one

The canonical solution then would be a really lame (but perfectly valid) one. Encode NULL to be some “impossible” string value. Rafael suggested yolo. Fair enough.

This works:

WITH t(a, b, c) AS (...)
SELECT * FROM t
WHERE (a, NVL(b, 'yolo'), NVL(c, 'yolo')) 
  IN (('a', 'b', 'yolo'), ('a', 'yolo', 'c'))

All we have to do now is to always remember the term yolo which means “NULL, but not NULL thank you SQL”

The hipster one

But wait! SQL is painstakingly inconsistent when it comes to NULL. See, NULL really means UNKNOWN in three-valued logic, and this means, we never know if SQL abides to its own rules.

Come in INTERSECT. Like UNION or EXCEPT (MINUS) in Oracle, as well as SELECT DISTINCT, these set operations handle two NULL values as NOT DISTINCT. Yes, they’re not equal but also not distinct. Whatever. Just remember: That’s how it is 🙂

So, we can write this hipster solution to Rafael’s problem:

WITH t(a, b, c) AS (...)
SELECT *
FROM t
WHERE EXISTS (
  SELECT a, b, c FROM dual
  INTERSECT (
    SELECT 'a', 'b', null FROM dual
    UNION ALL
    SELECT 'a', null, 'c' FROM dual
  )
)

We create an intersection of the tuple (a, b, c), the left side of Rafael’s IN predicate, and the desired values on the right side of the IN predicate, and we’re done.

Clearly less tedious than writing the original predicates, right? (We won’t look into performance this time)

Cheers, and a happy weekend.

How to Find Redundant Indexes in SQL

The following two indexes are redundant in most SQL databases:

CREATE INDEX i_actor_1 ON actor (last_name);
CREATE INDEX i_actor_2 ON actor (last_name, first_name);

It is usually safe to drop the first index, because all queries that query the LAST_NAME column only can still profit from the second index I_ACTOR_2. The reason being that LAST_NAME is the first column of the composite index I_ACTOR_2 (it would be a different story, if it weren’t the first column).

Note: It is usually safe to drop the first index, because the benefits probably outweigh the cost:

Benefits of dropping

Costs of dropping

  • Querying a composite index can be slightly slower as can be seen in the below benchmark

Let’s see the costs of dropping the index below for Oracle, PostgreSQL, and SQL Server in this particular case (beware as always when interpreting benchmarks, they heavily depend on a lot of context, especially data size!)

Oracle

Preparation:

CREATE TABLE t (
  a NUMBER(10) NOT NULL,
  b NUMBER(10) NOT NULL
);

INSERT INTO t (a, b)
SELECT level, level
FROM dual
CONNECT BY level <= 100000;

CREATE INDEX i1 ON t(a);
CREATE INDEX i2 ON t(a, b);

EXEC dbms_stats.gather_table_stats('TEST', 'T');

Benchmark:

SET SERVEROUTPUT ON
CREATE TABLE results (
  run     NUMBER(2),
  stmt    NUMBER(2),
  elapsed NUMBER
);

DECLARE
  v_ts TIMESTAMP WITH TIME ZONE;
  v_repeat CONSTANT NUMBER := 2000;
BEGIN

  -- Repeat benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := SYSTIMESTAMP;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT /*+INDEX(t i1)*/ * FROM t WHERE a = 1
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
  
    INSERT INTO results VALUES (r, 1, 
      SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    v_ts := SYSTIMESTAMP;
      
    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT /*+INDEX(t i2)*/ * FROM t WHERE a = 1
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;
      
    INSERT INTO results VALUES (r, 2, 
      SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
  END LOOP;
  
  FOR rec IN (
    SELECT 
      run, stmt, 
      CAST(elapsed / MIN(elapsed) OVER() AS NUMBER(10, 5)) ratio 
    FROM results
  )
  LOOP
    dbms_output.put_line('Run ' || rec.run || 
      ', Statement ' || rec.stmt || 
      ' : ' || rec.ratio);
  END LOOP;
END;
/

DROP TABLE results;

The result being:

Run 1, Statement 1 : 1.4797
Run 1, Statement 2 : 1.45545

Run 2, Statement 1 : 1.1997
Run 2, Statement 2 : 1.01121

Run 3, Statement 1 : 1.13606
Run 3, Statement 2 : 1

Run 4, Statement 1 : 1.13455
Run 4, Statement 2 : 1.00242

Run 5, Statement 1 : 1.13303
Run 5, Statement 2 : 1.00606

Some notes on benchmarks here.

The fastest query execution in the above result yields 1, the other executions are multiples of 1. Yes, there’s a 10% difference in this case, so as you can see. The benefits (faster insertions) certainly should outweight the cost (slower queries), so, don’t apply this advice in a read-heavy / write-rarely database.

PostgreSQL

A similar difference can be seen in a PostgreSQL benchmark. No hints can be used to choose indexes, so we’re simply creating two tables:

CREATE TABLE t1 (
  a INT NOT NULL,
  b INT NOT NULL
);
CREATE TABLE t2 (
  a INT NOT NULL,
  b INT NOT NULL
);

INSERT INTO t1 (a, b)
SELECT s, s
FROM generate_series(1, 100000) AS s(s);

INSERT INTO t2 (a, b)
SELECT s, s
FROM generate_series(1, 100000) AS s(s);

CREATE INDEX i1 ON t1(a);
CREATE INDEX i2 ON t2(a, b);

ANALYZE t1;
ANALYZE t2;

Benchmark:

DO $$
DECLARE
  v_ts TIMESTAMP;
  v_repeat CONSTANT INT := 10000;
  rec RECORD;
BEGIN

  -- Repeat benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT * FROM t1 WHERE a = 1
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 1: %', r, 
      (clock_timestamp() - v_ts); 
    v_ts := clock_timestamp();

    FOR i IN 1..v_repeat LOOP
      FOR rec IN (
        SELECT * FROM t2 WHERE a = 1
      ) LOOP
        NULL;
      END LOOP;
    END LOOP;

    RAISE INFO 'Run %, Statement 2: %', r, 
      (clock_timestamp() - v_ts); 
    RAISE INFO '';
  END LOOP;
END$$;

Result:

INFO:  Run 1, Statement 1: 00:00:00.071891
INFO:  Run 1, Statement 2: 00:00:00.080833

INFO:  Run 2, Statement 1: 00:00:00.076329
INFO:  Run 2, Statement 2: 00:00:00.079772

INFO:  Run 3, Statement 1: 00:00:00.073137
INFO:  Run 3, Statement 2: 00:00:00.079483

INFO:  Run 4, Statement 1: 00:00:00.073456
INFO:  Run 4, Statement 2: 00:00:00.081508

INFO:  Run 5, Statement 1: 00:00:00.077148
INFO:  Run 5, Statement 2: 00:00:00.083535

SQL Server

Preparation:

CREATE TABLE t (
  a INT NOT NULL,
  b INT NOT NULL
);

WITH s(s) AS (
  SELECT 1
  UNION ALL
  SELECT s + 1 FROM s WHERE s < 100
)
INSERT INTO t
SELECT TOP 100000 
  row_number() over(ORDER BY (SELECT 1)), 
  row_number() over(ORDER BY (select 1)) 
FROM s AS s1, s AS s2, s AS s3;

CREATE INDEX i1 ON t(a);
CREATE INDEX i2 ON t(a, b);

UPDATE STATISTICS t;

Benchmark:

DECLARE @ts DATETIME;
DECLARE @repeat INT = 2000;
DECLARE @r INT;
DECLARE @i INT;
DECLARE @dummy VARCHAR;

DECLARE @s1 CURSOR;
DECLARE @s2 CURSOR;

DECLARE @results TABLE (
  run     INT,
  stmt    INT,
  elapsed DECIMAL
);

SET @r = 0;
WHILE @r < 5
BEGIN
  SET @r = @r + 1

  SET @s1 = CURSOR FOR 
    SELECT b FROM t WITH (INDEX (i1)) WHERE a = 1;

  SET @s2 = CURSOR FOR 
    SELECT b FROM t WITH (INDEX (i2)) WHERE a = 1;

  SET @ts = current_timestamp;
  SET @i = 0;
  WHILE @i < @repeat
  BEGIN
    SET @i = @i + 1

    OPEN @s1;
    FETCH NEXT FROM @s1 INTO @dummy;
    WHILE @@FETCH_STATUS = 0
    BEGIN
      FETCH NEXT FROM @s1 INTO @dummy;
    END;

    CLOSE @s1;
  END;

  DEALLOCATE @s1;
  INSERT INTO @results 
    VALUES (@r, 1, DATEDIFF(ms, @ts, current_timestamp));

  SET @ts = current_timestamp;
  SET @i = 0;
  WHILE @i < @repeat
  BEGIN
    SET @i = @i + 1

    OPEN @s2;
    FETCH NEXT FROM @s2 INTO @dummy;
    WHILE @@FETCH_STATUS = 0
    BEGIN
      FETCH NEXT FROM @s2 INTO @dummy;
    END;

    CLOSE @s2;
  END;

  DEALLOCATE @s2;
  INSERT INTO @results 
    VALUES (@r, 2, DATEDIFF(ms, @ts, current_timestamp));
END;

SELECT 'Run ' + CAST(run AS VARCHAR) + 
  ', Statement ' + CAST(stmt AS VARCHAR) + ': ' + 
  CAST(CAST(elapsed / MIN(elapsed) OVER() AS DECIMAL(10, 5)) AS VARCHAR)
FROM @results;

Result:

Run 1, Statement 1: 1.22368
Run 1, Statement 1: 1.09211

Run 2, Statement 1: 1.05263
Run 2, Statement 1: 1.09211

Run 3, Statement 1: 1.00000
Run 3, Statement 1: 1.05263

Run 4, Statement 1: 1.05263
Run 4, Statement 1: 1.00000

Run 5, Statement 1: 1.09211
Run 5, Statement 1: 1.05263

As can be seen, predictably, in all databases the smaller non-composite index is slightly faster for this type of query than the composite index. In this particular benchmark, this is specifically true because the composite index acts as a covering index.

Yet both indexes can be used for the query in a reasonable way, so if disk space / insertion speed is an issue, the redundant single-column index can be dropped.

How to find such indexes

The following query will help you detect such indexes in Oracle, PostgreSQL, and SQL Server:

Oracle

WITH indexes AS (
  SELECT
    i.owner,
    i.index_name,
    i.table_name,
    listagg(c.column_name, ', ')
      WITHIN GROUP (ORDER BY c.column_position)
      AS columns
  FROM all_indexes i
  JOIN all_ind_columns c
    ON i.owner = c.index_owner
    AND i.index_name = c.index_name
  GROUP BY i.owner, i.table_name, i.index_name, i.leaf_blocks
)
SELECT
  i.owner,
  i.table_name,
  i.index_name AS "Deletion candidate index",
  i.columns AS "Deletion candidate columns",
  j.index_name AS "Existing index",
  j.columns AS "Existing columns"
FROM indexes i
JOIN indexes j
  ON i.owner = j.owner
  AND i.table_name = j.table_name
  AND j.columns LIKE i.columns || ',%'

Result:

TABLE_NAME   delete index   columns   existing index   columns
-------------------------------------------------------------------------
T            I1             A         I2               A, B

In short, it lists all the indexes whose columns are a prefix of another index’s columns

PostgreSQL

Get ready for a really nifty query. Here’s how to discover redundant indexes in PostgreSQL, which unfortunately doesn’t seem to have an easy, out-of-the-box dictionary view to discover index columns:

WITH indexes AS (
  SELECT 
    tnsp.nspname AS schema_name,
    trel.relname AS table_name,
    irel.relname AS index_name,
    string_agg(a.attname, ', ' ORDER BY c.ordinality) AS columns
  FROM pg_index AS i
  JOIN pg_class AS trel ON trel.oid = i.indrelid
  JOIN pg_namespace AS tnsp ON trel.relnamespace = tnsp.oid
  JOIN pg_class AS irel ON irel.oid = i.indexrelid
  JOIN pg_attribute AS a ON trel.oid = a.attrelid
  JOIN LATERAL unnest(i.indkey) 
    WITH ORDINALITY AS c(colnum, ordinality)
      ON a.attnum = c.colnum
  GROUP BY i, tnsp.nspname, trel.relname, irel.relname
)
SELECT
  i.table_name,
  i.index_name AS "Deletion candidate index",
  i.columns AS "Deletion candidate columns",
  j.index_name AS "Existing index",
  j.columns AS "Existing columns"
FROM indexes i
JOIN indexes j
  ON i.schema_name = j.schema_name
  AND i.table_name = j.table_name
  AND j.columns LIKE i.columns || ',%';

This is a really nice case of lateral unnesting with ordinality, which you should definitely add to your PostgreSQL tool chain.

SQL Server

Now, SQL Server doesn’t have a nice STRING_AGG function (yet), but we can work around this using STUFF and XML to get the same query.

Of course, there are other solutions using recursive SQL, but I’m too lazy to translate the simple string pattern-matching approach to something recursive.

WITH 
  i AS (
    SELECT 
	  s.name AS schema_name,
      t.name AS table_name,
      i.name AS index_name,
      c.name AS column_name,
      ic.index_column_id
    FROM sys.indexes i 
    JOIN sys.index_columns ic 
      ON i.object_id = ic.object_id 
      AND i.index_id = ic.index_id 
    JOIN sys.columns c 
      ON ic.object_id = c.object_id 
      AND ic.column_id = c.column_id 
    JOIN sys.tables t 
      ON i.object_id = t.object_id 
	JOIN sys.schemas s
	  ON t.schema_id = s.schema_id
  ),
  indexes AS (
    SELECT
	  schema_name,
      table_name,
      index_name,
      STUFF((
        SELECT ',' + j.column_name 
        FROM i j
        WHERE i.table_name = j.table_name 
        AND i.index_name = j.index_name 
        FOR XML PATH('') -- Yay, XML in SQL!
      ), 1, 1, '') columns
    FROM i
    GROUP BY schema_name, table_name, index_name
  )
SELECT
  i.schema_name,
  i.table_name,
  i.index_name AS "Deletion candidate index",
  i.columns AS "Deletion candidate columns",
  j.index_name AS "Existing index",
  j.columns AS "Existing columns"
FROM indexes i
JOIN indexes j
  ON i.schema_name = j.schema_name
  AND i.table_name = j.table_name
  AND j.columns LIKE i.columns + ',%';

A note on partial indexes

SQL Server and PostgreSQL support “partial indexes”, i.e. indexes that contain only parts of your data (and Oracle can emulate them in various ways). Such indexes might appear in the resulting list – you may want to be careful to check if they’re really redundant or not. Chances are, they’re there for a very good reason.

Conclusion

Now go run the above query on your production database and… Very carefully and reasonably think about whether you really want to drop those indexes 😉