How to Use SQL UPDATE .. RETURNING to Run DML More Efficiently

At a customer site, I recently refactored a “slow-by-slow” PL/SQL loop and turned it into an efficient set-based UPDATE statement, saving many lines of code and running much faster. In this blog post, I will show how that can be done. The blog post will focus on Oracle and UPDATE, but rest assured, this technique can be implemented in other databases too, and also with other DML statements, such as INSERT, DELETE, and depending on the vendor, even MERGE.

The Schema

The original logic that needed refactoring worked on the following data set (simplified for this blog post):

-- Table definition
CREATE TABLE t (
  id NUMBER(10) GENERATED ALWAYS AS IDENTITY NOT NULL PRIMARY KEY,
  category NUMBER(10) NOT NULL,
  counter NUMBER(10),
  text VARCHAR2(10) NOT NULL
);

-- Sample data
INSERT INTO t (category, text)
SELECT dbms_random.value(1, 10), dbms_random.string('a', 10)
FROM dual
CONNECT BY level <= 100;

-- Output of data
SELECT *
FROM t
ORDER BY counter DESC NULLS LAST, category, id;

The sample data generated above might look like this:

ID   CATEGORY   COUNTER   TEXT
16   1                    UIXSzJxDez
25   1                    hkvvrTRbTC
29   1                    IBOJYveDgf
44   1                    VhcwOugrWB
46   1                    gBJFJrPQYy
47   1                    bVzfHznOUj
10   2                    KpHHgsRXwR
11   2                    vpkhTrkaaU
14   2                    fDlNtRdvBE

So, there were certain records belonging to some category, and there’s a counter indicating how often each record has been encountered in some system.

The “slow-by-slow” PL/SQL Logic

(“slow-by-slow” rhymes with “row-by-row”. You get the idea)

Every now and then, there was a message from another system that should:

  • Fetch all the rows of a category
  • Increase the counter on each element of that category
  • Concatenate all the texts of that category and return those

Sounds like something that can be done very easily using a loop. In PL/SQL (but imagine you could be doing this in Java just the same):

SET SERVEROUTPUT ON
DECLARE
  v_text VARCHAR2(2000);
  v_updated PLS_INTEGER := 0;
BEGIN
  FOR r IN (
    SELECT * FROM t WHERE category = 1
  ) LOOP
    v_updated := v_updated + 1;
    
    IF v_text IS NULL THEN
      v_text := r.text;
    ELSE
      v_text := v_text || ', ' || r.text;
    END IF;
    
    IF r.counter IS NULL THEN
      UPDATE t SET counter = 1 WHERE id = r.id;
    ELSE
      UPDATE t SET counter = counter + 1 WHERE id = r.id;
    END IF;
  END LOOP;
  
  COMMIT;
  dbms_output.put_line('Rows updated: ' || v_updated);
  dbms_output.put_line('Returned:     ' || v_text);
END;
/

The result of this block would be:

Rows updated: 6
Returned:     UIXSzJxDez, hkvvrTRbTC, IBOJYveDgf, VhcwOugrWB, gBJFJrPQYy, bVzfHznOUj

And the data is now:

ID   CATEGORY   COUNTER   TEXT
16   1          1         UIXSzJxDez
25   1          1         hkvvrTRbTC
29   1          1         IBOJYveDgf
44   1          1         VhcwOugrWB
46   1          1         gBJFJrPQYy
47   1          1         bVzfHznOUj
10   2                    KpHHgsRXwR
11   2                    vpkhTrkaaU
14   2                    fDlNtRdvBE

Wonderful. What’s wrong with this? The logic is straightforward and runs quite quickly. Until you run this many many many times per second – then it suddenly starts to hurt.

Thinking Set Based

Whenever you work with an RDBMS, try to think in terms of data sets and try running a bulk operation on such a data set. (Exceptions exist; see the caveats below.) The modification of the data can be written in a single SQL statement, instead of updating the same table many times.

Here’s the SQL statement in Oracle, that does precisely the same thing:

SET SERVEROUTPUT ON
DECLARE
  v_text VARCHAR2(2000);
  v_updated PLS_INTEGER := 0;
BEGIN
  UPDATE t
  SET counter = nvl(counter, 0) + 1
  WHERE category = 1
  RETURNING
    listagg (text, ', ') WITHIN GROUP (ORDER BY text),
    count(*)
  INTO
    v_text,
    v_updated;
    
  COMMIT;
  dbms_output.put_line('Rows updated: ' || v_updated);
  dbms_output.put_line('Returned:     ' || v_text);
END;
/

Again, the same output:

Rows updated: 6
Returned:     UIXSzJxDez, hkvvrTRbTC, IBOJYveDgf, VhcwOugrWB, gBJFJrPQYy, bVzfHznOUj

And the data set is now:

ID   CATEGORY   COUNTER   TEXT
16   1          2         UIXSzJxDez
25   1          2         hkvvrTRbTC
29   1          2         IBOJYveDgf
44   1          2         VhcwOugrWB
46   1          2         gBJFJrPQYy
47   1          2         bVzfHznOUj
10   2                    KpHHgsRXwR
11   2                    vpkhTrkaaU
14   2                    fDlNtRdvBE

Each piece of logic of the original PL/SQL block has a corresponding piece of logic in the revised SQL statement.

There are 4 areas of interest:

  1. Red: The category predicate
    In the PL/SQL version, this predicate is a simple access predicate for the SELECT statement, over whose implicit cursor we’re iterating. In the set based SQL version, that predicate has been moved into the single bulk UPDATE statement. Thus: we’re modifying the exact same set of rows.
  2. Blue: The number of updated rows
    Before, we had a count variable that counted the number of iterations over the implicit cursor. Now, we can simply count the number of rows being updated in the bulk update statement, conveniently in the RETURNING clause. An alternative (in Oracle) would have been to use SQL%ROWCOUNT, which is available for free after a single bulk UPDATE statement.
  3. Orange: The string concatenation
    The requirement was to concatenate all the texts which are being updated. In the “slow-by-slow” PL/SQL approach, we’re again keeping around a local variable and concatenating new values to it, doing some NULL handling initially. In the set-based SQL version, we can simply use LISTAGG() in the RETURNING clause. Notice, there seems to be a bug with this usage of LISTAGG: the ORDER BY clause has no effect.
  4. Green: The actual update
    In the “slow-by-slow” version, we run 1 UPDATE statement per row, which can turn out to be devastating if we’re updating a lot of rows. Besides, in this particular case, the developer(s) were apparently unaware of the possibility of NULL handling using NVL() (or COALESCE() or similar). There is really only one UPDATE statement necessary here.

That already looks a lot neater.
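
The same set-based statement can also be called from Java. Here’s a minimal sketch, assuming a JDBC Connection to the schema above. The CategoryCounter class, its Result holder, and the incrementCounters method are hypothetical names made up for this illustration. Wrapping the UPDATE in an anonymous block and registering OUT parameters is one possible way to get at the RETURNING values, not necessarily how it was done at the customer site:

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

// Hypothetical helper, not part of the original code.
final class CategoryCounter {

    // Holds the two values produced by the RETURNING clause.
    static final class Result {
        final int updated;
        final String text;

        Result(int updated, String text) {
            this.updated = updated;
            this.text = text;
        }
    }

    // The set-based UPDATE from above, wrapped in an anonymous block so that
    // the RETURNING .. INTO targets can be registered as JDBC OUT parameters.
    static final String SQL =
        "BEGIN\n" +
        "  UPDATE t\n" +
        "  SET counter = nvl(counter, 0) + 1\n" +
        "  WHERE category = ?\n" +
        "  RETURNING listagg(text, ', ') WITHIN GROUP (ORDER BY text), count(*)\n" +
        "  INTO ?, ?;\n" +
        "END;";

    // Runs the UPDATE for one category and reads back the aggregated values.
    static Result incrementCounters(Connection c, int category) throws SQLException {
        try (CallableStatement s = c.prepareCall(SQL)) {
            s.setInt(1, category);                      // IN:  the category predicate
            s.registerOutParameter(2, Types.VARCHAR);   // OUT: concatenated texts
            s.registerOutParameter(3, Types.INTEGER);   // OUT: number of updated rows
            s.execute();
            return new Result(s.getInt(3), s.getString(2));
        }
    }
}
```

A call like CategoryCounter.incrementCounters(connection, 1) would then replace the whole loop with a single statement execution. Committing is left to the caller in this sketch.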

How does it perform?

In a quick test script, which I’ve linked here, I could observe the following times for the above test data set, when running each approach 5 x 10000 times:

Run 1, Statement 1 : 2.63841 (avg : 2.43714)
Run 1, Statement 2 : 1.11019 (avg : 1.04562)
Run 2, Statement 1 : 2.35626 (avg : 2.43714)
Run 2, Statement 2 : 1.05716 (avg : 1.04562)
Run 3, Statement 1 : 2.38004 (avg : 2.43714)
Run 3, Statement 2 : 1.05153 (avg : 1.04562)
Run 4, Statement 1 : 2.47451 (avg : 2.43714)
Run 4, Statement 2 : 1.00921 (avg : 1.04562)
Run 5, Statement 1 : 2.33649 (avg : 2.43714)
Run 5, Statement 2 : 1.00000 (avg : 1.04562)

As always, I’m not publishing actual benchmark times, but relative times compared to the fastest run. The set-based approach is consistently 2.2x to 2.5x faster (about 2.3x going by the averages) on my machine (Oracle 18c on Docker on Windows 10 / SSD). This is updating 6 rows per execution.

When we remove the WHERE category = 1 predicate, updating the entirety of the 100 rows each time, we get even more drastic results. I’m now running this 5 x 2000 times to get:

Run 1, Statement 1 : 10.21833 (avg : 11.98154)
Run 1, Statement 2 : 1.219130 (avg : 1.739260)
Run 2, Statement 1 : 10.17014 (avg : 11.98154)
Run 2, Statement 2 : 3.027930 (avg : 1.739260)
Run 3, Statement 1 : 9.444620 (avg : 11.98154)
Run 3, Statement 2 : 1.000000 (avg : 1.739260)
Run 4, Statement 1 : 20.54692 (avg : 11.98154)
Run 4, Statement 2 : 1.193560 (avg : 1.739260)
Run 5, Statement 1 : 9.527690 (avg : 11.98154)
Run 5, Statement 2 : 2.255680 (avg : 1.739260)

At this point, no one needs to be convinced anymore that a set based approach is much better for updating your data than a row-by-row approach in a language like PL/SQL or Java, etc.
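
The speedup factors follow from dividing the reported averages of the two listings above. A small sketch of that arithmetic (the Speedup class is just a hypothetical helper for this illustration):

```java
// Hypothetical helper that turns two relative timings into a speedup factor.
final class Speedup {

    // How many times slower the slow approach is compared to the fast one.
    static double factor(double slowAvg, double fastAvg) {
        return slowAvg / fastAvg;
    }

    public static void main(String[] args) {
        // Averages taken from the two benchmark listings above
        System.out.printf("6 rows per execution:   %.2fx%n", factor(2.43714, 1.04562));
        System.out.printf("100 rows per execution: %.2fx%n", factor(11.98154, 1.739260));
    }
}
```

This works out to roughly 2.3x when updating 6 rows, and roughly 6.9x when updating all 100 rows.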

Caveats

Bulk updates are much better than row-by-row (remember: “slow-by-slow”) updates, regardless of whether you’re using PL/SQL or Java or whatever client language. This is because the optimiser can plan the update much more efficiently when it knows which rows will be updated in bulk, rather than seeing each individual row update afresh, not being able to plan ahead for the remaining number of updates.

However, in situations where a lot of other processes are reading the same data while you’re bulk updating them, you need to be more careful. In such cases, a bulk update can cause trouble, keeping locks and log files busy while you’re updating and while the other processes need to access the data as it was prior to your update.

One size never fits all, but at least, in every situation where you loop over a result set to update some data (or fetch additional data), ask yourself: Could I have written that logic in a single SQL statement? The answer is very often: Yes.

Other databases

A few other databases support similar language features. These include:

The DB2 syntax is quite noteworthy, because:

  • It is very elegant
  • It corresponds to the SQL standard

The UPDATE statement would have been nested in a SELECT statement:

SELECT 
  listagg (text, ', ') WITHIN GROUP (ORDER BY id),
  count(*)
FROM FINAL TABLE (
  UPDATE t
  SET counter = nvl(counter, 0) + 1
  WHERE category = 1
)

The Performance Difference Between SQL Row-by-row Updating, Batch Updating, and Bulk Updating

Something that has been said many times, but needs constant repeating until every developer is aware of its importance, is the performance difference between row-by-row updating and bulk updating. If you cannot guess which one will be much faster, remember that row-by-row kinda rhymes with slow-by-slow (hint hint).

Disclaimer: This article will discuss only non-concurrent updates, which are much easier to reason about. In a concurrent update situation, a lot of additional factors will add complexity to the problem, including the locking strategy, transaction isolation levels, or simply how the database vendor implements things in detail. For the sake of simplicity, I’ll assume no concurrent updates are being made.

Example query

Let’s say we have a simple table for our blog posts (using Oracle syntax, but the effect is the same on all databases):

CREATE TABLE post (
  id INT NOT NULL PRIMARY KEY,
  text VARCHAR2(1000) NOT NULL,
  archived NUMBER(1) NOT NULL CHECK (archived IN (0, 1)),
  creation_date DATE NOT NULL
);

CREATE INDEX post_creation_date_i ON post (creation_date);

Now, let’s add some 10000 rows:

INSERT INTO post
SELECT 
  level,
  lpad('a', 1000, 'a'),
  0 AS archived,
  DATE '2017-01-01' + (level / 10) -- 10 rows per day, so 3649 rows fall before 2018
FROM dual
CONNECT BY level <= 10000;

EXEC dbms_stats.gather_table_stats('TEST', 'POST');

Now imagine we want to update this table and set all posts to ARCHIVED = 1 if they are from last year, i.e. CREATION_DATE < DATE '2018-01-01'. There are various ways to do this, but you should have built an intuition that doing the update in one single UPDATE statement is probably better than looping over each individual row and updating each individual row explicitly. Right?

Right.

Then, why do we keep doing it?

Let me ask this differently:

Does it matter?

The best way to find out is to benchmark. I’m doing two benchmarks for this:

  1. One that is run in PL/SQL, showing the performance difference between different approaches that are available to PL/SQL (namely looping, the FORALL syntax, and a single bulk UPDATE)
  2. One that is run in Java, doing JDBC calls, showing the performance difference between different approaches available to Java (namely looping, caching PreparedStatement but still looping, batching, and a single bulk UPDATE)

Benchmarking PL/SQL

The code of the benchmark can be found in this gist. I will also include it at the bottom of this blog post. The results are:

Run 1, Statement 1 : .01457 (avg : .0098)
Run 1, Statement 2 : .0133  (avg : .01291)
Run 1, Statement 3 : .02351 (avg : .02519)
Run 2, Statement 1 : .00882 (avg : .0098)
Run 2, Statement 2 : .01159 (avg : .01291)
Run 2, Statement 3 : .02348 (avg : .02519)
Run 3, Statement 1 : .01012 (avg : .0098)
Run 3, Statement 2 : .01453 (avg : .01291)
Run 3, Statement 3 : .02544 (avg : .02519)
Run 4, Statement 1 : .00799 (avg : .0098)
Run 4, Statement 2 : .01346 (avg : .01291)
Run 4, Statement 3 : .02958 (avg : .02519)
Run 5, Statement 1 : .00749 (avg : .0098)
Run 5, Statement 2 : .01166 (avg : .01291)
Run 5, Statement 3 : .02396 (avg : .02519)

The difference between Statement 1 and 3 is a factor of 2.5x.

These are the times it takes for each statement type to complete, each time updating 3649 / 10000 rows. The winner is:

Statement 1, running a bulk update

It looks like this:

UPDATE post
SET archived = 1
WHERE archived = 0 AND creation_date < DATE '2018-01-01';

Runner-up (not too far away) is:

Statement 2, using the PL/SQL FORALL syntax

It works like this:

DECLARE
  TYPE post_ids_t IS TABLE OF post.id%TYPE;
  v_post_ids post_ids_t;
BEGIN
  SELECT id 
  BULK COLLECT INTO v_post_ids
  FROM post 
  WHERE archived = 0 AND creation_date < DATE '2018-01-01';

  FORALL i IN 1 .. v_post_ids.count
    UPDATE post
    SET archived = 1
    WHERE id = v_post_ids(i);
END;

Loser (by a factor of 2.5x on our specific data set) is:

Statement 3, using an ordinary LOOP and running row-by-row updates

FOR rec IN (
  SELECT id 
  FROM post 
  WHERE archived = 0 AND creation_date < DATE '2018-01-01'
) LOOP
  UPDATE post
  SET archived = 1
  WHERE id = rec.id;
END LOOP;

It does not really come as a surprise. We’re switching between the PL/SQL engine and the SQL engine many many times, and also, instead of running through the post table only once in O(N) time, we’re looking up individual ID values in O(log N) time, N times, so the complexity went from

O(N) -> O(N log N)

We’d get far worse results for larger tables!
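
To get a feeling for how that gap grows, here’s a rough cost model. This is only a sketch: the UpdateCost class and the abstract “step” unit are assumptions made for this illustration, and a real B-tree index has a much larger branching factor than 2, so the actual factor is smaller. It also ignores the cost of switching between the PL/SQL and SQL engines:

```java
// Hypothetical cost model; the unit is an abstract "step", not a time.
final class UpdateCost {

    // Bulk update: one pass over the N rows.
    static double bulkSteps(long n) {
        return n;
    }

    // Row-by-row update: N index lookups, each assumed to cost log2(N) steps.
    static double rowByRowSteps(long n) {
        return n * (Math.log(n) / Math.log(2));
    }

    public static void main(String[] args) {
        for (long n : new long[] { 10_000L, 1_000_000L, 100_000_000L }) {
            System.out.printf("N = %d: row-by-row / bulk = %.1f%n",
                n, rowByRowSteps(n) / bulkSteps(n));
        }
    }
}
```

Under these assumptions, the ratio between the two approaches is simply log2(N), so it keeps growing with the table size, if slowly.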

What about doing this from Java?

The difference is much more drastic if each call to the SQL engine has to be done over the network from another process. Again, the benchmark code is available from a gist, and I will paste it to the end of this blog post as well.

The result is (same time unit, i.e. seconds, printed in java.time.Duration format):

Run 0, Statement 1: PT4.546S
Run 0, Statement 2: PT3.52S
Run 0, Statement 3: PT0.144S
Run 0, Statement 4: PT0.028S
Run 1, Statement 1: PT3.712S
Run 1, Statement 2: PT3.185S
Run 1, Statement 3: PT0.138S
Run 1, Statement 4: PT0.025S
Run 2, Statement 1: PT3.481S
Run 2, Statement 2: PT3.007S
Run 2, Statement 3: PT0.122S
Run 2, Statement 4: PT0.026S
Run 3, Statement 1: PT3.518S
Run 3, Statement 2: PT3.077S
Run 3, Statement 3: PT0.113S
Run 3, Statement 4: PT0.027S
Run 4, Statement 1: PT3.54S
Run 4, Statement 2: PT2.94S
Run 4, Statement 3: PT0.123S
Run 4, Statement 4: PT0.03S

The difference between Statement 1 and 4 is a factor of 100x !!

So, who’s winning? Again (by far):

Statement 4, running the bulk update

In fact, the time is not too far away from the time taken by PL/SQL. With larger data sets being updated, the two results will converge. The code is:

try (Statement s = c.createStatement()) {
    s.executeUpdate(
        "UPDATE post\n" +
        "SET archived = 1\n" +
        "WHERE archived = 0\n" +
        "AND creation_date < DATE '2018-01-01'\n");
}

Followed by the not that much worse (but still roughly 4x to 5x worse):

Statement 3, running the batch update

Batching can be compared to PL/SQL’s FORALL statement. While we’re running individual row-by-row updates, we’re sending all the update statements in one batch to the SQL engine. This does save a lot of time on the network and all the layers in between.

The code looks like this:

try (Statement s = c.createStatement();
    ResultSet rs = s.executeQuery(
        "SELECT id FROM post WHERE archived = 0\n"
      + "AND creation_date < DATE '2018-01-01'"
    );
    PreparedStatement u = c.prepareStatement(
        "UPDATE post SET archived = 1 WHERE id = ?"
    )) {

    while (rs.next()) {
        u.setInt(1, rs.getInt(1));
        u.addBatch();
    }

    u.executeBatch();
}
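
The batch above collects all updates before a single executeBatch() call. For much larger row counts, a common variation is to flush the batch every few hundred or thousand rows to keep memory usage bounded. This was not part of the benchmark. Here’s a minimal sketch of that idea, where the ChunkedBatch class, its method names, and the batch size are assumptions made for this illustration. Whether each executeBatch() call maps to exactly one network roundtrip depends on the driver:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper, not part of the benchmark code.
final class ChunkedBatch {

    // Number of executeBatch() calls needed for the given number of rows.
    static int batchExecutions(int rows, int batchSize) {
        if (batchSize < 1)
            throw new IllegalArgumentException("batchSize must be positive");
        return (rows + batchSize - 1) / batchSize;
    }

    // Archives the given posts, flushing the batch every batchSize rows.
    // Returns the number of executeBatch() calls that were made.
    static int archive(Connection c, List<Integer> ids, int batchSize) throws SQLException {
        if (batchSize < 1)
            throw new IllegalArgumentException("batchSize must be positive");

        int executions = 0;
        try (PreparedStatement u = c.prepareStatement(
                "UPDATE post SET archived = 1 WHERE id = ?")) {
            int pending = 0;

            for (int id : ids) {
                u.setInt(1, id);
                u.addBatch();

                // Send a full chunk to the server
                if (++pending == batchSize) {
                    u.executeBatch();
                    executions++;
                    pending = 0;
                }
            }

            // Send the remaining rows, if any
            if (pending > 0) {
                u.executeBatch();
                executions++;
            }
        }
        return executions;
    }
}
```

With the 3649 rows from the benchmark and a batch size of 1000, this would make 4 executeBatch() calls instead of 3649 individual executeUpdate() calls.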

Followed by the losers:

Statement 1 and 2, running row by row updates

The difference between statement 1 and 2 is that 2 caches the PreparedStatement, which allows for reusing some resources. This can be a good thing, but didn’t have a very significant effect in our case, compared to the batch / bulk alternatives. The code is:

// Statement 1:
try (Statement s = c.createStatement();
    ResultSet rs = s.executeQuery(
        "SELECT id FROM post\n"
      + "WHERE archived = 0\n"
      + "AND creation_date < DATE '2018-01-01'"
    )) {

    while (rs.next()) {
        try (PreparedStatement u = c.prepareStatement(
            "UPDATE post SET archived = 1 WHERE id = ?"
        )) {
            u.setInt(1, rs.getInt(1));
            u.executeUpdate();
        }
    }
}

// Statement 2:
try (Statement s = c.createStatement();
    ResultSet rs = s.executeQuery(
        "SELECT id FROM post\n"
      + "WHERE archived = 0\n"
      + "AND creation_date < DATE '2018-01-01'"
    );
    PreparedStatement u = c.prepareStatement(
        "UPDATE post SET archived = 1 WHERE id = ?"
    )) {

    while (rs.next()) {
        u.setInt(1, rs.getInt(1));
        u.executeUpdate();
    }
}

Conclusion

As shown previously on this blog, there is a significant cost of JDBC server roundtrips, which can be seen in the JDBC benchmark. This cost is much more severe if we unnecessarily create many server roundtrips for a task that could be done in a single roundtrip, namely by using a SQL bulk UPDATE statement.

This is not only true for updates, but also for all the other statements, including SELECT, DELETE, INSERT, and MERGE. If doing everything in a single statement isn’t possible due to the limitations of SQL, we can still save roundtrips by grouping statements in a block, either by using an anonymous block in databases that support them:

BEGIN
  statement1;
  statement2;
  statement3;
END;

(you can easily send these anonymous blocks over JDBC, as well!)
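
Sending such a block over JDBC could look like this. It’s a minimal sketch, assuming an Oracle connection and statements that are trusted constants without bind values (the AnonymousBlock class and its methods are hypothetical names for this illustration):

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper, not part of the benchmark code.
final class AnonymousBlock {

    // Wraps the given statements in a PL/SQL anonymous block.
    // The statements are concatenated as-is, so they must not contain user input.
    static String block(String... statements) {
        StringBuilder sb = new StringBuilder("BEGIN\n");
        for (String statement : statements) {
            sb.append("  ").append(statement).append(";\n");
        }
        return sb.append("END;").toString();
    }

    // Sends all statements to the server in a single call.
    static void execute(Connection c, String... statements) throws SQLException {
        try (Statement s = c.createStatement()) {
            s.execute(block(statements));
        }
    }
}
```

For example, AnonymousBlock.execute(c, "UPDATE post SET archived = 1 WHERE id = 1", "UPDATE post SET archived = 1 WHERE id = 2") would run both updates with one call to the server.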

Or, by emulating anonymous blocks using the JDBC batch API (has its limitations), or by writing stored procedures.

The performance gain is not always worth the trouble of moving logic from the client to the server, but very often (as in the above case), the move is a no-brainer and there’s absolutely no reason against it.

So, remember: Stop doing row-by-row (slow-by-slow) operations when you could run the same operation in bulk, in a single SQL statement.

Hint: Always know what your ORM (if you’re using one) is doing, because the ORM can help you with automatic batching / bulking in many cases. But it often cannot, or it is too difficult to make it do so, so resorting to SQL is the way to go.

Code

PL/SQL benchmark

SET SERVEROUTPUT ON

DROP TABLE post;

CREATE TABLE post (
  id INT NOT NULL PRIMARY KEY,
  text VARCHAR2(1000) NOT NULL,
  archived NUMBER(1) NOT NULL CHECK (archived IN (0, 1)),
  creation_date DATE NOT NULL
);

CREATE INDEX post_creation_date_i ON post (creation_date);

ALTER SYSTEM FLUSH SHARED_POOL;
ALTER SYSTEM FLUSH BUFFER_CACHE;

CREATE TABLE results (
  run     NUMBER(2),
  stmt    NUMBER(2),
  elapsed NUMBER
);

DECLARE
  v_ts TIMESTAMP WITH TIME ZONE;
  
  PROCEDURE reset_post IS
  BEGIN
    EXECUTE IMMEDIATE 'TRUNCATE TABLE post';
    INSERT INTO post
    SELECT 
      level AS id,
      lpad('a', 1000, 'a') AS text,
      0 AS archived,
      -- 10 rows per day, so that 3649 of the 10000 rows fall before 2018
      DATE '2017-01-01' + (level / 10) AS creation_date
    FROM dual
    CONNECT BY level <= 10000;
    dbms_stats.gather_table_stats('TEST', 'POST');
  END reset_post;
BEGIN

  -- Repeat the whole benchmark several times to avoid warmup penalty
  FOR r IN 1..5 LOOP
  
    reset_post;
    v_ts := SYSTIMESTAMP;
    
    UPDATE post
    SET archived = 1
    WHERE archived = 0 AND creation_date < DATE '2018-01-01';
  
    INSERT INTO results VALUES (r, 1, SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    
    reset_post;
    v_ts := SYSTIMESTAMP;
    
    DECLARE
      TYPE post_ids_t IS TABLE OF post.id%TYPE;
      v_post_ids post_ids_t;
    BEGIN
      SELECT id 
      BULK COLLECT INTO v_post_ids
      FROM post 
      WHERE archived = 0 AND creation_date < DATE '2018-01-01';
    
      FORALL i IN 1 .. v_post_ids.count
        UPDATE post
        SET archived = 1
        WHERE id = v_post_ids(i);
    END;
    
    INSERT INTO results VALUES (r, 2, SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
    
    reset_post;
    v_ts := SYSTIMESTAMP;
      
    FOR rec IN (
      SELECT id 
      FROM post 
      WHERE archived = 0 AND creation_date < DATE '2018-01-01'
    ) LOOP
      UPDATE post
      SET archived = 1
      WHERE id = rec.id;
    END LOOP;
      
    INSERT INTO results VALUES (r, 3, SYSDATE + ((SYSTIMESTAMP - v_ts) * 86400) - SYSDATE);
  END LOOP;
  
  FOR rec IN (
    SELECT 
      run, stmt, 
      CAST(elapsed AS NUMBER(10, 5)) ratio,
      CAST(AVG(elapsed) OVER (PARTITION BY stmt) AS NUMBER(10, 5)) avg_ratio
    FROM results
    ORDER BY run, stmt
  )
  LOOP
    dbms_output.put_line('Run ' || rec.run || 
      ', Statement ' || rec.stmt || 
      ' : ' || rec.ratio || ' (avg : ' || rec.avg_ratio || ')');
  END LOOP;
  
  dbms_output.put_line('');
  dbms_output.put_line('Copyright Data Geekery GmbH');
  dbms_output.put_line('https://www.jooq.org/benchmark');
END;
/

DROP TABLE results;

JDBC benchmark

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.time.Duration;
import java.time.Instant;
import java.util.Properties;

public class OracleUpdate {

    public static void main(String[] args) throws Exception {
        Class.forName("oracle.jdbc.OracleDriver");

        String url = "jdbc:oracle:thin:@192.168.99.100:1521:ORCLCDB";
        String user = "TEST";
        String password = "TEST";

        Properties properties = new Properties();
        properties.setProperty("user", user);
        properties.setProperty("password", password);

        try (Connection c = DriverManager.getConnection(url, properties)) {
            for (int i = 0; i < 5; i++) {
                Instant ts;

                resetPost(c);
                ts = Instant.now();

                try (Statement s = c.createStatement();
                    ResultSet rs = s.executeQuery(
                        "SELECT id FROM post WHERE archived = 0 AND creation_date < DATE '2018-01-01'"
                    )) {

                    while (rs.next()) {
                        try (PreparedStatement u = c.prepareStatement(
                            "UPDATE post SET archived = 1 WHERE id = ?"
                        )) {
                            u.setInt(1, rs.getInt(1));
                            u.executeUpdate();
                        }
                    }
                }

                System.out.println("Run " + i + ", Statement 1: " + Duration.between(ts, Instant.now()));

                resetPost(c);
                ts = Instant.now();

                try (Statement s = c.createStatement();
                    ResultSet rs = s.executeQuery(
                        "SELECT id FROM post WHERE archived = 0 AND creation_date < DATE '2018-01-01'"
                    );
                    PreparedStatement u = c.prepareStatement(
                        "UPDATE post SET archived = 1 WHERE id = ?"
                    )) {

                    while (rs.next()) {
                        u.setInt(1, rs.getInt(1));
                        u.executeUpdate();
                    }
                }

                System.out.println("Run " + i + ", Statement 2: " + Duration.between(ts, Instant.now()));

                resetPost(c);
                ts = Instant.now();

                try (Statement s = c.createStatement();
                    ResultSet rs = s.executeQuery(
                        "SELECT id FROM post WHERE archived = 0 AND creation_date < DATE '2018-01-01'"
                    );
                    PreparedStatement u = c.prepareStatement(
                        "UPDATE post SET archived = 1 WHERE id = ?"
                    )) {

                    while (rs.next()) {
                        u.setInt(1, rs.getInt(1));
                        u.addBatch();
                    }

                    u.executeBatch();
                }
                System.out.println("Run " + i + ", Statement 3: " + Duration.between(ts, Instant.now()));

                resetPost(c);
                ts = Instant.now();

                try (Statement s = c.createStatement()) {
                    s.executeUpdate("UPDATE post\n" +
                        "SET archived = 1\n" +
                        "WHERE archived = 0 AND creation_date < DATE '2018-01-01'\n");
                }

                System.out.println("Run " + i + ", Statement 4: " + Duration.between(ts, Instant.now()));
            }
        }
    }

    static void resetPost(Connection c) throws SQLException {
        try (Statement s = c.createStatement()) {
            s.executeUpdate("TRUNCATE TABLE post");
            s.executeUpdate("INSERT INTO post\n" +
                "    SELECT \n" +
                "      level,\n" +
                "      lpad('a', 1000, 'a'),\n" +
                "      0,\n" +
                "      DATE '2017-01-01' + (level / 10)\n" +
                "    FROM dual\n" +
                "    CONNECT BY level <= 10000");
            s.executeUpdate("BEGIN dbms_stats.gather_table_stats('TEST', 'POST'); END;");
        }
    }
}