Arcane magic with the SQL:2003 MERGE statement

Every now and then, we feel awkward about having to distinguish INSERT from UPDATE for any of the following reasons:
  • We have to issue at least two statements
  • We have to think about performance
  • We have to think about race conditions
  • We have to choose between [UPDATE; IF UPDATE_COUNT = 0 THEN INSERT] and [INSERT; IF EXCEPTION THEN UPDATE]
  • We have to do those statements once per updated / inserted record
All in all, this is a big source of error and frustration. When at the same time, it could’ve been so easy with the SQL MERGE statement!

A typical situation for MERGE

Among many other use-cases, the MERGE statement may come in handy when handling many-to-many relationships. Let’s say we have this schema:

CREATE TABLE documents (
  id NUMBER(7) NOT NULL,
  CONSTRAINT docu_id PRIMARY KEY (id)
);

CREATE TABLE persons (
  id NUMBER(7) NOT NULL,
  CONSTRAINT pers_id PRIMARY KEY (id)
);

CREATE TABLE document_person (
  docu_id NUMBER(7) NOT NULL,
  pers_id NUMBER(7) NOT NULL,
  flag NUMBER(1) NULL,

  CONSTRAINT docu_pers_pk PRIMARY KEY (docu_id, pers_id),
  CONSTRAINT docu_pers_fk_docu
    FOREIGN KEY (docu_id) REFERENCES documents(id),
  CONSTRAINT docu_pers_fk_pers
    FOREIGN KEY (pers_id) REFERENCES persons(id)
);

The above tables are used to model which person has read (flag=1) / deleted (flag=2) what document. To make things simple, the “document_person” entity is usually OUTER JOINed to “documents”, such that the presence or absence of a “document-person” record may have the same semantics: “flag IS NULL” means the document is unread. Now when you want to mark a document as read, you have to decide whether you INSERT a new “document_person”, or whether to UPDATE the existing one. Same with deletion. Same with marking all documents as read, or deleting all documents.

Use MERGE instead

You can do it all in one statement! Let’s say, you want to INSERT/UPDATE one record, in order to mark one document as read for a person:

-- The target table
MERGE INTO document_person dst

-- The data source. In this case, just a dummy record
USING (
  SELECT :docu_id as docu_id,
         :pers_id as pers_id,
         :flag    as flag
  FROM DUAL
) src

-- The merge condition (if true, then update, else insert)
ON (dst.docu_id = src.docu_id AND dst.pers_id = src.pers_id)

-- The update action
WHEN MATCHED THEN UPDATE SET
  dst.flag = src.flag

-- The insert action
WHEN NOT MATCHED THEN INSERT (
  dst.docu_id,
  dst.pers_id,
  dst.flag
)
VALUES (
  src.docu_id,
  src.pers_id,
  src.flag
)

This looks quite similar, yet incredibly more verbose than MySQL’s INSERT .. ON DUPLICATE KEY UPDATE statement, which is a bit more concise.

Taking it to the extreme

But you can go further! As I said previously, you may also want to mark ALL documents as read, for a given person. No problem with MERGE. The following statement does the same as the previous one, if you specify :docu_id. If you leave it null, it will just mark all documents as :flag:

MERGE INTO document_person dst

-- The data source is now all "documents" (or just :docu_id) left outer
-- joined with the "document_person" mapping
USING (
  SELECT d.id     as docu_id,
         :pers_id as pers_id,
         :flag    as flag
  FROM documents d
  LEFT OUTER JOIN document_person d_p
  ON d.id = d_p.docu_id AND d_p.pers_id = :pers_id
  -- If :docu_id is set, select only that document
  WHERE (:docu_id IS NOT NULL AND d.id = :docu_id)
  -- Otherwise, select all documents
     OR (:docu_id IS NULL)
) src

-- If the mapping already exists, update. Else, insert
ON (dst.docu_id = src.docu_id AND dst.pers_id = src.pers_id)

-- The rest stays the same
WHEN MATCHED THEN UPDATE SET
  dst.flag = src.flag
WHEN NOT MATCHED THEN INSERT (
  dst.docu_id,
  dst.pers_id,
  dst.flag
)
VALUES (
  src.docu_id,
  src.pers_id,
  src.flag
)

MERGE support in jOOQ

MERGE is also fully supported in jOOQ. See the manual for more details (scroll to the bottom): https://www.jooq.org/manual/JOOQ/Query/ Happy merging! :-)

Leave a Reply