Calculating Weighted Averages When Joining Tables in SQL

I stumbled upon a very interesting jOOQ question on Stack Overflow that required the calculation of a weighted average. Why is that. Problem description Assuming you have this database (using PostgreSQL syntax): create table transactions ( id bigint not null primary key, lines bigint not null, price numeric(18, 2) not null, profit numeric(18, 2) not … Continue reading Calculating Weighted Averages When Joining Tables in SQL

How to Calculate a Cumulative Percentage in SQL

A fun report to write is to calculate a cumulative percentage. For example, when querying the Sakila database, we might want to calculate the percentage of our total revenue at any given date. The result might look like this: Notice the beautifully generated data. Or as raw data: payment_date |amount |percentage -------------|--------|---------- 2005-05-24 |29.92 |0.04 … Continue reading How to Calculate a Cumulative Percentage in SQL

How to Emulate PERCENTILE_DISC in MySQL and Other RDBMS

In my previous article, I showed what the very useful percentile functions (also known as inverse distribution functions) can be used for. Unfortunately, these functions are not ubiquitously available in SQL dialects. As of jOOQ 3.11, they are known to work in these dialects: DialectAs aggregate functionAs window functionMariaDB 10.3.3NoYesOracle 18cYesYesPostgreSQL 11YesNoSQL Server 2017NoYesTeradata 16YesNo … Continue reading How to Emulate PERCENTILE_DISC in MySQL and Other RDBMS

Calculate Percentiles to Learn About Data Set Skew in SQL

B-Tree indexes are perfect when your data is uniformly distributed. They are not really useful, when you have skewed data. I'll explain later why this is the case, but let's first learn how to detect "skew" What is skew? Skew is a term from statistics when a normal distribution is not symmetric. The example given … Continue reading Calculate Percentiles to Learn About Data Set Skew in SQL

How to Work Around ORA-38104: Columns referenced in the ON Clause cannot be updated

Standard SQL is a beautiful language. Vendor specific implementations, however, have their warts. In Oracle, for example, it's not possible to update any columns in a MERGE statement, which have been referenced by the ON clause. For example: CREATE TABLE person ( id NUMBER(18) NOT NULL PRIMARY KEY, user_name VARCHAR2(50) NOT NULL UNIQUE, score NUMBER(18) … Continue reading How to Work Around ORA-38104: Columns referenced in the ON Clause cannot be updated

How to Aggregate an Archive Log’s Deltas into a Snapshot with SQL

A customer of my popular SQL training (which you should book!) has recently challenged me to optimise a hierarchical query that merges an archive log's deltas in order to obtain a snapshot of some record at a given point in time. In this article, I will reproduce their problem statement in a simplified version and … Continue reading How to Aggregate an Archive Log’s Deltas into a Snapshot with SQL

How to Write a Multiplication Aggregate Function in SQL

Everyone knows the SQL SUM() aggregate function (and many people also know its window function variant). When querying the Sakila database, we can get the daily revenue (using PostgreSQL syntax): WITH p AS ( SELECT CAST (payment_date AS DATE) AS date, amount FROM payment ) SELECT date, SUM (amount) AS daily_revenue, SUM (SUM (amount)) OVER … Continue reading How to Write a Multiplication Aggregate Function in SQL

Beware of Hidden PL/SQL to SQL Context Switches

I recently stumbled upon a curious query on a customer's productive Oracle database: SELECT USER FROM SYS.DUAL Two things caught my attention: The query was executed many billions of times per month, accounting for about 0.3% of that system's load. That's 0.3% for something extremely silly! I don't think that customer would ever qualify the … Continue reading Beware of Hidden PL/SQL to SQL Context Switches

Find the Next Non-NULL Row in a Series With SQL

I've stumbled across this fun SQL question on reddit, recently. The question was looking at a time series of data points where some events happened. For each event, we have the start time and the end time timestamp start end ----------------------------------- 2018-09-03 07:00:00 1 null 2018-09-03 08:00:00 null null 2018-09-03 09:00:00 null null 2018-09-03 10:00:00 … Continue reading Find the Next Non-NULL Row in a Series With SQL

How to Write Multiset Conditions With Oracle VARRAY Types

Oracle is one of the few databases that implements the SQL standard ORDBMS extensions, which essentially allow for nested collections. Other databases that have these features to some extent are CUBRID, Informix, PostgreSQL. Oracle has two types of nested collections: -- Nested tables CREATE TYPE t1 AS TABLE OF VARCHAR2(10); / -- Varrays CREATE TYPE … Continue reading How to Write Multiset Conditions With Oracle VARRAY Types